18F / open-data-maker

make it easy to turn a lot of potentially large csv files into easily accessible open data

NR error: undefined method `each' for nil:NilClass in DataMagic.search #276

Open yozlet opened 8 years ago

yozlet commented 8 years ago

From the New Relic error log. The error was triggered by loading this URL on staging: https://ccapi-staging.18f.gov/v1/schools/stats?school.degrees_awarded.predominant=2,3&_fields=2013.student.size&_metrics=avg,std_deviation,std_deviation_bounds

I think it's because the school.degrees_awarded.predominant field doesn't exist - so why isn't ErrorChecker trapping it?

Traceback:

NoMethodError: undefined method `each' for nil:NilClass
                       /home/vcap/app/lib/data_magic.rb:  142:in `search'
                      /home/vcap/app/app/controllers.rb:   69:in `search_and_respond'
                      /home/vcap/app/app/controllers.rb:   65:in `process_params'
                      /home/vcap/app/app/controllers.rb:   51:in `block (2 levels) in <top (required)>'

yozlet commented 8 years ago

Oops, my mistake. The problem with the above query is the _fields=2013.student.size part. It appears that we get this error with any year-nested field. Aggregating on school.* fields works fine.

I think the problem is happening in the query formulation. Here's the query generated for aggregating on school.tuition_revenue_per_fte:

{ :index=>"development-school-data", 
  :type=>"document", 
  :body=>{:from=>0, :size=>20, 
    :query=>{:match_all=>{}}, 
    :aggs=>{"school.tuition_revenue_per_fte"=>{
      :extended_stats=>{"field"=>"school.tuition_revenue_per_fte"}}}, 
    :fields=>["school.tuition_revenue_per_fte"], 
    :_source=>false}, 
  :search_type=>"count"
}

and here's the query for aggregating on 2013.cost.avg_net_price.overall:

{ :index=>"development-school-data",
  :type=>"document",
  :body=>{:from=>0, :size=>20, 
    :query=>{ :match_all=>{}}, 
    :fields=>["2013.cost.avg_net_price.overall"], 
    :_source=>false}, 
  :search_type=>"count"
}

The latter produces no aggregations element in the returned JSON, which triggers the above error. But when I formulate the query the right way in Sense (with the aggs sub-hash), it works fine.
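For comparison, here is a minimal sketch of the request the year-nested field should probably produce, mirroring the working query above. The helper name `stats_query` is made up for illustration and is not part of DataMagic; the index name and field are the ones from the queries in this comment.

```ruby
# Hypothetical helper (not DataMagic code): builds a stats request shaped
# like the working school.* query, for any field name.
def stats_query(index, field)
  {
    index: index,
    type: "document",
    body: {
      from: 0,
      size: 20,
      query: { match_all: {} },
      # This is the sub-hash that is missing from the year-nested query.
      aggs: { field => { extended_stats: { "field" => field } } },
      fields: [field],
      _source: false
    },
    search_type: "count"
  }
end

query = stats_query("development-school-data", "2013.cost.avg_net_price.overall")
puts query[:body][:aggs].inspect
```

The only difference from the broken query is the `aggs` entry, which matches what worked when entered by hand in Sense.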

@siruguri, any ideas? I can dive into this but it may take me a little while to get to.

siruguri commented 8 years ago

Which branch has the code that I can check this on? master?

yozlet commented 8 years ago

Yep, this is in master now (as of last week) but otherwise try dev.

yozlet commented 8 years ago

Also, this bug only surfaces with nested data - look at the spec/fixtures/nested-files fixture.

siruguri commented 8 years ago

I think I'll have time early next week to look into this ... lmk if you get to it earlier. I'm guessing it's something to do with the check for types in lib/data_magic/query_builder.rb line 45. Maybe the nested fields cause the column_field_type to not be set to integer or float?
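To make that guess concrete, here is a purely illustrative sketch of how a type check could end up dropping the aggs for a year-nested field. None of this is the actual code in query_builder.rb: `aggs_for`, `NUMERIC_TYPES`, and the type table keyed without the year prefix are all assumptions made up for the example.

```ruby
# Hypothetical stand-in for the type check; NOT the code in query_builder.rb.
NUMERIC_TYPES = %w[integer float].freeze

# Returns an aggs hash only when the field's type is known to be numeric.
def aggs_for(field, field_types)
  type = field_types[field]
  return nil unless NUMERIC_TYPES.include?(type.to_s)

  { field => { extended_stats: { "field" => field } } }
end

# Assumption for illustration: types are stored without the year prefix,
# so a lookup by the full year-nested name finds nothing.
field_types = {
  "school.tuition_revenue_per_fte" => "float",
  "cost.avg_net_price.overall" => "float"
}

p aggs_for("school.tuition_revenue_per_fte", field_types)   # aggs hash is built
p aggs_for("2013.cost.avg_net_price.overall", field_types)  # nil, so no aggs
```

If something like this is going on, the query would silently go out without an `aggs` element, which matches the second query shown earlier in the thread.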

On Mon, Jan 25, 2016 at 6:19 PM, Yoz Grahame notifications@github.com wrote:

Also, this bug only surfaces with nested data - look at the spec/fixtures/nested-files and spec/fixtures/nested2 fixtures.


ultrasaurus commented 8 years ago

Actually, I think in this case it's because the resulting values are all null, so I guess the stats aren't calculated and result['aggregations'] is nil.
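Whatever the cause, one way to keep a nil `result['aggregations']` from raising is to fall back to an empty hash before iterating. This is a rough sketch only: `extract_aggregations` is a hypothetical name, and `result` stands in for the parsed Elasticsearch response rather than the real object in DataMagic.search.

```ruby
# Hypothetical helper; `result` stands in for the parsed Elasticsearch response.
def extract_aggregations(result)
  # Fall back to an empty hash so `each` is never called on nil.
  aggregations = result["aggregations"] || {}

  stats = {}
  aggregations.each do |field, values|
    stats[field] = values
  end
  stats
end

# A response with no "aggregations" key no longer raises; it yields {}.
p extract_aggregations({ "hits" => { "total" => 0 } })
```

This only hides the symptom. The request still goes out without an aggs sub-hash, so the stats come back empty.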

yozlet commented 8 years ago

Reopening because this issue isn't fixed, though thanks to #286 it's no longer throwing exceptions; it just returns empty results. Still needs a failing test.