GeoDaCenter / spatial_access

https://spatial.uchicago.edu

Coverage doesn't use specified category field (edge case) #59

Open gyoliver opened 5 years ago

gyoliver commented 5 years ago

Low priority, but probably worth addressing eventually.

Scenario: an input destination file has a field named "category", but the user points to a different field (e.g., "category_1") as containing the category values. The user also provides a list of values to the "categories" parameter of the Coverage class constructor.

It looks like the code never maps the internal data model's "category" field to the one provided by the user. It searches the destination file's existing "category" field for the values in the user's list, rather than the field the user specified, and so it can't find them.
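To make the missing mapping concrete, here is a minimal pandas sketch of one way it could be done. It is not the library's actual code: `map_user_fields` and `field_map` are hypothetical names, and the sample data is made up. The idea is to drop any legacy column that already uses an internal name before renaming the user's chosen column to that name.

```python
import pandas as pd

def map_user_fields(df, field_map):
    """Hypothetical helper: return a copy of df whose user-chosen columns
    carry the internal names.

    field_map maps internal name -> user-specified column,
    e.g. {"category": "category_1"}.
    """
    # Legacy columns that already use an internal name but were NOT chosen
    # by the user; dropping them keeps the rename from creating duplicates.
    stale = [internal for internal, chosen in field_map.items()
             if internal != chosen and internal in df.columns]
    renames = {chosen: internal for internal, chosen in field_map.items()
               if internal != chosen}
    return df.drop(columns=stale).rename(columns=renames)

# Made-up destination data with a legacy "category" field.
dests = pd.DataFrame({
    "id": [1, 2, 3],
    "category": ["old", "old", "old"],
    "category_1": ["clinic", "school", "clinic"],
})

mapped = map_user_fields(dests, {"category": "category_1"})
# Filtering on the internal name now sees the user's values.
selected = mapped[mapped["category"].isin(["clinic"])]
```

With a mapping like this in place, the values passed to "categories" would be matched against the user's field rather than the legacy one.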

The main scenario I could see this coming up in is where a user has a legacy field "category", and then a current field called "category_facility" or "category_1" etc containing the 'real' category values. I've seen this kind of thing pretty frequently given the poor data management practices out there, the difficulty of updating schemas...

gyoliver commented 5 years ago

A similar thing appears to be happening if there's a field called "population" that isn't used as the source for population values in the source/origin data. It looks like both fields' values get passed as a Series as the population for the source record.

The code gets tripped up in the function get_population_in_range in BaseModel.py, at the line "if source_population > 0:".
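For anyone trying to reproduce this outside the web app, the sketch below shows how a duplicated column name can produce this failure in plain pandas. It assumes (not confirmed from the library source) that the user's field gets renamed to "population" while the legacy "population" column is still present; the column name "pop_2010" and the data are made up.

```python
import pandas as pd

# Made-up source data: a legacy "population" field plus the field the
# user actually wants to use.
sources = pd.DataFrame(
    {"population": [10, 20], "pop_2010": [100, 200]},
    index=["a", "b"],
)

# Assumed mechanism: renaming without dropping the legacy column leaves
# two columns that are both labelled "population".
renamed = sources.rename(columns={"pop_2010": "population"})

# With a duplicated column label, a single-row lookup returns a Series
# holding both values instead of a scalar.
source_population = renamed.loc["a", "population"]
is_series = isinstance(source_population, pd.Series)

try:
    if source_population > 0:  # same comparison as in get_population_in_range
        pass
    error = None
except ValueError as exc:
    error = str(exc)  # "The truth value of a Series is ambiguous. ..."
```

If this is indeed the cause, dropping or renaming the unused "population" column before the lookup should give a scalar again.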

Error message:

Traceback (most recent call last):
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
    response = self.handle_exception(e)
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/georgeyoliver/GitHub/CSDS/GeoDaCenter/spatial_access_web_app/routes.py", line 57, in index
    output_files = analyze(options)
  File "/Users/georgeyoliver/GitHub/CSDS/GeoDaCenter/spatial_access_web_app/routes.py", line 328, in analyze
    coverage_model.calculate(upper_threshold=options["maximum_travel_time"])
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/spatial_access/Models.py", line 91, in calculate
    population_in_range = self.get_population_in_range(dest_id)
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/spatial_access/BaseModel.py", line 481, in get_population_in_range
    if source_population > 0:
  File "/Users/georgeyoliver/env/csds_web/lib/python3.6/site-packages/pandas/core/generic.py", line 1479, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().