pacificclimate / pydap.handlers.pcic

A custom Pydap handler for PCIC's in-situ observational database
GNU General Public License v3.0

Fix SQL Error when downloading data from a station with no associated variables. #3

Closed. corviday closed 7 years ago

corviday commented 7 years ago

Modifies RawPcicSqlHandler to provide a dummy query to run in cases where a station has no observations. Deals with PDP Issue 62.

Also includes a minor update to the Travis config file.

jameshiebert commented 7 years ago

Personally, I don't like the solution of creating a full "data" file with an error message substituted where the observations would be. It would cause a lot of consternation for anyone trying to automate downloads from this server (e.g. the xls or nc file declares a certain field to be a number, but it contains a string instead).

I actually think that our current pdp version handles this appropriately. For example, see this request for the station(s) at Kelp Reefs:

http://tools.pacificclimate.org/dataportal/data/pcds/agg/?from-date=YYYY/MM/DD&to-date=YYYY/MM/DD&input-polygon=MULTIPOLYGON(((-123.27008980969134+48.53294970073508,-123.22326377275759+48.516105681678845,-123.18108888991571+48.563130801486224,-123.24225889114777+48.57598888843893,-123.27008980969134+48.53294970073508)))&input-var=&network-name=&input-freq=&data-format=ascii&download-timeseries=Timeseries

It gives you the file, but just with blank data. I think that that's the right thing to do.

jameshiebert commented 7 years ago

So what's the difference between the blank station at Kelp Reefs that works and this test case that does not?

corviday commented 7 years ago

The test case that does not work belongs to a network (MVan) with no valid observations for any station on the network. As a result, the getStationVariableTable function in the database returns no variables associated with the station or network, and the query_one_station function in the database generates an invalid query (note the empty IN () list) for the database initialization file, like this:

crmp=> SELECT query_one_station(3299);
                                              query_one_station                                               
--------------------------------------------------------------------------------------------------------------
 SELECT obs_time from obs_raw WHERE (history_id = 4138) AND vars_id IN () GROUP BY obs_time ORDER BY obs_time
(1 row)

Replacing the initialization query with a syntactically valid (if trivial) query in cases where a station is not associated with any variables at all fixes the error.
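
To illustrate the idea, here is a minimal sketch of that fallback in Python. It is not the code from this PR: the function name build_station_query and the particular dummy query (WHERE FALSE) are assumptions made for illustration, and the real query is generated by query_one_station in the database rather than in Python.

```python
def build_station_query(history_id, var_ids):
    """Build the observation-time query for one station history.

    Hypothetical helper: the name, signature, and dummy query are
    assumptions for illustration, not the handler's actual code.
    """
    if not var_ids:
        # An empty list would produce "vars_id IN ()", which PostgreSQL
        # rejects as a syntax error. Fall back to a valid query that
        # simply returns zero rows.
        return "SELECT obs_time FROM obs_raw WHERE FALSE"

    # Cast to int so only numeric ids are ever interpolated into the SQL.
    id_list = ", ".join(str(int(v)) for v in var_ids)
    return (
        "SELECT obs_time FROM obs_raw "
        f"WHERE (history_id = {int(history_id)}) AND vars_id IN ({id_list}) "
        "GROUP BY obs_time ORDER BY obs_time"
    )
```

With this guard, a station with no variables yields a query that runs and returns no rows, so the download can proceed with blank data instead of failing with an SQL error.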

corviday commented 7 years ago

I've never used git squash before. The result looks okay to me, but I'd appreciate you double-checking that everything came out correctly on that front.

jameshiebert commented 7 years ago

Looks good!

jameshiebert commented 7 years ago

OK, I merged this PR, bumped the version to 0.0.8, and published it to our PyPI mirror. You should be able to bump the requirement in pdp to pick up the bugfix in that repo.