Closed chrisbartley closed 3 years ago
The latter (alternate download) is definitely easier, and is also what I'd suggest should be done by environmentaldata.org (and I'm 99% sure can be done totally client side). When a user requests a data export on environmentaldata.org, in addition to providing the new single link to download data from multiple feeds, environmentaldata.org could also simply call ESDR's feed metadata API, for example like this:
http://esdr.cmucreatelab.org/api/v1/feeds/?fields=id,name,latitude,longitude&whereOr=id=26,id=28
And get something like this:
{
"code": 200,
"status": "success",
"data": {
"totalCount": 2,
"rows": [
{
"id": 26,
"name": "Lawrenceville ACHD",
"latitude": 40.46542,
"longitude": -79.960757
},
{
"id": 28,
"name": "Liberty ACHD",
"latitude": 40.323768,
"longitude": -79.868062
}
],
"offset": 0,
"limit": 1000
}
}
And it could then present the above data to the user as CSV or JSON, as text-on-screen or as a click-this-button-to-copy-to-your-clipboard interface, or as a file download.
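As a rough illustration of that last step, here is a minimal Python sketch that turns a response shaped like the one above into CSV text. The function name `feeds_to_csv` and the hard-coded `sample` are illustrative assumptions, not part of ESDR or environmentaldata.org; a real client would fetch the JSON first and would likely do this in the browser.

```python
import csv
import io

def feeds_to_csv(response, fields=("id", "name", "latitude", "longitude")):
    """Convert a feeds-metadata response (shape as in the example above) to CSV text."""
    out = io.StringIO()
    # extrasaction="ignore" drops any fields in a row that were not asked for.
    writer = csv.DictWriter(out, fieldnames=list(fields),
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in response["data"]["rows"]:
        writer.writerow(row)
    return out.getvalue()

# Hard-coded copy of the example response, so the sketch runs without network access.
sample = {
    "code": 200,
    "status": "success",
    "data": {
        "totalCount": 2,
        "rows": [
            {"id": 26, "name": "Lawrenceville ACHD",
             "latitude": 40.46542, "longitude": -79.960757},
            {"id": 28, "name": "Liberty ACHD",
             "latitude": 40.323768, "longitude": -79.868062},
        ],
        "offset": 0,
        "limit": 1000,
    },
}

print(feeds_to_csv(sample))
```

The same rows could just as easily be shown on screen or copied to the clipboard; only the final presentation step differs.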
Your former suggestion, of including the data in the export file itself, is what I'm having a harder time with. Exporting of multiple feeds is live now, so we can work with real examples. Calling this
https://esdr.cmucreatelab.org/api/v1/feeds/export/26.OUT_T_DEGC,28.SO2_PPM?from=1609563600&to=1609576200&format=csv
Gets you this:
EpochTime,3.feed_26.OUT_T_DEGC,3.feed_28.SO2_PPM
1609565400,4.7,0.003
1609569000,6.7,0.004
1609572600,10.5,0.001
1609576200,10.1,0
So what I'm not understanding is where you want the name/lat/long to go? None of the following three options--the only ones I can think of--seem ideal to me. In order from best-to-worst:
Actual format doesn't matter to me at all. The point is that the metadata goes in some sort of "commented-out" header that we either hope CSV parsers ignore, or that users need to manually tell their parser to ignore:
# Feed 26: Lawrenceville ACHD (40.46542, -79.960757)
# Feed 28: Liberty ACHD (40.323768, -79.868062)
EpochTime,3.feed_26.OUT_T_DEGC,3.feed_28.SO2_PPM
1609565400,4.7,0.003
1609569000,6.7,0.004
1609572600,10.5,0.001
1609576200,10.1,0
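To make the cost of this option concrete, here is a minimal Python sketch of what a user might have to do to read such a file. Python's `csv` module has no comment-skipping option, so the `#` lines are filtered out by hand. The function name `read_commented_csv` and the `#` prefix are assumptions for illustration only, since the header format above is just one possibility.

```python
import csv

def read_commented_csv(text, comment_prefix="#"):
    """Separate comment lines from data lines, then parse the data as ordinary CSV."""
    comments, data_lines = [], []
    for line in text.splitlines():
        if line.startswith(comment_prefix):
            # Keep the comment text so the metadata is not silently lost.
            comments.append(line[len(comment_prefix):].strip())
        else:
            data_lines.append(line)
    rows = list(csv.DictReader(data_lines))
    return comments, rows

export_text = """\
# Feed 26: Lawrenceville ACHD (40.46542, -79.960757)
# Feed 28: Liberty ACHD (40.323768, -79.868062)
EpochTime,3.feed_26.OUT_T_DEGC,3.feed_28.SO2_PPM
1609565400,4.7,0.003
1609569000,6.7,0.004
1609572600,10.5,0.001
1609576200,10.1,0
"""

comments, rows = read_commented_csv(export_text)
print(comments)
print(rows[0])
```

A spreadsheet user has no equivalent of this filter step, which is part of why this option is not ideal.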
FeedId,FeedName,Latitude,Longitude
26,Lawrenceville ACHD,40.46542,-79.960757
28,Liberty ACHD,40.323768,-79.868062
EpochTime,3.feed_26.OUT_T_DEGC,3.feed_28.SO2_PPM
1609565400,4.7,0.003
1609569000,6.7,0.004
1609572600,10.5,0.001
1609576200,10.1,0
Maybe the values only actually appear on line 1 to reduce data size, but that's just an implementation detail. This one is worst partially because of file size bloat, but mostly due to the need for ESDR to parse and modify the datastore's response rather than just piping the output to the browser.
EpochTime,3.feed_26.OUT_T_DEGC,3.feed_28.SO2_PPM,26.name,26.lat,26.lng,28.name,28.lat,28.lng
1609565400,4.7,0.003,Lawrenceville ACHD,40.46542,-79.960757,Liberty ACHD,40.323768,-79.868062
1609569000,6.7,0.004,,,,,,
1609572600,10.5,0.001,,,,,,
1609576200,10.1,0,,,,,,
There are similar but different issues for JSON output. Regardless, none seems like a perfect solution, when environmentaldata.org could simply provide the info with no further changes to ESDR required.
All that said, I'm not saying no-always-and-forever, I'm just unsure how high to prioritize this when there are other, faster-to-production viable options. And it's not clear to me how users will typically use this. If they're regularly exporting the same(ish) set of feeds, then they only need the metadata once since feed ID, name, lat/long will never change.
Resolution: add a format query string option to the feeds metadata API. Defaults to JSON, but will accept csv (case insensitive) for CSV output of the metadata. Similar in spirit to the format option for export.
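A minimal Python sketch of how a client might build such a request, assuming the option ends up being spelled `format=csv` as described in the resolution. The helper name `feeds_metadata_url` is hypothetical, the base URL combines the path from the metadata example with the https scheme from the export example, and the final parameter handling in ESDR may differ.

```python
from urllib.parse import urlencode

# Assumed base URL, taken from the examples earlier in this thread.
ESDR_FEEDS_URL = "https://esdr.cmucreatelab.org/api/v1/feeds/"

def feeds_metadata_url(feed_ids, fields=("id", "name", "latitude", "longitude"), fmt="csv"):
    """Build a feeds-metadata URL for the given feed IDs, asking for the given format."""
    query = {
        "fields": ",".join(fields),
        "whereOr": ",".join(f"id={feed_id}" for feed_id in feed_ids),
        "format": fmt,  # assumption: the option name proposed in the resolution
    }
    # Keep "," and "=" unescaped so the URL matches the style of the examples above.
    return ESDR_FEEDS_URL + "?" + urlencode(query, safe=",=")

print(feeds_metadata_url([26, 28]))
```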
Closing here, replacing with issue #61
When we export data from ESDR, especially in the new multi-export setup, it would be great to have a way to get the corresponding lat, lon of each column without having to learn the ESDR API and write code.
Could we for example insert lat, lon rows in the exported CSV, before the time series measurements? CSV is important since a lot of our users aren't coders, and they're using spreadsheets.
My vote would be to either insert lat, lon, name as three new rows in the data download CSV, or to have an alternate CSV download with the same column headers, and just the three rows name, lat, lon. That latter thing might be easier from node, I'm not sure?
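For illustration, the alternate download could look something like the following Python sketch. It assumes the feed ID can be read from the `feed_<id>` part of each export column name (as in `3.feed_26.OUT_T_DEGC`), and that feed metadata is already available as a dictionary keyed by feed ID. The function name `metadata_rows_csv` and the row labels `name`, `lat`, `lon` in the first column are illustrative choices, not an existing ESDR feature.

```python
import csv
import io
import re

def metadata_rows_csv(export_header, feeds):
    """Build a CSV with the export's column headers plus name, lat, lon rows."""
    columns = next(csv.reader([export_header]))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for label, key in (("name", "name"), ("lat", "latitude"), ("lon", "longitude")):
        row = [label]  # the first column (EpochTime) holds the row label
        for column in columns[1:]:
            match = re.search(r"feed_(\d+)", column)
            feed = feeds.get(int(match.group(1)), {}) if match else {}
            row.append(feed.get(key, ""))  # blank cell when metadata is missing
        writer.writerow(row)
    return out.getvalue()

feeds = {
    26: {"name": "Lawrenceville ACHD", "latitude": 40.46542, "longitude": -79.960757},
    28: {"name": "Liberty ACHD", "latitude": 40.323768, "longitude": -79.868062},
}
header = "EpochTime,3.feed_26.OUT_T_DEGC,3.feed_28.SO2_PPM"
print(metadata_rows_csv(header, feeds))
```

Because the columns line up with the data export, the three rows can be pasted above the measurements in a spreadsheet.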