Unidata / siphon

Siphon - A collection of Python utilities for retrieving atmospheric and oceanic data from remote sources, focusing on being able to retrieve data from Unidata data technologies, such as the THREDDS data server.
https://unidata.github.io/siphon
BSD 3-Clause "New" or "Revised" License

Support SPC reports #259

Open jrleeman opened 5 years ago

jrleeman commented 5 years ago

It would be nice to be able to easily download SPC storm reports. I've done this nearly trivially with pandas, but automatic URL generation from a datetime would be nice. Problems do occasionally arise with poorly formatted data - I think from unquoted commas in the report text (the file itself is a CSV). We should probably run a year's worth of reports through it to make sure it all works as planned.
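For illustration, URL generation from a datetime could look like the sketch below. The base URL, the `YYMMDD_rpts_<kind>.csv` naming, and the `rpts_filtered` variant are assumptions about how SPC lays out its report files and should be checked against the live site. `spc_report_url` is a hypothetical helper, not an existing siphon function.

```python
from datetime import datetime

# Assumed location of the SPC storm report CSV files (verify before relying on it).
SPC_BASE = "https://www.spc.noaa.gov/climo/reports"

# Report kinds assumed to be published as separate CSV files.
KINDS = ("torn", "hail", "wind")


def spc_report_url(date, kind="torn", filtered=False):
    """Build the (assumed) URL of one day's SPC report CSV.

    date     -- datetime (or date) identifying the report day
    kind     -- one of "torn", "hail", "wind"
    filtered -- if True, point at the filtered rather than the raw file
    """
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}, got {kind!r}")
    prefix = "rpts_filtered" if filtered else "rpts"
    # %y%m%d gives the two-digit year, month, and day, e.g. 190520
    return f"{SPC_BASE}/{date:%y%m%d}_{prefix}_{kind}.csv"


url = spc_report_url(datetime(2019, 5, 20), kind="hail")
```

If the naming assumption holds, the resulting string can be passed straight to `pandas.read_csv`, which accepts URLs.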

jthielen commented 5 years ago

IEM has some endpoints for LSRs. Would those be useful for this, or should this just be directly accessing the CSV files from SPC?

jrleeman commented 5 years ago

Functionality for the LSRs would be interesting, but aren't those distinct from the SPC reports? (I'm unsure here)

jthielen commented 5 years ago

I'm not sure either. @akrherz Would you know?

akrherz commented 5 years ago

I believe SPC's dataset has some QC done to the raw LSRs. We could try pinging @pmarshwx here to see how much difference there is.

pmarshwx commented 5 years ago

Sorry I missed this.

SPC does some high-level filtering on all our reports, including the raw reports. The high-level raw filtering removes duplicates and things of a similar nature. The filtered reports have additional QC, looking for reports within 5 miles and 15 minutes of each other.
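One possible reading of the 5-mile / 15-minute check is a greedy pass that keeps a report only if no already-kept report is close to it in both space and time. The sketch below is an illustration of that idea only, not SPC's actual QC. The dictionary keys (`time`, `lat`, `lon`) and the function names are hypothetical.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def filter_nearby(reports, max_miles=5.0, max_minutes=15.0):
    """Drop reports within max_miles AND max_minutes of an earlier kept report.

    reports -- iterable of dicts with hypothetical keys "time" (datetime),
               "lat" and "lon" (degrees)
    """
    window = timedelta(minutes=max_minutes)
    kept = []
    for rep in sorted(reports, key=lambda r: r["time"]):
        near_existing = any(
            abs(rep["time"] - k["time"]) <= window
            and haversine_miles(rep["lat"], rep["lon"], k["lat"], k["lon"]) <= max_miles
            for k in kept
        )
        if not near_existing:
            kept.append(rep)
    return kept
```

The pairwise comparison is quadratic in the number of reports, which should be acceptable for a single day's file but would need a spatial index for bulk processing.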

A note: sometime in the future (possibly 2019) we will begin a process to transition historical LSRs (which are preliminary) to the official, finalized reports.

akrherz commented 5 years ago

Thanks @pmarshwx for chiming in. I do have a question though, as the official, finalized reports are Storm Data. Are you going to somehow translate the Storm Data into LSR format?

jrleeman commented 5 years ago

Thanks Patrick! We'll see what we can do about getting this into siphon after the new year.

pmarshwx commented 5 years ago

We would not recreate the LSR format. Rather we would update the CSV files to the official record.

We are still in the early stages of figuring out the best way to proceed.

akrherz commented 5 years ago

Sorry @pmarshwx, I used poor wording. What I meant was that there isn't a one-to-one mapping of Storm Data attributes to those found in the LSR reports. Some of the CSV columns would have to be missing, right?

jrleeman commented 5 years ago

@pmarshwx - also, I've run into a few cases where the report text had commas in it, which screwed up parsing, just FYI.

pmarshwx commented 5 years ago

@akrherz: We are not 100% sure what the end result will look like. It's possible we just use a subset of the new data to make an exact replica of the old format, or just make sure that the old is a subset of the new. It's all still in the talk phase.

@jrleeman: Those commas cause me all sorts of problems. I have some code I could share that starts with pandas and falls back to the native csv module.
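The code mentioned above was not posted in this thread. As a rough sketch of what the fallback step might do, the function below uses only the standard-library `csv` module and folds any overflow fields back into the last column. It assumes the stray, unquoted commas occur only in the final (free-text) column and that the first line of the file is the header. `parse_reports` is a hypothetical name. The pandas first attempt is omitted here.

```python
import csv
import io


def parse_reports(text):
    """Parse report CSV text, tolerating unquoted commas in the last column.

    Assumes the first row is the header and that any extra fields on a
    data row belong to the final (free-text) column.
    Returns a list of dicts mapping header names to string values.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    n = len(header)
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) > n:
            # Rejoin the overflow into the last column, restoring the commas.
            fields = fields[: n - 1] + [",".join(fields[n - 1:])]
        rows.append(dict(zip(header, fields)))
    return rows
```

A caller could try `pandas.read_csv` first and invoke something like this only when pandas raises `pandas.errors.ParserError`. Rows with too few fields are kept but will simply lack the trailing keys, so a stricter version might want to log or reject them.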

jrleeman commented 5 years ago

That would be great @pmarshwx - here or via email would be fine.