cioos-siooc / pyobistools

Python port of OBIS's obistools QC R package
BSD 3-Clause "New" or "Revised" License

Convert date parsing to use python-dateutil #7

Open jdpye opened 2 years ago

jdpye commented 2 years ago

Recommended: use python-dateutil (https://pypi.org/project/python-dateutil/) for date verification and parsing.

jdpye commented 2 years ago

Will need to extend the eventDate check to handle the duration notation, i.e. "startDate / endDate".
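For illustration, a minimal sketch of what a "startDate / endDate" check could look like with python-dateutil. The function name `parse_event_date` is hypothetical and not part of pyobistools; the sketch assumes full ISO 8601 dates on both sides of the slash and does not handle mixing timezone-aware and naive values.

```python
from datetime import datetime
from typing import Optional, Tuple

from dateutil import parser


def parse_event_date(value: str) -> Tuple[datetime, Optional[datetime]]:
    """Parse a single ISO 8601 date or a 'start / end' pair (hypothetical helper)."""
    start_text, sep, end_text = value.partition("/")
    # isoparse raises ValueError for anything that is not a valid ISO 8601 string.
    start = parser.isoparse(start_text.strip())
    if not sep:
        # No slash: a single date or datetime, so there is no end.
        return start, None
    end = parser.isoparse(end_text.strip())
    # Assumption: both sides are either naive or timezone-aware, so they compare.
    if end < start:
        raise ValueError(f"end precedes start in {value!r}")
    return start, end
```

Something like `parse_event_date("2021-03-04T10:00:00/2021-03-05T12:30:00")` would return both endpoints, while a partial end such as `2021-03-04/05` raises `ValueError` under this sketch.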

MathewBiddle commented 2 years ago

@albenson-usgs How often do you see the duration notation in DwC files?

albenson-usgs commented 2 years ago

Sometimes? Not sure that's helpful. I know Diana LaScala-Gruenewald published a dataset relatively recently that used it. Want me to find it?

MathewBiddle commented 2 years ago

No need to go digging. We're evaluating the need for a date validator to be able to check that format. Seems like it's not used very frequently, so might not be worth the effort.

jdpye commented 2 years ago

I'm sure we can do full-date durations as a first pass. I'm mostly concerned with sanely handling the mad 'partial date' versions that Mat correctly identified as being OK with the 8601 standard.

YYYY-MM-DD HH:MM:SS / YYYY-MM-DD HH:MM:SS ✔️

YYYY-MM-DD/DD 😨
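To make the partial form concrete: in an ISO 8601 interval, the end may omit the leading components it shares with the start, so `2021-03-04/05` means the 4th to the 5th of March 2021. A rough sketch of expanding that, using only the standard library and assuming date-only values (no times or UTC offsets); `expand_partial_interval` is a hypothetical name:

```python
from datetime import date
from typing import Tuple


def expand_partial_interval(value: str) -> Tuple[date, date]:
    """Expand 'YYYY-MM-DD/DD' style intervals into two full dates (date-only sketch)."""
    start_text, sep, end_text = value.partition("/")
    if not sep:
        raise ValueError(f"not an interval: {value!r}")
    start_fields = start_text.strip().split("-")
    end_fields = end_text.strip().split("-")
    missing = len(start_fields) - len(end_fields)
    if missing < 0:
        raise ValueError(f"end is more specific than start: {value!r}")
    # Borrow the leading (higher-order) fields that the end omitted from the start.
    full_end = "-".join(start_fields[:missing] + end_fields)
    # fromisoformat rejects anything that is not a complete YYYY-MM-DD date.
    return date.fromisoformat(start_text.strip()), date.fromisoformat(full_end)
```

With this, `2021-03-04/05` expands to 2021-03-04 and 2021-03-05, and `2021-03-04/04-02` to 2021-03-04 and 2021-04-02. Values carrying times or offsets would need more careful splitting than a plain `-` split.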

albenson-usgs commented 2 years ago

Well, we have been advising people in the SMBD to use the first one and not the second. Checking the OBIS Manual (which seems to have gotten a facelift), I discovered this:

[Image: Capture, a screenshot from the OBIS Manual]

which is news to me. I was going to say we should add something to the OBIS manual to advise people to use the first one but since it already says not to use it at all I guess that would end up being a longer conversation.

jdpye commented 2 years ago

That's definitely news to me as well. I have a couple of use cases where durations make the most sense, and was operating under the assumption that they'd be supported. I think we should have that longer conversation, but at least we don't have to worry too much about duration edge cases.

In the meantime I think the desired behaviour here would be to parse the duration but throw a warning saying 'OBIS may not accept this date format' for now. It's called pyobistools, but I think people might use this against other DwC files so we wouldn't proclaim anything beyond that.
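A sketch of that "parse but warn" behaviour, using the standard library's `datetime.fromisoformat` as a stand-in parser to keep the example dependency-free. `validate_event_date` and `ObisCompatibilityWarning` are hypothetical names, not existing pyobistools API:

```python
import warnings
from datetime import datetime


class ObisCompatibilityWarning(UserWarning):
    """Hypothetical warning category for formats OBIS may reject."""


def validate_event_date(value: str) -> bool:
    """Return True if value is a date or a 'start/end' pair; warn on the latter."""
    parts = [part.strip() for part in value.split("/")]
    if len(parts) > 2:
        return False
    try:
        for part in parts:
            datetime.fromisoformat(part)
    except ValueError:
        return False
    if len(parts) == 2:
        # Valid, but flag it rather than fail: other DwC consumers may accept it.
        warnings.warn(
            "OBIS may not accept this date format",
            ObisCompatibilityWarning,
            stacklevel=2,
        )
    return True
```

Using a dedicated warning category means callers who check non-OBIS DwC files could silence it with `warnings.simplefilter("ignore", ObisCompatibilityWarning)` without hiding other warnings.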

jdpye commented 2 years ago

ISO 8601 durations are those timedelta-like specifications such as PT1H15M. It looks like we're all in the universe we believed we were in, and nothing is strange or wrong. We will implement this function as we planned to and be comfortable in our compatibility.
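To pin down the terminology, a small sketch that tells the two notations apart. The regex is a simplified approximation of the ISO 8601 duration grammar (integer fields, optional fractional seconds), and `classify_iso8601` is a hypothetical name:

```python
import re

# Simplified ISO 8601 duration: 'P', at least one field, time fields after 'T'.
DURATION_RE = re.compile(
    r"^P(?=\d|T\d)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
)


def classify_iso8601(value: str) -> str:
    """Label a string as 'interval', 'duration', or 'other' (no date validation)."""
    text = value.strip()
    if "/" in text:
        # Two endpoints separated by a slash, e.g. start/end or start/duration.
        return "interval"
    if DURATION_RE.match(text):
        return "duration"
    return "other"
```

So `PT1H15M` is a duration, while `2021-03-04/2021-03-05` is a time interval, which is the form the eventDate check needs to handle.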

jdpye commented 2 years ago

(this was probably a bit my fault for using 'durations' as the term for 'time intervals' above.)

albenson-usgs commented 2 years ago

Aha! I did not click the link. Need to get this straight in my own mind. It should be called an interval I believe.