Unidata / MetPy

MetPy is a collection of tools in Python for reading, visualizing and performing calculations with weather data.
https://unidata.github.io/MetPy/
BSD 3-Clause "New" or "Revised" License

Should METAR parsing have more datetime business logic added? #2707

Open dopplershift opened 1 year ago

dopplershift commented 1 year ago

Discussed in https://github.com/Unidata/MetPy/discussions/2706

Originally posted by **akrherz** October 3, 2022

Whilst considering the PR for #2701, I realized that there are some other painful edge cases in MetPy + METAR parsing. Presently, MetPy requires the user to either:

1. Know a priori the year and month of the METAR / METAR file being passed to MetPy, or
2. Let MetPy guess the year and month based on the current UTC calendar date.

The issue is that neither of those works for a user processing a stream of observations from a system like NOAAPort / LDM IDD. They will hit situations where the METAR day, WMO header date, and calendar date do not match. Should all this business logic / boilerplate be shunted to every user attempting to parse METARs with MetPy?

In [python-metar](https://github.com/python-metar/python-metar) we do have [business logic](https://github.com/python-metar/python-metar/blob/main/metar/Metar.py#L571) that attempts to help the user out by using the current `utcnow()` value to more accurately guess a timestamp when no month/year is provided to the parsing library. Should a similar approach be added to MetPy?

(cc @akrherz)

dopplershift commented 1 year ago

So first, as identified in #1256, we should try to actually parse full METAR products, so that we can grab the correct date from the product header. Right now we have a kind of hack that just joins lines it thinks are from METARs (to handle wrapping) and discards everything else:

https://github.com/Unidata/MetPy/blob/ca26129408d70d43d90627702232a00c25d1fcaa/src/metpy/io/metar.py#L363-L377
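
To make the header idea concrete, here is a rough sketch of pulling the time group out of a WMO abbreviated heading line. It is not MetPy code: `parse_wmo_heading` and `WMOHeading` are hypothetical names, the sample heading is only illustrative, and the regex assumes the common `TTAAii CCCC YYGGgg [BBB]` layout. Note that the heading itself carries only day of month, hour, and minute, so the month and year still have to be inferred from somewhere else.

```python
import re
from typing import NamedTuple, Optional


class WMOHeading(NamedTuple):
    """Fields of a WMO abbreviated heading (hypothetical container)."""
    designator: str  # TTAAii, e.g. 'SAUS70'
    center: str      # CCCC, originating center
    day: int         # YY, day of month
    hour: int        # GG, UTC hour
    minute: int      # gg, UTC minute


# Assumed layout: TTAAii CCCC YYGGgg, optionally followed by a BBB group.
_HEADING = re.compile(
    r'^(?P<designator>[A-Z]{4}\d{2}) (?P<center>[A-Z]{4}) '
    r'(?P<day>\d{2})(?P<hour>\d{2})(?P<minute>\d{2})'
    r'(?: [A-Z]{3})?$'
)


def parse_wmo_heading(line: str) -> Optional[WMOHeading]:
    """Return the parsed heading, or None if the line is not a heading."""
    match = _HEADING.match(line.strip())
    if match is None:
        return None
    return WMOHeading(
        designator=match['designator'],
        center=match['center'],
        day=int(match['day']),
        hour=int(match['hour']),
        minute=int(match['minute']),
    )
```

A product parser could call this on each line and remember the most recent heading while it walks the reports that follow, instead of discarding non-METAR lines.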

After that, having better heuristics for guessing the appropriate date by combining "now" and what's in the data would be good. I'm not in love with python-metar's code--there has to be a better way using timedelta rather than that decision tree of manual wrapping? (I could very well be wrong, though.)
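
As a starting point for discussion, here is a minimal sketch of one timedelta-based heuristic. It is neither MetPy's nor python-metar's implementation: `guess_metar_time` and its `max_future` tolerance are hypothetical, and the rule it encodes (pick the most recent valid date with the reported day, hour, and minute that is not more than about an hour ahead of "now") is an assumption about what users would want.

```python
import calendar
from datetime import datetime, timedelta, timezone


def guess_metar_time(day, hour, minute, now=None,
                     max_future=timedelta(hours=1)):
    """Guess a full UTC timestamp for a METAR day/hour/minute group.

    Hypothetical helper: returns the most recent datetime with the given
    day, hour, and minute that is at most ``max_future`` ahead of ``now``.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    year, month = now.year, now.month

    # Try the current month first, then step back one month at a time.
    for _ in range(12):
        # Skip months that do not contain this day (e.g. day 31 in September).
        if day <= calendar.monthrange(year, month)[1]:
            candidate = datetime(year, month, day, hour, minute,
                                 tzinfo=timezone.utc)
            if candidate - now <= max_future:
                return candidate
        # One day before the first of the month lands in the previous month,
        # which also handles the January to December year rollover.
        previous = datetime(year, month, 1, tzinfo=timezone.utc) - timedelta(days=1)
        year, month = previous.year, previous.month

    raise ValueError(f'no valid date found for day={day}')


# A report stamped 302353Z that arrives just after midnight UTC on October 1
# resolves to September 30 rather than to a nonexistent future October 30.
arrival = datetime(2022, 10, 1, 0, 5, tzinfo=timezone.utc)
print(guess_metar_time(30, 23, 53, now=arrival))
```

Passing `now` explicitly keeps the function testable and lets archive processing supply a reference time, such as a file's receipt time, instead of the wall clock.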