jayaddison closed this issue 2 years ago
Sorry, we are not aware of any such dataset. We rely on a few dozen or so examples in our tests.
Ok, no problem - thanks anyway!
Noting some findings: one of the most widely-available data formats that includes ISO8601-format durations seems to be NetCDF.
And some open/public sources of published NetCDF datasets include:
(Within those, some of the datasets are composed of fixed-duration data items (P1D, for example), so may not be suitable for testing/benchmarking - but datasets containing variable-duration items exist too.)
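As a rough way to check whether a candidate dataset is dominated by a single fixed duration, something like the sketch below could be used. It assumes the duration strings have already been extracted into a plain Python iterable; the function name `summarize_durations` is hypothetical, and the comparison is on raw strings, so equivalent spellings such as P1D and PT24H count as distinct values.

```python
from collections import Counter


def summarize_durations(durations):
    """Return (distinct_count, share_of_most_common) for duration strings.

    Hypothetical helper: compares the raw strings only and does not
    normalise equivalent durations.
    """
    counts = Counter(durations)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    # most_common(1) yields a single (value, count) pair for the top entry
    _, top_count = counts.most_common(1)[0]
    return len(counts), top_count / total
```

A dataset where the second value is close to 1.0 is effectively fixed-duration and probably less useful for testing or benchmarking a parser.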
Hi again - I'm sorry - I don't expect that you have begun or are evaluating timedelta-iso8601 - but if you are, please hold on as I may have violated a licensing policy with it, and will be yanking it from PyPI and making it private on GitHub until that question can be resolved. Apologies for the noise if you aren't evaluating, and for any frustration/complications if you are.
Continuing on with what is likely sending spam into the void here (not a criticism; I just want to provide an update for completeness' sake): to distance the work from potential concerns about repurposed method signatures and docstrings from cpython.git in the timedelta-iso8601 library, and rather than attempting to smooth those over in-place on an existing work, I decided to re-implement the (pure-Python, no-regex) functionality from scratch, this time without opening or copying any code from cpython.git, to be on the safe side compliance-wise.
The result of that is timedelta-isoformat on GitHub -- also available as wheel packages under the name timeformat-isoformat on PyPI.
The license is AGPLv3 again, which I understand probably makes it unattractive for use in many situations, and I don't think or expect it'd be relevant for the Met Office to evaluate -- but I'd started the conversation about the original here, and want to complete that by mentioning the clean version (which I plan to performance optimize a bit further).
Hello - I've been dabbling with an AGPLv3-licensed implementation of ISO8601 duration parsing that subclasses the built-in Python timedelta object.
As you're probably well aware from building isodatetime, Python's built-in timedelta objects have some limitations, particularly the lack of support for year-and-month fields in the constructor. Even so, I figured it was worth attempting an implementation, partly after learning about an open ticket for it in the Python bugtracker.
I'm trying to focus on correctness and performance against ISO8601:2004 (although I'm not yet confident enough to declare support for that spec), and have created some test coverage and benchmarks -- but I'd like to build confidence against more representative datasets.
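To illustrate the constructor limitation mentioned above, here is a minimal sketch of a regex-free parser for a subset of ISO8601 durations. It is not the code from the library under discussion; the function name `parse_duration` is hypothetical, and it deliberately rejects year and month fields because timedelta has no keyword arguments for them. It is also more lenient than the spec in places (for example, it accepts weeks combined with days).

```python
from datetime import timedelta

# Designators that map directly onto timedelta keyword arguments.
DATE_UNITS = {"W": "weeks", "D": "days"}
TIME_UNITS = {"H": "hours", "M": "minutes", "S": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse a subset of ISO8601 durations (hypothetical illustrative helper)."""
    if len(text) < 2 or not text.startswith("P"):
        raise ValueError(f"not a duration: {text!r}")
    units = DATE_UNITS
    fields = {}
    digits = ""
    for char in text[1:]:
        if char == "T" and units is DATE_UNITS and not digits:
            # Switch from the date part to the time part.
            units = TIME_UNITS
        elif char.isdigit() or char == ".":
            digits += char
        elif char in units and digits and units[char] not in fields:
            # float() raises ValueError for malformed numbers such as "1.2.3".
            fields[units[char]] = float(digits)
            digits = ""
        else:
            # Includes "Y" and date-part "M": timedelta cannot represent them.
            raise ValueError(f"unsupported or malformed field {char!r} in {text!r}")
    if digits or not fields:
        raise ValueError(f"incomplete duration: {text!r}")
    return timedelta(**fields)
```

With this sketch, `parse_duration("PT1H30M")` returns a 90-minute timedelta, while `parse_duration("P1Y")` and `parse_duration("P1M")` raise ValueError, since a year or a calendar month has no fixed length without a reference date.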
Do you know of any open/public-licensed and reasonably-sized (hundreds/thousands of items) sets of real-world-ish ISO8601 durations that I could test against?
Thanks either way! James