araddon / dateparse

GoLang Parse many date strings without knowing format in advance.
MIT License
2.03k stars, 164 forks

Comprehensive validation 🔎, 30+ fixes integrated/added 🔨🐛, optimized performance 🚀 #159

Open klondikedragon opened 8 months ago

klondikedragon commented 8 months ago

This package is amazing and hugely popular, and has been the best package for automatic date parsing in go for years! ⭐

Thanks @araddon for crafting this package with love over the years!!

I've been using this while developing a new cloud-based log aggregation/search/visualization product, and I've found that there are three major opportunities for improvement for my particular use case:

This PR addresses all 3 opportunities:

In the process of going through the state machine comprehensively for validation, redundant code/states were merged, and support was added for certain edge cases (for example, some date formats did not support being followed by times).

The example and README.md were updated to incorporate all of the newly supported formats and edge cases. More details were added on how to properly interpret returned location information with respect to abbreviated timezones.

BREAKING -- the package now requires go >= 1.20 to support memory optimizations converting from []byte to string in key places.
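
For readers unfamiliar with this kind of optimization: Go 1.20 added `unsafe.String` and `unsafe.SliceData`, which let a `[]byte` be viewed as a `string` without copying. The sketch below assumes that is the mechanism referred to here; the helper name `bytesToString` is hypothetical and is not taken from the package.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString (hypothetical helper) reinterprets b as a string without
// allocating or copying. The caller must not modify b while the returned
// string is still in use, because both share the same backing memory.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	// unsafe.SliceData and unsafe.String are available from Go 1.20.
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	raw := []byte("2023-12-25T10:30:00Z")
	s := bytesToString(raw) // no copy of the underlying bytes
	fmt.Println(s)
}
```

The trade-off is safety for speed: the conversion avoids an allocation on hot paths, but it is only sound when the byte slice is never mutated afterwards.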

A huge thanks to all who posted issues and contributed PRs -- while the PRs could not be merged directly because the validation changes were so major, the ideas behind all these contributions and the associated test cases were incorporated. Here's credit for all of the issue fixes and contributions in this PR, as well as a summary of additional fixes added:

Also adds tests to verify that the following stay fixed:

arran4 commented 8 months ago

Great work @klondikedragon

jmdacruz commented 8 months ago

This is great work @klondikedragon! This repo hasn't seen much movement in years, though. Do you think we should start using a fork? Should we use yours?

arran4 commented 8 months ago

I would vote that we use his. We should also see if it qualifies for https://github.com/avelino/awesome-go

klondikedragon commented 8 months ago

In some further testing, I found that weekday prefixes only worked for some date formats, but not for others. So that is fixed now. As a side effect (benefit?), leading whitespace is now allowed/ignored.

Let's see if @araddon has feedback and/or is interested in merging this PR (it's a pretty big change that shifts the philosophy a bit toward validation, and it also makes the code a little more complex in favor of performance). The changes are now large enough that they could break backwards compatibility, so at the very least this deserves a new major version, IMO.

Although I don't want to fork something lightly, since we haven't heard any feedback from @araddon for a few years, it could definitely make sense to go ahead. If there is no comment after the holidays, I think it would make sense to go ahead and fork. This package is a key part of freeform date parsing in the "automatic structured field extraction" logic being built for IT Lightning (this new cloud-based log management platform I'm building). Given that's the case, the IT Lightning org would be willing to maintain the forked repo and work with community issues/contributions, since we're motivated to have best-in-class date recognition & parsing in the log ingestion pipeline. The license would remain the same of course.

The community contributions would help us improve our date parsing, we'd be motivated to put energy into it to keep our date parsing bug-free and comprehensive, and community use of the package might help us get a little exposure to devs/SREs who might become interested in our log management solution. So it should benefit everyone.

All feedback is welcome. What do y'all think of this proposed plan?

elliot40404 commented 8 months ago

Great work @klondikedragon. How can I start using this?

klondikedragon commented 8 months ago

I'll go ahead and fork this package. I'm renaming the main branch as part of that.

klondikedragon commented 8 months ago

The fork is complete and published as v0.1.0 -- again, a huge thanks to @araddon for authoring and maintaining this package for so many years!

The fork is available using go get github.com/itlightning/dateparse -- issues and PRs are welcome.

@elliot40404 @arran4 @jmdacruz -- see what you think and how this updated package works! If it looks good, I think I'll publish a v1.0.0 soon after incorporating feedback. I'm also curious to get feedback on my log management project, so check out the site/discord if you're interested. Thanks!