tech-angels / vandamme

A Changelog parser (NOT MAINTAINED ANY MORE)
http://tech-angels.github.com/vandamme/
MIT License

Parsing full English-style dates #10

olivierlacan closed 9 years ago

olivierlacan commented 9 years ago

I know the formatting guidelines state that dates must be formatted with the ISO format but I'm curious if it would be feasible to parse "full" dates like: April 18th, 2014. The ordinal suffix is surely a pain to deal with but I find myself thinking it's much easier to parse than even the ISO format and it's clearly not as ambiguous as the localized short formats (04/18/14, 18/04/14).

Example: https://github.com/skylightio/skylight-ruby/blob/master/CHANGELOG.md
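
For illustration, here is a minimal Ruby sketch of how a date like "April 18th, 2014" could be parsed: strip the ordinal suffix, then parse strictly with `Date.strptime`. The helper name `parse_english_date` and the `ORDINAL_SUFFIX` constant are hypothetical and not part of vandamme.

```ruby
require "date"

# Hypothetical helper, not part of vandamme: matches "st", "nd", "rd" or "th"
# only when it directly follows a digit (so "August" is left alone).
ORDINAL_SUFFIX = /(?<=\d)(?:st|nd|rd|th)\b/i

# Returns a Date for strings like "April 18th, 2014", or nil when the
# string does not follow the "Month day, year" layout.
def parse_english_date(text)
  cleaned = text.strip.sub(ORDINAL_SUFFIX, "")
  Date.strptime(cleaned, "%B %d, %Y")
rescue ArgumentError
  # Date.strptime raises ArgumentError (Date::Error on newer Rubies)
  # for anything that does not match the format.
  nil
end

puts parse_english_date("April 18th, 2014") # prints 2014-04-18
```

Parsing against one explicit format rejects the ambiguous short forms (04/18/14, 18/04/14) instead of guessing at them.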

svetlyak40wt commented 9 years ago

Working on http://allmychanges.com I've encountered many date and time formats, and I agree that this style is used quite often. If vandamme were a Python program, I would recommend using the great python-dateutil module. Surely there is something similar for Ruby.

gravis commented 9 years ago

Hi Folks. I have no issues parsing dates or times. Ruby is capable of parsing a lot of date formats. The issue is rather that, when running vandamme on 200K+ projects, you will notice a LOT of false positives (if the regexp is too permissive). People get very creative when it comes to inventing different formats in files, and while we can parse more date formats, the convention needs to stick to stricter requirements.

At Gemnasium, we're fighting every day with various implementations of semver, and none of the languages we support are implementing semver correctly (e.g. we have versions like "3.0.0-pr.2+fa7fcdb").

I'll check what we can do to parse the date directly, but it will require updating the regexp, and this means a lot of tests (we have nothing automated yet to hit the full DB and, more importantly, to interpret the results). Thanks
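
To make the false-positive concern concrete, the sketch below compares a strict header pattern with a permissive one on a few sample lines. Both patterns and the sample lines are hypothetical illustrations; neither is vandamme's actual regexp.

```ruby
# Hypothetical patterns for illustration only (not vandamme's real regexp).
# Strict: optional "#" markers, a three-part version, " / ", an ISO date.
STRICT_HEADER = /\A#*\s*(?<version>\d+\.\d+\.\d+) \/ (?<date>\d{4}-\d{2}-\d{2})\s*\z/

# Permissive: any dotted number, any non-word separator, then anything.
PERMISSIVE_HEADER = /(?<version>\d+\.\d+(?:\.\d+)?)\W+(?<date>.+)/

# Made-up lines of the kind found in real changelog files.
SAMPLE_LINES = [
  "## 1.2.0 / 2014-04-18",
  "Requires Ruby 1.9.3 or newer",
  "Fixed rounding of 0.5 in totals",
].freeze

# Returns the lines that the given pattern treats as release headers.
def matching_lines(pattern, lines)
  lines.select { |line| pattern.match?(line) }
end

puts matching_lines(STRICT_HEADER, SAMPLE_LINES).size     # prints 1
puts matching_lines(PERMISSIVE_HEADER, SAMPLE_LINES).size # prints 3
```

The permissive pattern accepts the real header but also treats "1.9.3" and "0.5" in ordinary prose as release versions, which is the kind of noise that adds up across a large number of projects.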

gravis commented 9 years ago

Hi @olivierlacan, Please let us know if it works for you. Cheers, Philippe

olivierlacan commented 9 years ago

@gravis @gonzoyumo This is great! Thanks for addressing this.

Using \W (non-word character) instead of \/ as the separator between the version number and the release date is a lot more flexible, don't you think?

http://rubular.com/r/fZaAm0yqTn
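
A rough sketch of the difference, assuming a simplified header layout of "version, separator, ISO date"; the pattern names and the `extract` helper are hypothetical and much simpler than vandamme's actual regexp:

```ruby
# Hypothetical, simplified patterns; only the separator differs.
SLASH_SEPARATOR    = /\A(?<version>\d+\.\d+\.\d+) \/ (?<date>\d{4}-\d{2}-\d{2})\z/
NON_WORD_SEPARATOR = /\A(?<version>\d+\.\d+\.\d+) \W (?<date>\d{4}-\d{2}-\d{2})\z/

# Returns [version, date] when the header matches, nil otherwise.
def extract(pattern, header)
  match = pattern.match(header)
  match && [match[:version], match[:date]]
end

p extract(SLASH_SEPARATOR, "1.2.0 - 2014-04-18")    # prints nil
p extract(NON_WORD_SEPARATOR, "1.2.0 - 2014-04-18") # prints ["1.2.0", "2014-04-18"]
```

With \W, headers separated by "-", "|" or "/" are all accepted, while the version and date parts stay as strict as before.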

gravis commented 9 years ago

Good idea! Thanks for the suggestion :)