adbar / htmldate

Fast and robust date extraction from web pages, with Python or on the command-line
https://htmldate.readthedocs.io
Apache License 2.0

Feature: Add Portuguese month names #99

Closed: danielbichuetti closed this issue 9 months ago

danielbichuetti commented 9 months ago

Hi @adbar,

Would it be possible to add Portuguese (PT) month names and abbreviations in the next release?

Thanks.

adbar commented 9 months ago

Hi @danielbichuetti, for now custom date extraction is not a priority, since the heavy lifting should be done by the dateparser library. Is something not working for Portuguese?

danielbichuetti commented 9 months ago

Hello, @adbar. The date extraction throws a KeyError when it finds "dez" in the text.
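For illustration, here is a minimal sketch of how this kind of KeyError can arise. It is not htmldate's actual code: the month tables, the regular expression, and the function names `parse_date` and `parse_date_safe` are all hypothetical. The assumption is that a pattern accepts any three-letter token as a month while the lookup table only knows English abbreviations.

```python
import re

# Hypothetical lookup tables, not taken from htmldate.
MONTHS_EN = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
             "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}
MONTHS_PT = {"jan": 1, "fev": 2, "mar": 3, "abr": 4, "mai": 5, "jun": 6,
             "jul": 7, "ago": 8, "set": 9, "out": 10, "nov": 11, "dez": 12}
ALL_MONTHS = {**MONTHS_EN, **MONTHS_PT}

# The pattern matches any three-letter token in the month position,
# so it can capture names that are missing from the lookup table.
DATE_RE = re.compile(r"(\d{1,2}) ([a-z]{3})\.? (\d{4})")


def parse_date(text, months):
    """Return (year, month, day); raises KeyError on an unknown month name."""
    match = DATE_RE.search(text.lower())
    if match is None:
        return None
    day, month_name, year = match.groups()
    return int(year), months[month_name], int(day)


def parse_date_safe(text, months):
    """Same as parse_date, but returns None for an unknown month name."""
    match = DATE_RE.search(text.lower())
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = months.get(month_name)
    if month is None:
        return None
    return int(year), month, int(day)
```

In this sketch, `parse_date("Publicado em 12 dez 2021", MONTHS_EN)` raises `KeyError: 'dez'`, while the same call with `ALL_MONTHS` succeeds. Either extending the table or guarding the lookup with `.get()` avoids the crash; which approach the library takes is not stated in this thread.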

adbar commented 9 months ago

Thanks for your feedback, I'm going to have a look at this.

adbar commented 9 months ago

Good catch, the fix will be in #100.