sisyphsu / dateparser

dateparser is a smart, high-performance date parser library. It supports hundreds of different formats, covering nearly every format you are likely to encounter. It is also a showcase for the "retree" algorithm.
MIT License
96 stars, 24 forks

Locale support #5

Closed: smndtrl closed this issue 4 years ago

smndtrl commented 4 years ago

Currently, localized dates such as "23. März 1999" (German for the 23rd of March 1999) aren't detected.

sisyphsu commented 4 years ago

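You could customize DateParser to detect it with a custom rule. For example, this rule handles Minguo (Republic of China) calendar years:

DateParser parser = DateParser.newBuilder()
                // Matches a three-digit Minguo year such as "民国108年".
                .addRule("民国(\\d{3})年", (input, matcher, dt) -> {
                    // Read the three digits of capture group 1 directly from the input.
                    int offset = matcher.start(1);
                    int i0 = input.charAt(offset) - '0';
                    int i1 = input.charAt(offset + 1) - '0';
                    int i2 = input.charAt(offset + 2) - '0';
                    // Minguo year + 1911 = Gregorian year.
                    dt.setYear(i0 * 100 + i1 * 10 + i2 + 1911);
                })
                .build();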

There are far too many regional formats; supporting all of them out of the box would be a nightmare.
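
For the German example from the original report, the same idea (a regex plus a small lookup table) can be sketched without the library. This is only an illustration: the class name GermanDateSketch and its month table are hypothetical, and it uses plain java.util.regex and java.time rather than dateparser itself. Inside an addRule callback the extracted values would be written to dt instead; only setYear appears in this thread, so any other setter is an assumption to check against the library.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GermanDateSketch {
    // Hypothetical lookup table: lower-cased German month name -> month number.
    private static final Map<String, Integer> MONTHS = new HashMap<>();
    static {
        // "m\u00e4rz" is "marz" with an a-umlaut; escaped to keep the source ASCII-only.
        String[] names = {"januar", "februar", "m\u00e4rz", "april", "mai", "juni",
                "juli", "august", "september", "oktober", "november", "dezember"};
        for (int i = 0; i < names.length; i++) {
            MONTHS.put(names[i], i + 1);
        }
    }

    // Matches "<day>. <month name> <year>", for example "23. Maerz 1999" written with an umlaut.
    private static final Pattern PATTERN =
            Pattern.compile("(\\d{1,2})\\.\\s*(\\p{L}+)\\s+(\\d{4})");

    /** Returns the parsed date, or null if the input is not in the expected shape. */
    public static LocalDate parse(String input) {
        Matcher m = PATTERN.matcher(input.trim());
        if (!m.matches()) {
            return null;
        }
        Integer month = MONTHS.get(m.group(2).toLowerCase(Locale.GERMAN));
        if (month == null) {
            return null; // unknown month name
        }
        // LocalDate.of throws DateTimeException for impossible days such as 32.
        return LocalDate.of(Integer.parseInt(m.group(3)), month, Integer.parseInt(m.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(parse("23. M\u00e4rz 1999")); // prints 1999-03-23
    }
}
```

The month name is captured as a run of letters and looked up afterwards, so names of any length are handled the same way.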

ukitinu commented 3 years ago

Hi,
First and foremost, thank you for your work; it really is a nice piece of work and it saved me a lot of time.

I have a question that I think is related to the one above. I'm not sure whether your answer solves my problem, as I don't fully understand it.

I would like to be able to parse month names in languages other than English, for example enero for January (Spanish) or luglio for July (Italian).
Very naively, I tried simply adding them to DateParserBuilder.months, thinking I could then edit DateParser::parseMonth, but it doesn't work: it recognises dicembre (the same length as December) but not aprile (one character longer than April).
How could I solve this problem? Is the rule above enough, and I'm simply not understanding it?

Thank you.
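
One way to sidestep the length problem described above is to match month names through a regex alternation built from a name table, so the length of a name never matters. The sketch below is a standalone illustration, not the library's implementation: MonthNameSketch, its table, and findMonth are hypothetical names, and how DateParser::parseMonth works internally is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MonthNameSketch {
    // Hypothetical table mixing languages: lower-cased month name -> month number.
    private static final Map<String, Integer> MONTHS = new LinkedHashMap<>();
    static {
        MONTHS.put("april", 4);     // English
        MONTHS.put("december", 12);
        MONTHS.put("enero", 1);     // Spanish
        MONTHS.put("aprile", 4);    // Italian
        MONTHS.put("luglio", 7);
        MONTHS.put("dicembre", 12);
    }

    private static final Pattern MONTH_PATTERN = buildPattern();

    private static Pattern buildPattern() {
        List<String> names = new ArrayList<>(MONTHS.keySet());
        // Longest names first, so "aprile" is tried before its prefix "april".
        names.sort((a, b) -> Integer.compare(b.length(), a.length()));
        List<String> quoted = new ArrayList<>();
        for (String name : names) {
            quoted.add(Pattern.quote(name));
        }
        // Word boundaries keep a name from matching inside a longer word.
        return Pattern.compile("\\b(" + String.join("|", quoted) + ")\\b",
                Pattern.CASE_INSENSITIVE);
    }

    /** Returns the month number of the first known month name in text, or -1 if none is found. */
    public static int findMonth(String text) {
        Matcher m = MONTH_PATTERN.matcher(text);
        return m.find() ? MONTHS.get(m.group(1).toLowerCase(Locale.ROOT)) : -1;
    }

    public static void main(String[] args) {
        System.out.println(findMonth("3 aprile 2021"));    // prints 4
        System.out.println(findMonth("25 dicembre 2020")); // prints 12
    }
}
```

Each name in the table could equally be registered as its own addRule pattern, following the builder snippet earlier in the thread.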