scrapinghub / dateparser

python parser for human readable dates
BSD 3-Clause "New" or "Revised" License

Incorrect date returned when input string contains only number and letter "S" #1133

Open jtalz opened 1 year ago

jtalz commented 1 year ago

When using the dateparser library with the following code:

from dateparser.search import search_dates

settings = {'REQUIRE_PARTS': ['month', 'year']}
parsed = search_dates('1324 S',
                      languages=['en'],
                      settings=settings)
print(parsed)

The output is:

[('1324 S', {today's date})]

This is incorrect, because the input string does not contain a recognizable date. The issue occurs whenever the input string consists only of a number followed by a space and the letter "S".

Steps to Reproduce:

  1. Run the code above with the input string '1324 S'
  2. Observe the output, which will be in the format [('1324 S', {today's date})]
  3. Try it using any number followed by ' S'

Expected Result: The dateparser library should return an empty list or an error, as the input string does not contain a recognizable date.

Actual Result: The dateparser library returns today's date, incorrectly treating the input string as a valid date.
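
Until this is fixed, one possible workaround is to post-filter the results on the caller's side. The sketch below is only an illustration: `drop_number_s_matches` and `_BARE_NUMBER_S` are hypothetical names and not part of dateparser, and the sample list stands in for whatever `search_dates` returns. It treats a missing result (for example `None`) as an empty list.

```python
import re
from datetime import datetime

# Hypothetical helper pattern (not part of dateparser): a substring made up
# only of digits, optional whitespace, and an uppercase "S".
_BARE_NUMBER_S = re.compile(r"^\s*\d+\s*S\s*$")


def drop_number_s_matches(results):
    """Drop (substring, datetime) pairs whose substring is just '<number> S'."""
    if not results:  # covers None as well as an empty list
        return []
    return [(text, dt) for text, dt in results if not _BARE_NUMBER_S.match(text)]


# Stand-in data shaped like search_dates output; the dates are arbitrary.
sample = [
    ("1324 S", datetime(2023, 1, 1)),
    ("March 2021", datetime(2021, 3, 1)),
]
print(drop_number_s_matches(sample))
```

This only hides the symptom for this one input shape; it does not change how the library interprets "S".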

Gallaecio commented 1 year ago

Could it be that S is interpreted the same as s, that is, as shorthand for “seconds”? I wonder if we should be case-sensitive here and not interpret S the same as s for SI units.
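
To illustrate the difference, here is a minimal sketch using a plain regular expression. The pattern `UNIT` and the helper `matches_seconds` are assumptions made up for this example; they are not dateparser's actual rules, which may be structured quite differently.

```python
import re

# Hypothetical pattern for illustration only: a number followed by the
# SI shorthand "s" for seconds.
UNIT = r"(\d+)\s*s\b"

insensitive = re.compile(UNIT, re.IGNORECASE)  # "S" is treated like "s"
sensitive = re.compile(UNIT)                   # only lowercase "s" matches


def matches_seconds(pattern, text):
    """Return True if the whole string looks like '<number> s'."""
    return pattern.fullmatch(text.strip()) is not None


for text in ("1324 s", "1324 S"):
    print(text, matches_seconds(insensitive, text), matches_seconds(sensitive, text))
# 1324 s True True
# 1324 S True False
```

With case-sensitive matching, '1324 S' would no longer be read as a number of seconds, while '1324 s' would still be accepted.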