sul-dlss / parse_date

Ruby gem to parse dates out of strings, intended for normalizing date metadata strings for indexing (e.g. into Solr)

.parse_range allows for earlier dates, such as '-2100 - -2000' #31

Closed ndushay closed 5 years ago

ndushay commented 5 years ago

I can't think of a good reason NOT to allow dates earlier than -999.
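
To make the request concrete, here is a minimal sketch of the behavior being asked for: accepting signed years of one to four digits on both ends of a range. `SIGNED_YEAR_RANGE` and `sketch_year_range` are hypothetical names invented for this illustration; this is not the gem's implementation or its API.

```ruby
# Hypothetical helper, not part of the parse_date gem.
# Matches "<year> - <year>" where each year is an optionally signed integer
# of one to four digits; whitespace around the separating hyphen is optional.
SIGNED_YEAR_RANGE = /\A\s*(-?\d{1,4})\s*-\s*(-?\d{1,4})\s*\z/

# Returns a Range of Integers, or nil when the string is not a year range
# or the first year is later than the last.
def sketch_year_range(str)
  match = SIGNED_YEAR_RANGE.match(str)
  return nil unless match

  first = match[1].to_i
  last = match[2].to_i
  return nil if first > last

  first..last
end

sketch_year_range('-2100 - -2000') # => -2100..-2000
sketch_year_range('502-504')       # => 502..504
sketch_year_range('2000 - 1990')   # => nil
```

Returning nil for unrecognized input, rather than raising, is a choice made for this sketch only.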

DLME, calling .parse_range on American Numismatic Society data, gets this error:

    Record: http://numismatics.org/collection/1913.91.1,"Clay Tablet, Mesopotamia, 2100 BC - 2000 BC. 1913.91.1",1913.91.1,,,,,,,Islamic,,,,,,,,,manuscript,Clay,,[cuneiform text, garbled in source],,Tablet,,,Mesopotamia,[cuneiform text, garbled in source],,,-2100|-2000,http://numismatics.org/collectionimages/19001949/1913/1913.91.1.obv.width175.jpg,http://numismatics.org/collectionimages/19001949/1913/1913.91.1.rev.width175.jpg,2016-10-21T21:48:00Z

    Exception: ParseDate::Error: Unable to parse range from '-2100 - -2000': undefined method `>' for nil:NilClass
    /usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:49:in `rescue in parse_range'
    Caused by
    NoMethodError: undefined method `>' for nil:NilClass
    /usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:58:in `year_range_valid?'

[ERROR] Unable to parse range from '-2100 - -2000': undefined method `>' for nil:NilClass
2019-10-22T20:42:12+00:00 ERROR Unexpected error on record <record #1 (data/american-numismatic-society/islamic-department/data/numismatic_islam_department.csv #1), output_id:ans_1944-100-73346>
    while executing (to_field "cho_date_range_norm2" at traject_configs/numismatic_csv_config.rb:43)

    Record: http://numismatics.org/collection/1944.100.73346,"Clay Tablet, Mesopotamia, 2100 BC - 2000 BC. 1944.100.73346",1944.100.73346,,,,,,,Islamic,,,,,,,,,manuscript,Clay,,,,Tablet,,,Mesopotamia,,,,-2100|-2000,http://numismatics.org/collectionimages/19001949/1944/1944.100.73346.obv.width175.jpg,http://numismatics.org/collectionimages/19001949/1944/1944.100.73346.rev.width175.jpg,2016-10-21T21:57:27Z

    Exception: ParseDate::Error: Unable to parse range from '-2100 - -2000': undefined method `>' for nil:NilClass
    /usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:49:in `rescue in parse_range'
    Caused by
    NoMethodError: undefined method `>' for nil:NilClass
    /usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:58:in `year_range_valid?'
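
The `NoMethodError` in these traces is what Ruby raises when `>` is called on `nil`. One plausible reading is that a year came back as `nil` before the range was validated. The sketch below reproduces that class of error and shows a nil-guarded check. `nil_comparison_error` and `sketch_year_range_valid?` are hypothetical names; the gem's own `year_range_valid?` is not shown in this thread and may check more than this.

```ruby
# Reproduces the class of error seen in the trace: nil has no `>` method.
# Returns the rescued exception so it can be inspected.
def nil_comparison_error
  nil > 2019
rescue NoMethodError => e
  e
end

# Hypothetical validity check (illustrative only, not the gem's code).
# Treats a missing year as an invalid range instead of comparing against nil.
def sketch_year_range_valid?(first_year, last_year)
  return false if first_year.nil? || last_year.nil?

  first_year <= last_year
end

puts nil_comparison_error.class
puts sketch_year_range_valid?(-2100, -2000)
puts sketch_year_range_valid?(nil, 504)
```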

ndushay commented 5 years ago

Also, from openn/penn-0001: '502-504'

    Exception: ParseDate::Error: Unable to parse range from '502-504': undefined method `>' for nil:NilClass
    /usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:49:in `rescue in parse_range'
    Caused by
    NoMethodError: undefined method `>' for nil:NilClass
    /usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:58:in `year_range_valid?'

[ERROR] Unable to parse range from '502-504': undefined method `>' for nil:NilClass
/usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:49:in `rescue in parse_range': Unable to parse range from '502-504': undefined method `>' for nil:NilClass (ParseDate::Error)
    from /usr/local/bundle/gems/parse_date-0.3.1/lib/parse_date.rb:42:in `parse_range'
    from /opt/traject/lib/macros/date_parsing.rb:34:in `block (2 levels) in parse_range'
    from /opt/traject/lib/macros/date_parsing.rb:33:in `each'
    from /opt/traject/lib/macros/date_parsing.rb:33:in `block in parse_range'
    from /usr/local/bundle/gems/traject-3.2.0/lib/traject/indexer/step.rb:138:in `block in execute'
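
The trace above shows the error propagating out of an `each` loop in the calling macro. Until the gem accepts such ranges, a caller that prefers to skip unparseable values could rescue per value, roughly as sketched here. Everything in this block is hypothetical: `SketchParseError` stands in for `ParseDate::Error` so the example runs without the gem, and `sketch_parse_each` and `toy_parser` are not taken from the traject config in the trace.

```ruby
# Stand-in for ParseDate::Error so the sketch runs without the gem.
class SketchParseError < StandardError; end

# Hypothetical helper: applies `parser` to each value, collecting results and
# recording failures instead of letting one bad value abort the whole loop.
def sketch_parse_each(values, parser)
  parsed = []
  failures = []
  values.each do |value|
    begin
      result = parser.call(value)
      parsed << result unless result.nil?
    rescue SketchParseError => e
      failures << "#{value.inspect}: #{e.message}"
    end
  end
  [parsed, failures]
end

# Toy parser for demonstration only: accepts plain integer strings.
toy_parser = lambda do |value|
  begin
    Integer(value)
  rescue ArgumentError
    raise SketchParseError, 'unable to parse'
  end
end

parsed, failures = sketch_parse_each(['1990', '502-504'], toy_parser)
puts parsed.inspect
puts failures.inspect
```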