gbif / pipelines

Pipelines for data processing (GBIF and LivingAtlases)
Apache License 2.0

TemporalInterpreter temporalRangeParser.parse exceptions #1011

Closed adam-collins closed 4 months ago

adam-collins commented 5 months ago

Running TemporalInterpreter::interpretTemporal throws new exceptions for invalid eventDate ranges, e.g.

I think the old behaviour was to set the parsed eventDate to the first part of the invalid date range. For example, 1921-12-15T21:31:00/1921-12-15T01:31:00 sets the eventDate to 1921-12-16 (example and code).
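A fallback of the kind described above could be sketched with plain java.time. This is only an illustration under assumptions: `RangeFallbackSketch` and `startOrRange` are hypothetical names, not pipelines code, and the sketch ignores time zones, so it does not try to reproduce the exact date quoted above.

```java
import java.time.LocalDateTime;

final class RangeFallbackSketch {

  /**
   * Hypothetical helper: returns "from/to" when the range is ordered,
   * otherwise only the start of the range.
   */
  static String startOrRange(String eventDate) {
    String[] parts = eventDate.split("/", 2);
    LocalDateTime from = LocalDateTime.parse(parts[0]);
    if (parts.length < 2) {
      return from.toString(); // single date-time, nothing to compare
    }
    LocalDateTime to = LocalDateTime.parse(parts[1]);
    // An end before the start makes the range invalid: keep the start only.
    return to.isBefore(from) ? from.toString() : from + "/" + to;
  }

  public static void main(String[] args) {
    System.out.println(startOrRange("1921-12-15T21:31:00/1921-12-15T01:31:00"));
  }
}
```

Keeping the start of the range preserves at least part of the supplied value instead of dropping the whole eventDate.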

adam-collins commented 5 months ago

Currently testing with branch https://github.com/gbif/pipelines/tree/ala-2024

MattBlissett commented 5 months ago

I've handled all the exceptions found in GBIF's data, and wrapped the parser to catch any possible other exceptions and return a general INTERPRETATION_ERROR issue.
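The wrapping approach described above might look roughly like the following. It is a minimal sketch, not the actual change: `SafeParseSketch`, `Outcome` and `parseSafely` are hypothetical names, and a plain string stands in for whatever issue type the project uses for INTERPRETATION_ERROR.

```java
import java.time.LocalDateTime;
import java.util.function.Function;

final class SafeParseSketch {

  /** Parsed value plus an issue label; a stand-in for the real record types. */
  static final class Outcome<T> {
    final T value;      // null when parsing failed
    final String issue; // null when parsing succeeded

    Outcome(T value, String issue) {
      this.value = value;
      this.issue = issue;
    }
  }

  /** Runs any parser and turns an unexpected runtime exception into an issue. */
  static <T> Outcome<T> parseSafely(Function<String, T> parser, String raw) {
    try {
      return new Outcome<>(parser.apply(raw), null);
    } catch (RuntimeException e) {
      // Catch-all so that a single bad value cannot abort the whole run.
      return new Outcome<>(null, "INTERPRETATION_ERROR");
    }
  }

  public static void main(String[] args) {
    // A date-only string is rejected by LocalDateTime.parse and becomes an issue.
    Outcome<LocalDateTime> bad = parseSafely(LocalDateTime::parse, "1922-11-02");
    System.out.println(bad.issue);
  }
}
```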

adam-collins commented 5 months ago

I double checked. The parser is throwing exceptions for some data.

  1. In the first example I gave, the parser calls new IsoDateInterval(from, to) without checking that from < to and without catching the resulting exception.

Can be reproduced with:

    String date = "1922-11-02T21:01/1922-11-02T01:01";
    Map<String, String> map = new HashMap<>();
    map.put(DwcTerm.eventDate.qualifiedName(), date);
    map.put(DwcTerm.year.qualifiedName(), "1922");
    map.put(DwcTerm.month.qualifiedName(), "11");
    map.put(DwcTerm.day.qualifiedName(), "2");
    ExtendedRecord er = ExtendedRecord.newBuilder().setId("1").setCoreTerms(map).build();

    TemporalRecord tr = TemporalRecord.newBuilder().setId("1").build();

    TemporalInterpreter temporalInterpreter =
            TemporalInterpreter.builder().explicitRangeEnd(false).create();

    temporalInterpreter.interpretTemporal(er, tr);

  2. The second exception occurs because LocalDateTime tries to parse a value that contains only a date, with no time. This is the point of failure.

Reproduce with:

    String date = "1922-11-02T21:01/1922-11-02";
    Map<String, String> map = new HashMap<>();
    map.put(DwcTerm.eventDate.qualifiedName(), date);
    ExtendedRecord er = ExtendedRecord.newBuilder().setId("1").setCoreTerms(map).build();

    TemporalRecord tr = TemporalRecord.newBuilder().setId("1").build();

    TemporalInterpreter temporalInterpreter =
            TemporalInterpreter.builder().explicitRangeEnd(false).create();

    temporalInterpreter.interpretTemporal(er, tr);
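The second failure can be shown with java.time alone, independent of the pipelines classes. The sketch below shows that the strict `LocalDateTime.parse` rejects a date-only string. It then shows one possible way to accept either form with `DateTimeFormatter.parseBest`. The class name, method names and the pattern are assumptions made for illustration, not what the pipelines parser does.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAccessor;

final class DateOrDateTimeSketch {

  // Date with an optional time part; the pattern is an assumption for this sketch.
  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd['T'HH:mm[:ss]]");

  /** True when the strict LocalDateTime parser rejects the text. */
  static boolean strictParseFails(String text) {
    try {
      LocalDateTime.parse(text);
      return false;
    } catch (DateTimeParseException e) {
      return true;
    }
  }

  /** Returns a LocalDateTime when a time is present, otherwise a LocalDate. */
  static TemporalAccessor parseEnd(String text) {
    return FORMAT.parseBest(text, LocalDateTime::from, LocalDate::from);
  }

  public static void main(String[] args) {
    System.out.println(strictParseFails("1922-11-02"));
    System.out.println(parseEnd("1922-11-02"));
    System.out.println(parseEnd("1922-11-02T21:01"));
  }
}
```

A caller would still need to decide how to compare a date-time start with a date-only end, which this sketch leaves open.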

adam-collins commented 4 months ago

It appears I misunderstood a reply. A link to the pull request or commit, or a pull request or commit that referenced this issue, would have helped me.