ec2u / data

EC2U Knowledge Hub
https://data.ec2u.eu
Apache License 2.0

The API is returning events with modified fields set in the future #42

Open hmaskat17 opened 1 year ago

hmaskat17 commented 1 year ago

We've encountered a problem while importing events from the API. When a request uses the modified query parameter, the API returns a long list of events whose modified fields are set in the future. As a result, we receive hundreds of events from the API every day, which seems to be unintended behavior.

steps to reproduce

  1. Make a request to the API with the query parameter modified, e.g. https://data.ec2u.eu/events/?.offset=0&.limit=100&>modified=2023-08-21
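
For anyone scripting the reproduction, here is a minimal sketch that rebuilds the request URL from step 1. The function name `build_events_url` is hypothetical; the `.offset`, `.limit` and `>modified` parameters are copied as-is from the URL above, and their exact semantics are not documented in this thread.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.ec2u.eu/events/"


def build_events_url(modified: str, offset: int = 0, limit: int = 100) -> str:
    """Build the events query URL used in the reproduction step.

    Hypothetical helper: the parameter names mirror the URL in the
    report; nothing here is taken from the API documentation.
    """
    params = {
        ".offset": offset,
        ".limit": limit,
        ">modified": modified,
    }
    # Keep '>' unescaped so the result matches the URL in the report.
    return f"{BASE_URL}?{urlencode(params, safe='>')}"


if __name__ == "__main__":
    print(build_events_url("2023-08-21"))
```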

expected behaviour

The response should consist of events filtered by the modified property. The events should have a valid modified date.

actual behaviour

The response includes events not properly filtered by the modified property. The events have invalid modified dates (set in the future).

would you share any observation or additional context about the bug?

Example event:

Looking For https://data.ec2u.eu/events/4b99f7476ec94873d285076fa35f72cc

The API returns the event with created and modified fields set in the future:

    "created": "2024-03-07T23:00:00.0Z",
    "modified": "2024-03-07T23:00:00.0Z",
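
Until this is fixed upstream, an importer could drop such events on its side. The following is only a sketch of that idea, assuming each event is a dict with a `modified` string in the format shown above; `parse_timestamp` and `drop_future_modified` are hypothetical names, not part of the API.

```python
from datetime import datetime, timezone

# Formats assumed from the example above ("2024-03-07T23:00:00.0Z"),
# plus the same shape without fractional seconds.
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp string into an aware datetime."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unsupported timestamp: {value!r}")


def drop_future_modified(events, now=None):
    """Return only the events whose `modified` is not later than `now`."""
    now = now or datetime.now(timezone.utc)
    return [e for e in events if parse_timestamp(e["modified"]) <= now]


if __name__ == "__main__":
    sample = [
        {"id": "a", "modified": "2024-03-07T23:00:00.0Z"},
        {"id": "b", "modified": "2023-08-20T10:00:00Z"},
    ]
    reference = datetime(2023, 8, 21, tzinfo=timezone.utc)
    print(drop_future_modified(sample, now=reference))
```

This hides the symptom on the client only; the affected events would still carry future timestamps in the API.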

@knoan

knoan commented 1 year ago

All affected events are published by https://sortir.grandpoitiers.fr/: will review the ingestion pipeline.

knoan commented 1 year ago

The issue is with the source RSS feed: the ingestion pipeline populates the created/modified timestamps from the RSS pubDate field, which for future events seems to be automatically set to the previous midnight.

No easy workaround unless we opt for ignoring the timestamps.
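
A possible middle ground, sketched here purely for discussion, would be to cap the feed timestamp at ingestion time instead of ignoring it altogether. This is not the pipeline's actual code: `effective_timestamp` is a hypothetical helper, and it assumes pubDate arrives as a standard RFC 822 date string.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def effective_timestamp(pub_date: str, ingested_at: datetime) -> datetime:
    """Return the RSS pubDate, capped at the time of ingestion.

    Hypothetical helper: a pubDate in the future relative to
    `ingested_at` is replaced by `ingested_at`; past dates are kept.
    """
    published = parsedate_to_datetime(pub_date)
    if published.tzinfo is None:
        # Assumption: treat dates without a usable zone as UTC.
        published = published.replace(tzinfo=timezone.utc)
    return min(published, ingested_at)


if __name__ == "__main__":
    ingested = datetime(2023, 8, 21, 12, 0, tzinfo=timezone.utc)
    print(effective_timestamp("Thu, 07 Mar 2024 23:00:00 +0000", ingested))
    print(effective_timestamp("Sun, 20 Aug 2023 10:00:00 +0000", ingested))
```

The trade-off is that a capped event gets a modified value reflecting when it was ingested rather than when the source actually changed it, so re-ingesting the same item would need care to avoid bumping the timestamp every run.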

Unfortunately we don't have a direct contact with the site maintainers.