gbif / pipelines

Pipelines for data processing (GBIF and LivingAtlases)
Apache License 2.0

Step ending, but not being marked as completed #930

Closed: MattBlissett closed this issue 9 months ago

MattBlissett commented 1 year ago

The fragmenter finished:

INFO  [07-12 14:56:21,019+0000] [pipelines_verbatim_fragmenter-3]    org.gbif.pipelines.tasks.PipelinesCallback: Message handler ended - {"datasetUuid":"40e69efe-e113-4eea-9363-0a8eec54cf38","attempt":235,"pipelineSteps":["FRAGMENTER","HDFS_VIEW","INTERPRETED_TO_INDEX","DWCA_TO_VERBATIM","VERBATIM_TO_INTERPRETED"],"numberOfRecords":267628,"numberOfEventRecords":null,"runner":"DISTRIBUTED","repeatAttempt":true,"resetPrefix":null,"executionId":3040476,"endpointType":"DWC_ARCHIVE","validationResult":{"tripletValid":false,"occurrenceIdValid":true,"useExtendedRecordId":null,"numberOfRecords":267628,"numberOfEventRecords":null},"interpretTypes":["LOCATION","TEMPORAL","GRSCICOLL","MULTIMEDIA","BASIC","TAXONOMY","IMAGE","IDENTIFIER_ABSENT","AMPLIFICATION","CLUSTERING","OCCURRENCE","VERBATIM","MEASUREMENT_OR_FACT","LOCATION_FEATURE","AUDUBON","METADATA"],"datasetType":"OCCURRENCE"}

But the step in the pipelines history API still showed as Running.

I think the pipeline CLI tried to update using the webservice:

130.225.43.14 48880 crawler.gbif.org [12/Jul/2023:14:56:20 +0000] "POST http://api.gbif.org/v1/pipelines/history/execution/3040476/finished HTTP/1.1" 204 0 pass_uncacheable '-' 'Java/1.8.0_352'

So maybe the bug is in crawler-ws, or the pipeline should retry.

This affected 3 datasets at a very similar time.

muttcg commented 11 months ago

Added ws retries to all clients. Deployed to PROD; reopen if needed.
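For readers wondering what a client-side retry of this kind can look like, here is a minimal sketch in plain Java. It is not the change that was deployed: `RetrySketch`, `callWithRetry`, and its parameters are hypothetical names made up for illustration, and the simulated failure in `main` stands in for a failing web service call. Note that the POST logged above returned 204, so a retry like this only covers calls that actually fail with an exception.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries a call with exponential backoff. Not the project's code. */
final class RetrySketch {

  private RetrySketch() {}

  /**
   * Runs the call and returns its result. If the call throws, waits and tries again,
   * doubling the wait each time, until maxAttempts calls have been made.
   * The last exception is rethrown if every attempt fails.
   */
  static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialBackoffMs)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long backoffMs = initialBackoffMs;
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (InterruptedException e) {
        // Do not retry when the thread is interrupted; restore the flag and stop.
        Thread.currentThread().interrupt();
        throw e;
      } catch (Exception e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(backoffMs);
          backoffMs *= 2;
        }
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated web service call (assumption): fails twice, then succeeds.
    String result =
        callWithRetry(
            () -> {
              if (++calls[0] < 3) {
                throw new java.io.IOException("simulated transient failure");
              }
              return "finished";
            },
            5,
            10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

In a real client, the lambda would wrap the call that marks the execution as finished, and the retry would probably be limited to exceptions that are known to be transient.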

muttcg commented 9 months ago

Opened as #957