Closed MattBlissett closed 9 months ago
The fragmenter finished:
INFO [07-12 14:56:21,019+0000] [pipelines_verbatim_fragmenter-3] org.gbif.pipelines.tasks.PipelinesCallback: Message handler ended - {"datasetUuid":"40e69efe-e113-4eea-9363-0a8eec54cf38","attempt":235,"pipelineSteps":["FRAGMENTER","HDFS_VIEW","INTERPRETED_TO_INDEX","DWCA_TO_VERBATIM","VERBATIM_TO_INTERPRETED"],"numberOfRecords":267628,"numberOfEventRecords":null,"runner":"DISTRIBUTED","repeatAttempt":true,"resetPrefix":null,"executionId":3040476,"endpointType":"DWC_ARCHIVE","validationResult":{"tripletValid":false,"occurrenceIdValid":true,"useExtendedRecordId":null,"numberOfRecords":267628,"numberOfEventRecords":null},"interpretTypes":["LOCATION","TEMPORAL","GRSCICOLL","MULTIMEDIA","BASIC","TAXONOMY","IMAGE","IDENTIFIER_ABSENT","AMPLIFICATION","CLUSTERING","OCCURRENCE","VERBATIM","MEASUREMENT_OR_FACT","LOCATION_FEATURE","AUDUBON","METADATA"],"datasetType":"OCCURRENCE"}
But the step in the pipelines history API still showed as Running.
I think the pipeline CLI did try to update it through the web service, and the request returned 204:
130.225.43.14 48880 crawler.gbif.org [12/Jul/2023:14:56:20 +0000] "POST http://api.gbif.org/v1/pipelines/history/execution/3040476/finished HTTP/1.1" 204 0 pass_uncacheable '-' 'Java/1.8.0_352'
So maybe the bug is in crawler-ws, or the pipeline should retry.
This affected 3 datasets at a very similar time.
Added WS retries to all clients. Deployed to PROD; reopen if needed.
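For readers unfamiliar with the pattern, a client-side retry can be sketched roughly as below. This is only an illustration, not the code that was deployed: `RetrySketch`, `withRetries`, and the attempt and delay parameters are hypothetical names, and it assumes a failed call surfaces as an exception on the client side.

```java
import java.util.concurrent.Callable;

public class RetrySketch {

  /**
   * Calls the action up to maxAttempts times. After each failed attempt
   * except the last, it sleeps and then doubles the delay (exponential backoff).
   * Hypothetical helper, not part of the pipelines code base.
   */
  public static <T> T withRetries(Callable<T> action, int maxAttempts, long initialDelayMs)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long delayMs = initialDelayMs;
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.call();
      } catch (Exception e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(delayMs);
          delayMs *= 2;
        }
      }
    }
    // Every attempt failed: rethrow the most recent exception.
    throw last;
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Stand-in for a web service call that fails twice, then succeeds.
    String result = withRetries(() -> {
      if (++calls[0] < 3) {
        throw new java.io.IOException("simulated failure");
      }
      return "finished";
    }, 5, 10);
    System.out.println(result + " after " + calls[0] + " calls");
    // prints "finished after 3 calls"
  }
}
```

A retry of this kind only helps when the client can see the failure. A call that returns a success status, like the 204 logged above, would not be retried.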
Opened as #957