gbif / pipelines

Pipelines for data processing (GBIF and LivingAtlases)
Apache License 2.0

HDFS support missing, at least in ALAVerbatimToEventPipeline #1037

Open vjrj opened 4 months ago

Following #729 and #1004, HDFS paths also fail in ALAVerbatimToEventPipeline:

INFO  [2024-02-23 19:25:51,465+0100] [main] au.org.ala.pipelines.beam.ALAVerbatimToEventPipeline: Creating beam pipeline
Exception in thread "main" java.lang.IllegalArgumentException: No filesystem found for scheme hdfs
    at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:495)
    at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:565)
    at org.apache.beam.sdk.io.FileBasedSink.convertToFileResourceIfPossible(FileBasedSink.java:235)
    at org.apache.beam.sdk.io.AvroIO$Write.to(AvroIO.java:1683)
    at org.gbif.pipelines.transforms.Transform.write(Transform.java:139)
    at org.gbif.pipelines.transforms.Transform.write(Transform.java:150)
    at au.org.ala.pipelines.beam.ALAVerbatimToEventPipeline.run(ALAVerbatimToEventPipeline.java:182)
    at au.org.ala.pipelines.beam.ALAVerbatimToEventPipeline.run(ALAVerbatimToEventPipeline.java:98)
    at au.org.ala.pipelines.java.ALAVerbatimToInterpretedPipeline.run(ALAVerbatimToInterpretedPipeline.java:404)
    at au.org.ala.pipelines.java.ALAVerbatimToInterpretedPipeline.run(ALAVerbatimToInterpretedPipeline.java:148)
    at au.org.ala.pipelines.java.ALAVerbatimToInterpretedPipeline.run(ALAVerbatimToInterpretedPipeline.java:142)
    at au.org.ala.pipelines.java.ALAVerbatimToInterpretedPipeline.main(ALAVerbatimToInterpretedPipeline.java:133)
23-Feb 19:25:52 [LA-PIPELINES] [dr1050] [ERROR] Unexpected error during Interpretation dr1050 step
23-Feb 19:25:52 [LA-PIPELINES] [dr1050] [ERROR] Error 1 occurred on 1
Build step 'Execute shell' marked build as failure
New run name is '#982 - small-datasets-batch'
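
For context, the exception in the stack trace above comes from a lookup keyed on the URI scheme of the output path. The following is a minimal sketch of that kind of lookup, assuming a simplified registry that knows only the local `file` scheme until others are registered. The class `SchemeRegistrySketch`, its methods, and the `namenode:8020` address are hypothetical and for illustration only; this is not Beam's actual code.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Simplified, hypothetical model of a scheme-to-filesystem registry (not Beam's code). */
public class SchemeRegistrySketch {

    // Maps a URI scheme to the name of the filesystem implementation that handles it.
    private final Map<String, String> schemeToFileSystem = new HashMap<>();

    public SchemeRegistrySketch() {
        // Assumption: only the local filesystem is available by default.
        schemeToFileSystem.put("file", "LocalFileSystem");
    }

    /** Registers a filesystem implementation name for a URI scheme. */
    public void register(String scheme, String fileSystemName) {
        schemeToFileSystem.put(scheme.toLowerCase(Locale.ROOT), fileSystemName);
    }

    /** Resolves the filesystem for a path; paths without a scheme are treated as local. */
    public String resolve(String path) {
        String scheme = URI.create(path).getScheme();
        if (scheme == null) {
            scheme = "file";
        }
        String fileSystem = schemeToFileSystem.get(scheme.toLowerCase(Locale.ROOT));
        if (fileSystem == null) {
            throw new IllegalArgumentException("No filesystem found for scheme " + scheme);
        }
        return fileSystem;
    }

    public static void main(String[] args) {
        SchemeRegistrySketch registry = new SchemeRegistrySketch();
        String target = "hdfs://namenode:8020/data/dr1050"; // hypothetical path

        try {
            registry.resolve(target);
        } catch (IllegalArgumentException e) {
            // The scheme was never registered, so the lookup fails.
            System.out.println("Before registration: " + e.getMessage());
        }

        registry.register("hdfs", "HadoopFileSystem");
        System.out.println("After registration: " + registry.resolve(target));
    }
}
```

In Beam itself, additional schemes are normally registered when `FileSystems.setDefaultPipelineOptions(options)` is called with the Hadoop filesystem module on the classpath. Whether a missing call of that kind is the cause in this pipeline is an assumption, not something the log confirms.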