RADAR-base / radar-output-restructure

Reads avro files in HDFS and outputs json or csv per topic per user in local file system
Apache License 2.0

Azure storage blob upgrade causes failed jobs #554

Open Bdegraaf1234 opened 3 months ago

Bdegraaf1234 commented 3 months ago

After updating azure-storage-blob to 12.22.1, jobs start failing with the following error:

2024-04-03 20:30:35 ERROR - Failed to run job restructure (Job:37)
java.lang.NullPointerException: Cannot invoke "com.azure.storage.blob.implementation.models.BlobItemPropertiesInternal.getLastModified()" because "this.internalProperties" is null
        at com.azure.storage.blob.models.BlobItemProperties.getLastModified(BlobItemProperties.java:81) ~[azure-storage-blob-12.22.1.jar:12.22.1]
        at org.radarbase.output.source.AzureSourceStorage$list$2.invokeSuspend(AzureSourceStorage.kt:37) ~[radar-output-restructure-2.3.3.jar:2.3.3]
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) ~[kotlin-stdlib-1.9.22.jar:1.9.22-release-704]
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:108) ~[kotlinx-coroutines-core-jvm-1.7.3.jar:?]
        at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584) ~[kotlinx-coroutines-core-jvm-1.7.3.jar:?]
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:793) ~[kotlinx-coroutines-core-jvm-1.7.3.jar:?]
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:697) ~[kotlinx-coroutines-core-jvm-1.7.3.jar:?]
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:684) ~[kotlinx-coroutines-core-jvm-1.7.3.jar:?]

Bdegraaf1234 commented 3 months ago

Looks like the issue is caused by a change in how "prefix" items are handled (presumably these are directories):

https://github.com/Azure/azure-sdk-for-java/issues/38168

Where calling getLastModified() on a prefix item used to return null, it now throws a NullPointerException.
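
To illustrate the failure mode, here is a minimal Kotlin sketch of skipping prefix entries before reading their last-modified time. It does not use the Azure SDK: ListedItem, SourceFile and toSourceFiles are hypothetical stand-ins, and the getter only imitates the throwing behaviour reported in the stack trace. It is not the actual code in AzureSourceStorage.kt.

```kotlin
import java.time.Instant

// Hypothetical stand-in for a listing entry; NOT the Azure SDK's BlobItem.
class ListedItem(
    val name: String,
    val isPrefix: Boolean?,          // true for virtual-directory ("prefix") entries
    private val modified: Instant?,  // absent for prefix entries
) {
    // Imitates the newer behaviour described above: throws instead of returning null.
    val lastModified: Instant
        get() = modified ?: throw NullPointerException("internal properties are null for $name")
}

// Hypothetical result type for the listing.
data class SourceFile(val path: String, val lastModified: Instant)

// Drop prefix entries first, so lastModified is only read on real blobs.
fun toSourceFiles(items: List<ListedItem>): List<SourceFile> =
    items
        .filter { it.isPrefix != true }
        .map { SourceFile(it.name, it.lastModified) }

fun main() {
    val items = listOf(
        ListedItem("topic/partition/", isPrefix = true, modified = null),
        ListedItem(
            "topic/partition/0.avro",
            isPrefix = null,
            modified = Instant.parse("2024-04-03T20:30:35Z"),
        ),
    )
    println(toSourceFiles(items).map { it.path }) // prints [topic/partition/0.avro]
}
```

Without the filter, mapping the first entry would throw, which matches the shape of the failure in the trace. In the real SDK the corresponding check would presumably be made on the listed blob item before its properties are read.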

Bdegraaf1234 commented 3 months ago

As a fix, we'll mirror the approach used in the S3 storage:

https://github.com/RADAR-base/radar-output-restructure/blob/refs/heads/main/src/main/java/org/radarbase/output/source/S3SourceStorage.kt#L61

Bdegraaf1234 commented 3 months ago

This seems to affect only v2, since directories are handled differently in later versions. Merging the fix into branch v2 in #555.