Closed BioinfoTongLI closed 1 year ago
Writing directly to S3 leads to a super long error log, which does not happen when saving locally. Here's the end of the error log:
2022-05-26 07:40:08,940 [pool-1-thread-1] ERROR c.g.bioformats2raw.Converter - Failure processing chunk; resolution=0 plane=1 xx=16384 yy=16384 zz=0 width=304 height=736 depth=1
java.lang.NullPointerException: null
at com.upplication.s3fs.S3AccessControlList.hasPermission(S3AccessControlList.java:39)
at com.upplication.s3fs.S3AccessControlList.checkAccess(S3AccessControlList.java:50)
at com.upplication.s3fs.S3FileSystemProvider.checkAccess(S3FileSystemProvider.java:470)
at java.nio.file.Files.isAccessible(Files.java:2455)
at java.nio.file.Files.isReadable(Files.java:2490)
at com.bc.zarr.storage.FileSystemStore.getInputStream(FileSystemStore.java:61)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:103)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:96)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:92)
at com.glencoesoftware.bioformats2raw.Converter.processChunk(Converter.java:1039)
at com.glencoesoftware.bioformats2raw.Converter.lambda$saveResolutions$4(Converter.java:1286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2022-05-26 07:40:09,207 [pool-1-thread-1] ERROR c.g.bioformats2raw.Converter - Failure processing chunk; resolution=0 plane=2 xx=16384 yy=16384 zz=0 width=304 height=736 depth=1
java.lang.NullPointerException: null
at com.upplication.s3fs.S3AccessControlList.hasPermission(S3AccessControlList.java:39)
at com.upplication.s3fs.S3AccessControlList.checkAccess(S3AccessControlList.java:50)
at com.upplication.s3fs.S3FileSystemProvider.checkAccess(S3FileSystemProvider.java:470)
at java.nio.file.Files.isAccessible(Files.java:2455)
at java.nio.file.Files.isReadable(Files.java:2490)
at com.bc.zarr.storage.FileSystemStore.getInputStream(FileSystemStore.java:61)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:103)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:96)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:92)
at com.glencoesoftware.bioformats2raw.Converter.processChunk(Converter.java:1039)
at com.glencoesoftware.bioformats2raw.Converter.lambda$saveResolutions$4(Converter.java:1286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2022-05-26 07:40:09,207 [main] ERROR c.g.bioformats2raw.Converter - Error while writing series 0
java.util.concurrent.CompletionException: java.lang.NullPointerException
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1298)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1321)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.allOf(CompletableFuture.java:2238)
at com.glencoesoftware.bioformats2raw.Converter.saveResolutions(Converter.java:1314)
at com.glencoesoftware.bioformats2raw.Converter.write(Converter.java:691)
at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:646)
at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:477)
at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:92)
at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
at picocli.CommandLine.call(CommandLine.java:2761)
at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:1808)
Caused by: java.lang.NullPointerException: null
at com.upplication.s3fs.S3AccessControlList.hasPermission(S3AccessControlList.java:39)
at com.upplication.s3fs.S3AccessControlList.checkAccess(S3AccessControlList.java:50)
at com.upplication.s3fs.S3FileSystemProvider.checkAccess(S3FileSystemProvider.java:470)
at java.nio.file.Files.isAccessible(Files.java:2455)
at java.nio.file.Files.isReadable(Files.java:2490)
at com.bc.zarr.storage.FileSystemStore.getInputStream(FileSystemStore.java:61)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:103)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:96)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:92)
at com.glencoesoftware.bioformats2raw.Converter.processChunk(Converter.java:1039)
at com.glencoesoftware.bioformats2raw.Converter.lambda$saveResolutions$4(Converter.java:1286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@6b67034): java.lang.NullPointerException
at picocli.CommandLine.executeUserObject(CommandLine.java:1962)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
at picocli.CommandLine.call(CommandLine.java:2761)
at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:1808)
Caused by: java.lang.NullPointerException
at com.upplication.s3fs.S3AccessControlList.hasPermission(S3AccessControlList.java:39)
at com.upplication.s3fs.S3AccessControlList.checkAccess(S3AccessControlList.java:50)
at com.upplication.s3fs.S3FileSystemProvider.checkAccess(S3FileSystemProvider.java:470)
at java.nio.file.Files.isAccessible(Files.java:2455)
at java.nio.file.Files.isReadable(Files.java:2490)
at com.bc.zarr.storage.FileSystemStore.getInputStream(FileSystemStore.java:61)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:103)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:96)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:92)
at com.glencoesoftware.bioformats2raw.Converter.processChunk(Converter.java:1039)
at com.glencoesoftware.bioformats2raw.Converter.lambda$saveResolutions$4(Converter.java:1286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
@BioinfoTongLI: can you include the command you used? (i.e. is --endpoint-url involved?)
Not using --endpoint-url. The $image is a local image, and the conversion was correct when not using S3.
Command used for writing:
/opt/bioformats2raw/bin/bioformats2raw --output-options s3fs_path_style_access=true ${image} s3://${accessKey}:${secretKey}@webatlas.cog.sanger.ac.uk/deleteme/
If you were accessing this via aws, I think the equivalent would be:
aws --endpoint-url https://cog.sanger.ac.uk s3 ls s3://webatlas/...
with webatlas being the bucket. Adding the bucket to the front of the endpoint ("webatlas.cog.sanger.ac.uk") is virtual-hosted-style as opposed to path-style:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#path-style-access
So perhaps try setting s3fs_path_style_access=false (or just omitting it).
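To make the difference between the two addressing styles concrete, here is a minimal shell sketch. The bucket and endpoint names are taken from this thread; the object key is a made-up example, not an actual object:

```shell
# Path-style vs. virtual-hosted-style addressing for the same object.
# The key "deleteme/0/.zattrs" is a hypothetical example.
bucket="webatlas"
endpoint="cog.sanger.ac.uk"
key="deleteme/0/.zattrs"

path_style="https://${endpoint}/${bucket}/${key}"       # s3fs_path_style_access=true
virtual_hosted="https://${bucket}.${endpoint}/${key}"   # s3fs_path_style_access=false

echo "path-style:     ${path_style}"
echo "virtual-hosted: ${virtual_hosted}"
```

The failing command above used webatlas.cog.sanger.ac.uk as the host, i.e. the virtual-hosted form, while also passing s3fs_path_style_access=true, which is why the mismatch is a plausible suspect.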
Still seeing the same null pointer error.
2022-05-27 08:36:54,083 [main] ERROR c.g.bioformats2raw.Converter - Error while writing series 0
java.util.concurrent.CompletionException: java.lang.NullPointerException
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1298)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1321)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1317)
at java.util.concurrent.CompletableFuture.allOf(CompletableFuture.java:2238)
at com.glencoesoftware.bioformats2raw.Converter.saveResolutions(Converter.java:1314)
at com.glencoesoftware.bioformats2raw.Converter.write(Converter.java:691)
at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:646)
at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:477)
at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:92)
at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
at picocli.CommandLine.call(CommandLine.java:2761)
at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:1808)
Caused by: java.lang.NullPointerException: null
at com.upplication.s3fs.S3AccessControlList.hasPermission(S3AccessControlList.java:39)
at com.upplication.s3fs.S3AccessControlList.checkAccess(S3AccessControlList.java:50)
at com.upplication.s3fs.S3FileSystemProvider.checkAccess(S3FileSystemProvider.java:470)
at java.nio.file.Files.isAccessible(Files.java:2455)
at java.nio.file.Files.isReadable(Files.java:2490)
at com.bc.zarr.storage.FileSystemStore.getInputStream(FileSystemStore.java:61)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:103)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:96)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:92)
at com.glencoesoftware.bioformats2raw.Converter.processChunk(Converter.java:1039)
at com.glencoesoftware.bioformats2raw.Converter.lambda$saveResolutions$4(Converter.java:1286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Command error:
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp921420814146520776/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@6b67034): java.lang.NullPointerException
at picocli.CommandLine.executeUserObject(CommandLine.java:1962)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
at picocli.CommandLine.call(CommandLine.java:2761)
at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:1808)
Caused by: java.lang.NullPointerException
at com.upplication.s3fs.S3AccessControlList.hasPermission(S3AccessControlList.java:39)
at com.upplication.s3fs.S3AccessControlList.checkAccess(S3AccessControlList.java:50)
at com.upplication.s3fs.S3FileSystemProvider.checkAccess(S3FileSystemProvider.java:470)
at java.nio.file.Files.isAccessible(Files.java:2455)
at java.nio.file.Files.isReadable(Files.java:2490)
at com.bc.zarr.storage.FileSystemStore.getInputStream(FileSystemStore.java:61)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:103)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:96)
at com.bc.zarr.ZarrArray.open(ZarrArray.java:92)
at com.glencoesoftware.bioformats2raw.Converter.processChunk(Converter.java:1039)
at com.glencoesoftware.bioformats2raw.Converter.lambda$saveResolutions$4(Converter.java:1286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Afaik, this S3 is not on AWS at all. Instead, we used Ceph (https://www.redhat.com/en/technologies/storage/ceph), but it should be similar to what EBI is using at s3.embassy.ebi.ac.uk/idr-upload...
@prete any ideas?
I assume then that we will need to start using our own S3 filesystem. See https://imagesc.zulipchat.com/#narrow/stream/212929-general/topic/ome-zarr.20basics.3A.20writing.20.20to.20s3/near/281819192 for a related conversation. It would be good to know how much time/space uploading directly saves you, to know how important it is to prioritize this.
I see - it is currently not the most urgent task. Nextflow can do the push and works fine with our Ceph storage. The cost is that we need to duplicate the data before the push. Though, this might become an issue when it comes to the real atlas dataset (100+ whole-embryo images). Let's prioritize this in the next milestone.
Afaik, this S3 is not on AWS at all. Instead, we used Ceph (https://www.redhat.com/en/technologies/storage/ceph), but it should be similar to what EBI is using at s3.embassy.ebi.ac.uk/idr-upload...
Indeed, it's Ceph's RADOS gateway. Note: the aws that Josh used is the awscli tool, which can also talk to S3-compatible storage (like our "Sanger S3"). Think of it as an s3cmd alternative.
Uploading from bioformats2raw should work like this for you:
bioformats2raw \
--output-options "s3fs_access_key=${accessKey}|s3fs_secret_key=${secretKey}|s3fs_path_style_access=true" \
${image} \
s3://cog.sanger.ac.uk/webatlas/deleteme/
Keep in mind that uploading straight to S3 will slow down the process, because uploading is slower than disk I/O. But, like you said, it won't duplicate the data and you won't have to copy it afterwards... so it's up to you!
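For reference, the --output-options value in the command above is a pipe-separated list of key=value pairs. A minimal sketch of how such a string splits into its parts (this is only an illustration of the format, not bioformats2raw's actual parser; the credential values are placeholders):

```shell
# Split a pipe-separated key=value options string (bash).
# ACCESS_PLACEHOLDER / SECRET_PLACEHOLDER are not real credentials.
opts="s3fs_access_key=ACCESS_PLACEHOLDER|s3fs_secret_key=SECRET_PLACEHOLDER|s3fs_path_style_access=true"

IFS='|' read -r -a pairs <<< "$opts"
for kv in "${pairs[@]}"; do
  key="${kv%%=*}"   # text before the first '='
  val="${kv#*=}"    # text after the first '='
  echo "${key} -> ${val}"
done
```

Quoting the whole --output-options argument matters: an unquoted | would be interpreted by the shell as a pipe.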
Thanks @prete! Interestingly, your syntax works....
I am pretty sure the original authentication version works as well, since I do have files created by bioformats2raw. But it seems that passing the credentials through --output-options is the right way.
@joshmoore worth an issue to glencoe?
My best guess is that the difference is s3://cog.sanger.ac.uk/webatlas vs. s3://webatlas.cog.sanger.ac.uk/. All this comes down to the fact that the S3 "standard" is a far cry from POSIX. You can open an issue on bioformats2raw, but this is more a question of the underlying FileSystem implementation -- https://github.com/lasersonlab/Amazon-S3-FileSystem-NIO2 -- and if you look at the upstream repo's issues (https://github.com/Upplication/Amazon-S3-FileSystem-NIO2/issues) you'll see that the latest one is "is this dead?". I've brought this up a few times on image.sc. Ultimately, we will likely need to work on a single implementation as a community.
All seems to be working fine. Closing this for now. Reopen if needed.
Bump the tif-to-zarr conversion from 0.2 to the latest stable version (0.4.0). Currently using this image: https://hub.docker.com/layers/bioformats2raw/openmicroscopy/bioformats2raw/0.4.0/images/sha256-29e650dca4610898d2c5d7639c350f172d3f4d0d0aea7078454b76e10245b0c7?context=explore
Vitessce works with this version as well.
Though, the conversion is currently only done locally. Use this option to write directly to S3: https://github.com/glencoesoftware/bioformats2raw/pull/89