Feature Description
If a file is hosted publicly on S3, a user who has no AWS credentials set up must pass fs_kwargs explicitly: BioImage("s3://bucketname/path/to/file", fs_kwargs=dict(anon=True)). (I'm thinking specifically about OME-Zarr files, but this is likely relevant to all readers.)
Instead, bioio should be able to handle this internally and let the user write BioImage("s3://bucketname/path/to/file").
Solution
As far as I can tell, the proper way to check whether a user has access to a file is to attempt to read it and see whether that raises an error. So I think the solution is to try reading the file up to twice, with logic similar to the following:
try:
    # __init__ with user's fs_kwargs
except SomethingSpecific:
    if protocol == "s3://":
        # retry __init__ with user's fs_kwargs plus {"anon": True}
    else:
        raise
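To make the retry logic above concrete, here is a minimal runnable sketch. It is not bioio's implementation: the helper name `open_with_anon_fallback` is hypothetical, and it assumes the credential failure surfaces as `PermissionError`. The exception raised in practice may differ depending on the s3fs/botocore versions, which is why the exception types are a parameter.

```python
from typing import Any, Callable, Optional, Tuple, Type
from urllib.parse import urlparse


def open_with_anon_fallback(
    opener: Callable[..., Any],
    uri: str,
    fs_kwargs: Optional[dict] = None,
    auth_errors: Tuple[Type[BaseException], ...] = (PermissionError,),
) -> Any:
    """Call opener(uri, fs_kwargs=...), retrying once anonymously for S3 URIs.

    Hypothetical helper. `auth_errors` is an assumption about which
    exceptions signal missing or rejected credentials.
    """
    fs_kwargs = dict(fs_kwargs or {})
    try:
        # First attempt: exactly what the user asked for.
        return opener(uri, fs_kwargs=fs_kwargs)
    except auth_errors:
        is_s3 = urlparse(uri).scheme == "s3"
        # Do not retry for other protocols, or when the user already
        # chose an explicit value for "anon".
        if not is_s3 or "anon" in fs_kwargs:
            raise
        # Second attempt: same kwargs plus anonymous access.
        return opener(uri, fs_kwargs={**fs_kwargs, "anon": True})
```

With this helper, a call such as `open_with_anon_fallback(BioImage, "s3://bucketname/path/to/file")` would first try the user's settings and fall back to `anon=True` only on an authorization error. One cost to keep in mind is that a genuinely private file now triggers two failed requests before the error reaches the user.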
Alternatives
It looks like this was tried in s3fs itself, but the change was reverted: “unfortunately, it led to far more problems than it solved. I’d be happy to see a more solid implementation, if some wants to try.”
This came up again with our internal scientists. We need a good fix for this so that our users don't waste time figuring out why their S3 URLs don't work.