Open bsiranosian opened 2 years ago
A note that I'll be looking into this for Python as well!
An alternative we could look into:
It doesn't currently look like they have S3 support?
Can help out with this, but it's been almost a decade since I've written anything in R.
I have a couple of tasks I need to work out by EOW, but I'll take a shot at implementing S3 subsetting (preferably in Python, but I can figure it out for R) if it's not implemented in our preferred solution. There are a number of libraries I've encountered like this where it's just not really there. :\
No worries if you don't have a good immediate solution - just tagged you for visibility since I thought you might have a good idea. I believe we're facing the same issue with the Python implementation as well.
Data subsetting when reading directly from S3 does not currently work when implemented like this:
Instead, the whole file is downloaded to a temp directory, and a portion of it is read from there.
This should be possible as rhdf5 supports read-only access to files in S3: https://www.bioconductor.org/packages/devel/bioc/vignettes/rhdf5/inst/doc/rhdf5_cloud_reading.html
However, I'm currently hitting the error described here, and haven't gone any further: https://support.bioconductor.org/p/9134972/
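
For anyone picking this up on the Python side, here is a rough sketch of the mechanism that subsetting from S3 relies on: fetching only a byte range instead of downloading the whole object. It is not how rhdf5 or any particular HDF5 library does it. `byte_range_header`, `RangedReader`, and the in-memory `fake_fetch` are hypothetical names made up for illustration, and the boto3 call in the comment uses placeholder bucket and key names.

```python
from typing import Callable


def byte_range_header(offset: int, length: int) -> str:
    """Build an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # The end position in an HTTP byte range is inclusive.
    return f"bytes={offset}-{offset + length - 1}"


class RangedReader:
    """Hypothetical reader that fetches only the bytes that are asked for."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch          # callable taking a Range header value
        self._pos = 0                # current read position
        self.bytes_transferred = 0   # how much data was actually fetched

    def seek(self, offset: int) -> None:
        self._pos = offset

    def read(self, length: int) -> bytes:
        data = self._fetch(byte_range_header(self._pos, length))
        self._pos += len(data)
        self.bytes_transferred += len(data)
        return data


# Stand-in for a remote object so the sketch runs without network access.
# Against real S3, the fetch callable could be something like (assumption,
# placeholder names):
#   lambda r: s3.get_object(Bucket="my-bucket", Key="data.h5", Range=r)["Body"].read()
blob = bytes(range(256)) * 4096  # 1 MiB of dummy data


def fake_fetch(range_header: str) -> bytes:
    start, end = (int(part) for part in range_header.split("=")[1].split("-"))
    return blob[start:end + 1]


reader = RangedReader(fake_fetch)
reader.seek(1000)
chunk = reader.read(16)
print(len(blob), reader.bytes_transferred)
```

The point is just that only the requested 16 bytes are transferred rather than the full 1 MiB, which is the behaviour we'd want in place of the temp-directory download.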