mobie / mobie-viewer-fiji

Access HCS data stored as tif from S3 #1076

Closed by constantinpape 6 months ago

constantinpape commented 6 months ago

We would like to access HCS data stored as tif (in the format explained in #1075) directly from S3.

What do you think is the best way to implement this, @tischi?

constantinpape commented 6 months ago

Quoting @tischi from #1075:

This may work to open TIFF from S3, using our current mobie-io code base:

InputStream inputStream = IOHelper.getInputStream( s3address );
ImagePlus imagePlus = ( new Opener() ).openTiff( inputStream, "name" );

tischi commented 6 months ago

This is one of those horrible cases, where the TIFF cannot be opened with the normal IJ TIFF opener.

I really do not know what these companies are thinking.....

Thus the above will not work and we have to use BioFormats instead :-(

I will have to dig into this.

constantinpape commented 6 months ago

"I really do not know what these companies are thinking....."

I would just say they're actually not thinking at all ...

Thanks for looking into this so quickly!!!

constantinpape commented 6 months ago

So from what I understand from https://bio-formats.readthedocs.io/en/latest/developers/in-memory.html it is possible to read with bioformats from a filestream, correct?

tischi commented 6 months ago

Good news, I can read from S3 using the above link (it is not fast though).

        // Uses java.io.*, ij.ImagePlus, loci.common.Location and loci.common.ByteArrayHandle;
        // IOHelper and MoBIEHelper are MoBIE helper classes.
        long start;
        for ( int i = 0; i < 5; i++ )
        {
            // Local file: read it fully into memory, then open it via a mapped id
            start = System.currentTimeMillis();
            File inputFile = new File( "/Users/tischer/Downloads/incu-test-data/2207/19/1110/262/B3-1-C2.tif" );
            byte[] inBytes = new byte[ ( int ) inputFile.length() ];
            try ( DataInputStream in = new DataInputStream( new FileInputStream( inputFile ) ) )
            {
                in.readFully( inBytes );
            }
            //System.out.println( inBytes.length + " bytes read from File." );
            // map input id string to input byte array
            Location.mapFile( "file.tif", new ByteArrayHandle( inBytes ) );
            ImagePlus imagePlusMapped = MoBIEHelper.openWithBioFormats( "file.tif", 0 );
            //imagePlusMapped.show();
            System.out.println( "Mapped File [ms]: " + ( System.currentTimeMillis() - start ) );

            // S3: download the whole object into memory, then open it the same way
            start = System.currentTimeMillis();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try ( InputStream inputStream = IOHelper.getInputStream( "https://s3.embl.de/i2k-2020/incu-test-data/2207/19/1110/262/B3-1-C2.tif" ) )
            {
                int nRead;
                byte[] data = new byte[ 1024 ];
                while ( ( nRead = inputStream.read( data, 0, data.length ) ) != -1 )
                {
                    buffer.write( data, 0, nRead );
                }
            }
            byte[] byteArray = buffer.toByteArray();
            //System.out.println( byteArray.length + " bytes read from S3." );
            Location.mapFile( "s3.tif", new ByteArrayHandle( byteArray ) );
            ImagePlus imagePlusS3 = MoBIEHelper.openWithBioFormats( "s3.tif", 0 );
            //imagePlusS3.show();
            System.out.println( "S3 [ms]: " + ( System.currentTimeMillis() - start ) );
        }

yields

Mapped File [ms]: 2755
S3 [ms]: 6473
Mapped File [ms]: 83
S3 [ms]: 2245
Mapped File [ms]: 90
S3 [ms]: 1281
Mapped File [ms]: 69
S3 [ms]: 1269
Mapped File [ms]: 77
S3 [ms]: 1296
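
The first iteration is much slower than the rest in both cases, which probably reflects one-time start-up cost rather than read speed (an assumption, not something measured here). A rough way to summarize the numbers above is to average them while skipping that first iteration. The class and method names below are made up for illustration and are not part of MoBIE or Bio-Formats:

```java
import java.util.Arrays;

final class TimingSummary
{
    // Mean of all timings except the first one, which is treated as warm-up.
    // Returns NaN if there is nothing left to average.
    static double meanSkippingFirst( long[] millis )
    {
        return Arrays.stream( millis ).skip( 1 ).average().orElse( Double.NaN );
    }

    public static void main( String[] args )
    {
        // Timings copied from the output above
        long[] mappedFile = { 2755, 83, 90, 69, 77 };
        long[] s3 = { 6473, 2245, 1281, 1269, 1296 };
        System.out.println( "Mapped File mean [ms]: " + meanSkippingFirst( mappedFile ) ); // 79.75
        System.out.println( "S3 mean [ms]: " + meanSkippingFirst( s3 ) ); // 1522.75
    }
}
```

With only four samples per case this is just an indication: for this file, reading from S3 is on the order of 1.3 to 1.5 s, compared to under 0.1 s for the local file.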

tischi commented 6 months ago

This works now by downloading the whole object into a byte[] and then using Location.mapFile from BioFormats to pretend that this is a file. Maybe not ideal, but probably out of scope to improve this.
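
The "download the whole object into a byte[]" step can be sketched with the Java standard library alone. This is a minimal sketch, not the actual MoBIE implementation: the class `StreamBytes`, its method `readAllBytes`, and the 64 KiB chunk size are assumptions made for illustration. In the real code path the stream would come from `IOHelper.getInputStream( ... )`, and the resulting array would be handed to `Location.mapFile( id, new ByteArrayHandle( bytes ) )` as in the benchmark above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class StreamBytes
{
    // Hypothetical helper: drains the stream into memory and returns its bytes.
    // The caller stays responsible for closing the stream.
    static byte[] readAllBytes( InputStream in )
    {
        try
        {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[ 64 * 1024 ]; // chunk size is an arbitrary choice
            int nRead;
            while ( ( nRead = in.read( chunk, 0, chunk.length ) ) != -1 )
            {
                buffer.write( chunk, 0, nRead );
            }
            return buffer.toByteArray();
        }
        catch ( IOException e )
        {
            throw new UncheckedIOException( e );
        }
    }

    public static void main( String[] args ) throws IOException
    {
        // Stand-in for the S3 stream, so the sketch runs without network access
        byte[] fakeObject = new byte[ 200_000 ];
        try ( InputStream in = new ByteArrayInputStream( fakeObject ) )
        {
            byte[] bytes = readAllBytes( in );
            System.out.println( bytes.length + " bytes read" );
        }
    }
}
```

Since the whole object is held in memory, this only suits images that fit comfortably in the heap, which matches the "maybe not ideal" caveat above.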