yayahjb / cbflib

CBFlib repository cloned from SF CBFlib repository as of 1 Dec 15
cbf2nexus appears to have an O(n^2) feature #1

Open graeme-winter opened 7 years ago

graeme-winter commented 7 years ago

Using cbf2nexus to make an NXmx file from a few hundred CBF files I find that the initial file compression tasks are very fast:

Time to convert 'l-cyst_01_00005.cbf': 0.050s
Time to read 'l-cyst_01_00006.cbf': 0.000s
Time to convert 'l-cyst_01_00006.cbf': 0.070s
Time to read 'l-cyst_01_00007.cbf': 0.000s
Time to convert 'l-cyst_01_00007.cbf': 0.060s
Time to read 'l-cyst_01_00008.cbf': 0.000s
Time to convert 'l-cyst_01_00008.cbf': 0.070s
Time to read 'l-cyst_01_00009.cbf': 0.000s
Time to convert 'l-cyst_01_00009.cbf': 0.070s
Time to read 'l-cyst_01_00010.cbf': 0.000s
Time to convert 'l-cyst_01_00010.cbf': 0.050s
Time to read 'l-cyst_01_00011.cbf': 0.010s
Time to convert 'l-cyst_01_00011.cbf': 0.060s
Time to read 'l-cyst_01_00012.cbf': 0.000s
Time to convert 'l-cyst_01_00012.cbf': 0.080s
Time to read 'l-cyst_01_00013.cbf': 0.000s
Time to convert 'l-cyst_01_00013.cbf': 0.060s
Time to read 'l-cyst_01_00014.cbf': 0.000s
Time to convert 'l-cyst_01_00014.cbf': 0.070s
Time to read 'l-cyst_01_00015.cbf': 0.000s
Time to convert 'l-cyst_01_00015.cbf': 0.070s
Time to read 'l-cyst_01_00016.cbf': 0.010s
Time to convert 'l-cyst_01_00016.cbf': 0.060s
Time to read 'l-cyst_01_00017.cbf': 0.000s
Time to convert 'l-cyst_01_00017.cbf': 0.070s
Time to read 'l-cyst_01_00018.cbf': 0.000s

However, after a few hundred images the conversions slow down a great deal:

Time to convert 'l-cyst_01_00686.cbf': 1.690s
Time to read 'l-cyst_01_00687.cbf': 0.000s
Time to convert 'l-cyst_01_00687.cbf': 1.690s
Time to read 'l-cyst_01_00688.cbf': 0.010s
Time to convert 'l-cyst_01_00688.cbf': 1.700s
Time to read 'l-cyst_01_00689.cbf': 0.000s
Time to convert 'l-cyst_01_00689.cbf': 1.690s
Time to read 'l-cyst_01_00690.cbf': 0.010s
Time to convert 'l-cyst_01_00690.cbf': 1.740s
Time to read 'l-cyst_01_00691.cbf': 0.000s
Time to convert 'l-cyst_01_00691.cbf': 1.720s
Time to read 'l-cyst_01_00692.cbf': 0.000s
Time to convert 'l-cyst_01_00692.cbf': 1.720s
Time to read 'l-cyst_01_00693.cbf': 0.000s
Time to convert 'l-cyst_01_00693.cbf': 1.720s
Time to read 'l-cyst_01_00694.cbf': 0.000s
Time to convert 'l-cyst_01_00694.cbf': 1.740s
Time to read 'l-cyst_01_00695.cbf': 0.000s
Time to convert 'l-cyst_01_00695.cbf': 1.750s
Time to read 'l-cyst_01_00696.cbf': 0.000s
Time to convert 'l-cyst_01_00696.cbf': 1.740s
Time to read 'l-cyst_01_00697.cbf': 0.000s
Time to convert 'l-cyst_01_00697.cbf': 1.750s
Time to read 'l-cyst_01_00698.cbf': 0.010s
Time to convert 'l-cyst_01_00698.cbf': 1.750s
Time to read 'l-cyst_01_00699.cbf': 0.000s
Time to convert 'l-cyst_01_00699.cbf': 1.760s
Time to read 'l-cyst_01_00700.cbf': 0.000s
Time to convert 'l-cyst_01_00700.cbf': 1.760s
Time to read 'l-cyst_01_00701.cbf': 0.000s

I thought it may be worth checking the code to see whether there is some loop that scales as the number of frames squared, or similar.
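One way to test the quadratic hypothesis from the log itself: if each conversion costs time proportional to the number of frames already written, the per-frame time grows linearly with the frame number and the total time grows as its square. The following is a minimal sketch, assuming the log format shown in this comment and that the frame number is the trailing integer in the file name. The helper names `parse_convert_times` and `fit_line` are made up for illustration.

```python
import re

# Matches entries such as: Time to convert 'l-cyst_01_00686.cbf': 1.690s
CONVERT_RE = re.compile(r"Time to convert '[^']*?_(\d+)\.cbf': ([0-9.]+)s")


def parse_convert_times(log_text):
    """Return (frame_number, seconds) pairs for every 'Time to convert' entry."""
    return [(int(n), float(t)) for n, t in CONVERT_RE.findall(log_text)]


def fit_line(points):
    """Ordinary least-squares fit: seconds = slope * frame + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# A few entries copied from the log in this comment.
sample = """
Time to convert 'l-cyst_01_00005.cbf': 0.050s
Time to convert 'l-cyst_01_00010.cbf': 0.050s
Time to convert 'l-cyst_01_00015.cbf': 0.070s
Time to convert 'l-cyst_01_00686.cbf': 1.690s
Time to convert 'l-cyst_01_00695.cbf': 1.750s
Time to convert 'l-cyst_01_00700.cbf': 1.760s
"""

points = parse_convert_times(sample)
slope, intercept = fit_line(points)
print(f"per-frame cost grows by about {slope * 1000:.2f} ms for each extra frame")
```

A clearly positive slope over the whole run is what a per-frame cost proportional to the frame count would look like. Six points only show the idea, so the full log would be needed for a real fit.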

graeme-winter commented 7 years ago

Final wall clock times attached

graeme-winter commented 7 years ago

[Attached plot: times]

graeme-winter commented 7 years ago

X is the number of frames; Y is the time to convert per frame.

graeme-winter commented 7 years ago

Finally, and also interesting: it did not work correctly. The final HDF5 file contains only one image (which is the bigger problem). Perhaps I did not use it correctly?

cbf2nexus -o l-cyst_01.h5 -c zlib l-cyst_01_0*cbf

graeme-winter commented 7 years ago
         DATASET "data" {
            DATATYPE  H5T_STD_I32LE
            DATASPACE  SIMPLE { ( 1, 1679, 1475 ) / ( H5S_UNLIMITED, 1679, 1475 ) }
            DATA {
            (0,0,0): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,19): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,37): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
            (0,0,55): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
            (0,0,73): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,91): 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,109): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
            (0,0,127): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,145): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,163): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,181): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,0,199): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
....
            (0,1678,1270): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1287): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1304): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1321): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1338): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1355): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1372): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1389): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1406): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1423): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1440): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
            (0,1678,1457): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            (0,1678,1474): 0
            }
            ATTRIBUTE "signal" {
               DATATYPE  H5T_STD_I32LE
               DATASPACE  SCALAR
               DATA {
               (0): 1
               }
            }
         }
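The DATASPACE line in this dump already shows how many images were written: the first tuple is the current shape and the second is the maximum shape. Here is a small sketch that reads the frame count from h5dump text, assuming the layout shown above. `parse_dataspace` is a hypothetical helper, not part of CBFlib or HDF5.

```python
import re

# Captures the current and maximum dimensions of a SIMPLE dataspace.
DATASPACE_RE = re.compile(
    r"DATASPACE\s+SIMPLE\s*\{\s*\(([^)]*)\)\s*/\s*\(([^)]*)\)\s*\}"
)


def parse_dataspace(h5dump_text):
    """Return (current_shape, max_shape) for the first SIMPLE dataspace found.

    Unlimited dimensions are reported as None.
    """
    match = DATASPACE_RE.search(h5dump_text)
    if match is None:
        raise ValueError("no SIMPLE dataspace found")

    def to_dims(text):
        dims = []
        for item in text.split(","):
            item = item.strip()
            dims.append(None if item == "H5S_UNLIMITED" else int(item))
        return tuple(dims)

    return to_dims(match.group(1)), to_dims(match.group(2))


line = "DATASPACE  SIMPLE { ( 1, 1679, 1475 ) / ( H5S_UNLIMITED, 1679, 1475 ) }"
current, maximum = parse_dataspace(line)
print("frames stored:", current[0])
print("frame shape:", current[1:])
print("can grow along first axis:", maximum[0] is None)
```

Here the first dimension is 1, so the dataset holds a single 1679 x 1475 frame even though it is allowed to grow along that axis.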
graeme-winter commented 7 years ago

minicbf2nexus appears to work as expected, i.e. including more frames makes the h5 files larger. It also appears to take more time per frame as extra data are included.

graeme-winter commented 7 years ago

minicbf2nexus worked well (though the compression did slow down from ~0.2 seconds / image early on to 2.4 seconds / image at the end). The resulting data file works correctly in DIALS.

graeme-winter commented 7 years ago

For information - timing seems better behaved when using bslz4 compression => issue probably not in minicbf2nexus code per se.

yayahjb commented 7 years ago

Dear Graeme,

The way the NXmx structure works is that each image is a slice of a larger array.

Regards, Herbert

yayahjb commented 7 years ago

Dear Herbert

First hundred of these should work fine

https://sandbox.zenodo.org/record/51405/files/l-cyst_01.tar.gz

(data small - whole run here < 200MB)

Best wishes Graeme

yayahjb commented 7 years ago

Dear Herbert

I appreciate this however the first index into this array ranges from 0 to 0 i.e. there is only one slice there.

Best wishes Graeme

yayahjb commented 7 years ago

Dear Graeme,

Thank you. I'll look into it.

Regards, Herbert

yayahjb commented 7 years ago

Thanks.

yayahjb commented 7 years ago

Dear Graeme,

Could you give me access to a few of those cbf's?

Regards, Herbert

yayahjb commented 7 years ago

Dear Graeme,

Exactly how did you run it?

Regards, Herbert

yayahjb commented 7 years ago

Dear Herbert

As follows:

/home/gw56/svn/cbflib_build/bin/cbf2nexus -c zlib -o test.h5 l-cyst_01_000*cbf

Best wishes

Graeme

graeme-winter commented 7 years ago

Dear Herbert,

Also, with minicbf2nexus I don't think compression filters other than zlib are working correctly:

[gw56@cs03r-sc-serv-16 INS1_1]$ du -hs *h5
112G    INS1_1_bslz4.h5
8.4G    INS1_1.h5
112G    INS1_1_lz4-2.h5

from

/home/gw56/svn/cbflib_build/bin/minicbf2nexus -C config -c bslz4 -o INS1_1_bslz4.h5 INS1_1_*cbf

/home/gw56/svn/cbflib_build/bin/minicbf2nexus -C config -c 'lz4**2' -o INS1_1_lz4-2.h5 INS1_1_*cbf

i.e. the zlib filter gives a small data set (the original 4800 images are 28GB in CBF format, and around 4GB when gzip compressed), whereas 112GB feels like raw uncompressed int32s. This was built using cmake on RHEL6. I am not in the office for a few days but will take a look when I am back in.

Best wishes Graeme
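A quick arithmetic check of the "raw uncompressed int32" guess. The thread does not give the image dimensions for the INS1_1 data, so the 2463 x 2527 pixel frame used below is an assumption (a common 6M-pixel detector format), as is reading the `du -h` figure as GiB.

```python
BYTES_PER_PIXEL = 4          # int32
N_IMAGES = 4800              # image count quoted in the comment above
WIDTH, HEIGHT = 2463, 2527   # assumed frame size, not stated in the thread


def uncompressed_gib(n_images, width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in GiB of n_images raw frames with no compression or file overhead."""
    return n_images * width * height * bytes_per_pixel / 2**30


size = uncompressed_gib(N_IMAGES, WIDTH, HEIGHT)
print(f"{size:.1f} GiB")
```

Under those assumptions the raw size comes out a little over 111 GiB, close to the 112G reported for the bslz4 and lz4 files, which is consistent with the data having been written without compression.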

yayahjb commented 7 years ago

That usually means the dynamic load of filters failed. I'll try to track it down.
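For context, HDF5 looks for dynamically loaded filter plugins (such as LZ4 and bitshuffle) in the directories named by the `HDF5_PLUGIN_PATH` environment variable, falling back to a compiled-in default, which is usually `/usr/local/hdf5/lib/plugin` on Unix builds. The sketch below lists what is visible on that search path. The helper name is made up, the check is a heuristic based on file names only, and it does not prove that a given filter actually loads.

```python
import os

DEFAULT_PLUGIN_DIR = "/usr/local/hdf5/lib/plugin"  # usual Unix default, assumed here


def find_plugin_libraries(env=None):
    """Map each directory on the HDF5 plugin search path to its shared libraries.

    A directory that does not exist maps to None.
    """
    env = os.environ if env is None else env
    raw = env.get("HDF5_PLUGIN_PATH", DEFAULT_PLUGIN_DIR)
    found = {}
    for directory in raw.split(os.pathsep):
        if not directory:
            continue
        if not os.path.isdir(directory):
            found[directory] = None
            continue
        found[directory] = sorted(
            name
            for name in os.listdir(directory)
            if name.endswith((".so", ".dylib", ".dll"))
        )
    return found


for directory, libraries in find_plugin_libraries().items():
    if libraries is None:
        print(f"{directory}: directory not found")
    elif not libraries:
        print(f"{directory}: no shared libraries")
    else:
        print(f"{directory}: {', '.join(libraries)}")
```

If the directory is missing or contains no LZ4 or bitshuffle library, that would fit the symptom of output files that are the size of uncompressed data.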

yayahjb commented 7 years ago

The timing problem appears to be an inefficiently written axis search. Reworking it to use a hash table. -- HJB
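To illustrate why a search of this kind can produce the observed slowdown: if each frame adds an entry to a list that is then scanned from the start on every lookup, the work per frame grows with the number of frames already processed, and the total grows quadratically. The sketch below contrasts that with a hash-table lookup. It is an illustration in Python under assumed names (`AxisListIndex`, `AxisHashIndex`, the `axis_N` ids), not the CBFlib code, which is written in C.

```python
class AxisListIndex:
    """Stores (axis_id, value) pairs in a list; each lookup scans from the start."""

    def __init__(self):
        self._items = []
        self.comparisons = 0

    def add(self, axis_id, value):
        self._items.append((axis_id, value))

    def find(self, axis_id):
        for key, value in self._items:
            self.comparisons += 1
            if key == axis_id:
                return value
        return None


class AxisHashIndex:
    """Stores the same pairs in a dict, so lookup cost does not grow with size."""

    def __init__(self):
        self._items = {}

    def add(self, axis_id, value):
        self._items[axis_id] = value

    def find(self, axis_id):
        return self._items.get(axis_id)


def total_comparisons(n_frames):
    """Add one entry per frame and look it up, counting list comparisons."""
    index = AxisListIndex()
    for frame in range(n_frames):
        axis_id = f"axis_{frame}"
        index.add(axis_id, frame)
        index.find(axis_id)
    return index.comparisons


for n in (100, 200, 400):
    print(n, total_comparisons(n))
```

With the list, doubling the number of frames roughly quadruples the comparison count, since n frames cost n(n+1)/2 comparisons in total. The dict version does one hashed lookup per frame regardless of how many entries exist.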