Write the erased image in the data type of the input image

dmichalak commented 1 year ago

Is it possible for the output image to be written in the same data type as the input image? Currently, fidder always writes 32-bit floats, which can greatly inflate the file size.

I've forked the repository and tried changing mrcfile.write in https://github.com/teamtomo/fidder/blob/main/src/fidder/erase/cli.py to write with np.float16 but the header in the output says the map mode is 12 (16-bit float).

mrcfile.write(
    name=output_image,
    data=np.array(erased_images, dtype=np.float16),
    voxel_size=pixel_spacing,
    overwrite=True,
)

A less elegant workaround I have is to use newstack to rewrite the output MRC file in map mode 1.

alisterburt commented 1 year ago

Hey Dennis,

Your message implies that your input images contain 16-bit integers (map mode 1). Where are you getting motion-corrected images that contain integer values?

> I've forked the repository and tried changing mrcfile.write in https://github.com/teamtomo/fidder/blob/main/src/fidder/erase/cli.py to write with np.float16 but the header in the output says the map mode is 12 (16-bit float).

This seems like the expected outcome to me. Is it unexpected for you? The file size should also be equivalent to that of a file containing 16-bit integers written by IMOD (map mode 1).

I can definitely add a check for 16-bit input and write 16-bit output if that's the case, but I will wait for your response to the above before moving forward.
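
A minimal sketch of what such a check could look like, assuming the erased images are available as a float array and the dtype of the input image is known. `cast_like_input` is a hypothetical helper for illustration, not part of fidder; rounding and clipping for integer targets is one possible policy among several.

```python
import numpy as np


def cast_like_input(erased: np.ndarray, input_dtype) -> np.ndarray:
    """Cast erased image data back to the dtype of the input image.

    Hypothetical helper for illustration, not part of fidder.
    """
    input_dtype = np.dtype(input_dtype)
    if np.issubdtype(input_dtype, np.integer):
        # Round first, then clip to the representable range so that
        # out-of-range values saturate instead of wrapping around.
        info = np.iinfo(input_dtype)
        rounded = np.rint(erased)
        return np.clip(rounded, info.min, info.max).astype(input_dtype)
    # Floating-point input: a plain cast is enough.
    return erased.astype(input_dtype)


erased = np.array([[86.4, 3877.6], [-5.0, 40000.0]], dtype=np.float32)
as_int16 = cast_like_input(erased, np.int16)
print(as_int16.dtype, as_int16.tolist())
```

The result could then be passed as `data=` to `mrcfile.write`. As the float16 attempt above shows, mrcfile derives the map mode from the array dtype, so an int16 array should end up as map mode 1.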

dmichalak commented 1 year ago

Hi Alister,

I might be misunderstanding something here regarding data types. We use alignframes to generate the motion corrected tilt frames and do not specify any map mode. Is it unusual to use integers instead of floats?

Here are the headers of a tilt movie and a motion corrected tilt stack:

 RO image file on unit   1 : frames/B9_tomo_80_2.0_Jan05_19.48.16.mrc     Size=     368281 K

 Number of columns, rows, sections .....    8184   11520       4
 Map mode ..............................    0   (byte)                     
 Start cols, rows, sects, grid x,y,z ...    0     0     0    8184  11520      4
 Pixel spacing (Angstroms)..............   1.082      1.082      1.082    
 Cell angles ...........................   90.000   90.000   90.000
 Fast, medium, slow axes ...............    X    Y    Z
 Origin on x,y,z .......................    0.000       0.000       0.000    
 Minimum density .......................   0.0000    
 Maximum density .......................   77.000    
 Mean density ..........................   5.6367    
 tilt angles (original,current) ........   0.0   0.0   0.0   0.0   0.0   0.0
 Space group,# extra bytes,idtype,lens .        0        0        0        0

     1 Titles :
SerialEMCCD: Frames . ., scaled by 0.50  r/f 7          05-Jan-23  19:48:18

RO image file on unit   1 : ts001/test_ts001_bin10.mrc     Size=      93866 K

 Number of columns, rows, sections .....     818    1152      51
 Map mode ..............................    1   (16-bit integer)           
 Start cols, rows, sects, grid x,y,z ...    0     0     0     818   1152     51
 Pixel spacing (Angstroms)..............   10.82      10.82      10.82    
 Cell angles ...........................   90.000   90.000   90.000
 Fast, medium, slow axes ...............    X    Y    Z
 Origin on x,y,z .......................    0.000       0.000       0.000    
 Minimum density .......................   86.000    
 Maximum density .......................   3877.0    
 Mean density ..........................   1995.4    
 tilt angles (original,current) ........   0.0   0.0   0.0   0.0   0.0   0.0
 Space group,# extra bytes,idtype,lens .        0      204        0        0

     5 Titles :
SerialEM: Digitized on Titan Krios @ B37 Bioquantum-K3  05-Jan-23  22:13:27    
    Tilt axis angle = 178.9, binning = 0.5  spot = 5  camera = 0 dosym = 0.0   
alignframes: summed frames scaled by 1, reduced 10      21-Jun-23  14:47:46
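
The file sizes in these headers can be related to the map mode with some simple arithmetic. The sketch below assumes the standard 1024-byte MRC main header plus the extra header bytes reported above, and the usual bytes per voxel for each mode (1 for mode 0, 2 for modes 1 and 12, 4 for mode 2). `mrc_size_kib` is a hypothetical helper, and flooring to whole KiB is an assumption about how the size is reported.

```python
# Bytes per voxel for the MRC modes discussed in this thread.
BYTES_PER_VOXEL = {0: 1, 1: 2, 2: 4, 12: 2}
HEADER_BYTES = 1024  # fixed-size MRC main header


def mrc_size_kib(nx, ny, nz, mode, extended_header_bytes=0):
    """Approximate MRC file size in whole KiB for the given dimensions and mode."""
    data_bytes = nx * ny * nz * BYTES_PER_VOXEL[mode]
    return (HEADER_BYTES + extended_header_bytes + data_bytes) // 1024


# Tilt movie: 8184 x 11520 x 4, mode 0, no extra header bytes.
print(mrc_size_kib(8184, 11520, 4, 0))  # 368281

# Tilt stack: 818 x 1152 x 51 with 204 extra header bytes.
for mode in (1, 12, 2):
    print(mode, mrc_size_kib(818, 1152, 51, mode, 204))
# 1 93866
# 12 93866
# 2 187732
```

Under these assumptions, the same stack takes the same space in mode 1 and mode 12, and roughly twice as much in mode 2.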

alisterburt commented 1 year ago

@dmichalak aligning images typically means applying subpixel shifts. If you have a single pixel with a value of one in an array of zeros and shift it by half a pixel, how do you represent the result with only integer values?

IMOD is likely accepting some loss of precision by saving as integers. I suspect this because support for 16-bit floats (mode 12) was added to the MRC spec only relatively recently, whereas alignframes was written much earlier.
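
The half-pixel example can be made concrete with a small sketch. It uses linear interpolation purely for illustration, which is an assumption: real motion correction software may interpolate differently. `shift_linear` is a hypothetical helper.

```python
def shift_linear(row, shift):
    """Shift a 1D row by a fractional number of pixels using linear interpolation.

    Values shifted in from outside the row are treated as zero.
    """
    whole = int(shift // 1)
    frac = shift - whole
    out = [0.0] * len(row)
    for i, value in enumerate(row):
        # Each input pixel spreads its value over two neighbouring output pixels.
        for j, weight in ((i + whole, 1.0 - frac), (i + whole + 1, frac)):
            if 0 <= j < len(row):
                out[j] += weight * value
    return out


row = [0, 0, 1, 0, 0]
shifted = shift_linear(row, 0.5)
print(shifted)                      # [0.0, 0.0, 0.5, 0.5, 0.0]
print([round(v) for v in shifted])  # [0, 0, 0, 0, 0]
```

Python's `round` rounds halves to the nearest even integer, so the signal disappears entirely here. Rounding halves up would instead give two pixels of value one and double the total intensity. Neither integer result matches the shifted image unless the values are scaled before rounding.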

dmichalak commented 1 year ago

@alisterburt thanks for pointing this out. Including -mode 2 with alignframes seems to work well. This is turning out to make quite a difference for mid- to high-resolution information (perhaps unsurprisingly). When comparing a tomogram from 16-bit integer data with one from 32-bit float data, I can clearly see small changes in ribosome densities at 10.8 Å/px.

Write the erased image in the data type of the input image #31