hyperspy / rosettasciio

Python library for reading and writing scientific data format
https://hyperspy.org/rosettasciio
GNU General Public License v3.0
44 stars 26 forks source link

Improve units conversion in EMD velox files #243

Closed ThorstenBASF closed 1 month ago

ThorstenBASF commented 3 months ago

Describe the bug

If an image has a different number of pixels in X and Y, it can happen that axes are set to different units. In my case I had an image of 256x512 with an original pixelsize of 5.1e-09m. This is converted to 5.1nm in X and 0.0051µm in Y.

This leads to an error while I want to save the image with a scalebar, due the sanity check of the file_writer. (see io_plugins/image.py, line 147)

To Reproduce

(The testfile contains confidential data, so i can't share, if nessessary I could creat one with public data.) Steps to reproduce the behavior:

import hyperspy.api as hs
myfile = "testfile.emd"
s = hs.load(myfile, select_type='images')
s[0].save(myfile[:-4])+".jpg"), scalebar=True)

Expected behavior

Not sure where to fix it. I think best would be to have same units even if X Y has a ratio of 1:2, but here I didn't checked(found) the code. The santiy check could also be extended to compare different units, but that would be much more effort.

Python environment:

Additional context

Images of original_metadata and axes. As a workaround I just set [1]=[0] for scale and units.

original_metadata 2d_signal_hyperspy

CSSFrancis commented 3 months ago

@ThorstenBASF This is related to how the data is loaded downstream:

https://github.com/hyperspy/rosettasciio/blob/0646b2c62e3b8a99a0905f2e4382de3bf5f7a18d/rsciio/emd/_emd_velox.py#L715-L723

Moving this issue to Rosettasciio.

ericpre commented 3 months ago

@ThorstenBASF for opening this issue. Indeed, even if the current behaviour is still correct, it is not convenient and it would be better to keep the same units when possible (same dimension). The issue is that when loading emd velox file, the units are converted automatically for conveninence (for example: to get units in "nm" or "um" instead of "m", which is the units saved in the emd file for the calibration) but axes are converted independently and it is possible that in some corner cases, there end up being different.

This could be improved by checking that both units are different and pick one of the two. What criterion should we use to pick one of the two? @ThorstenBASF, in your specific case, which one of the two ("nm" or "um") would make more sense? I can't think of an objective criterion and I think this is arbitrary. The main impact is to have small or large number in the display of the scalebar. Also it we consider large aspect ratio which is not uncommon, then I suspect that it wouldn't be possible to find a criterion that fit all corner cases!

As a workaround, it is possible to change the units of the axes using the AxesManager conver_units or the axes convert_to_units.

ThorstenBASF commented 3 months ago

@ericpre Thanks for taking care and moving the issues the the right place.

This could be improved by checking that both units are different and pick one of the two. What criterion should we use to pick one of the two? @ThorstenBASF, in your specific case, which one of the two ("nm" or "um") would make more sense?

Most of time you want a scalebar for the X axes. So, I would prefer to pick units from the X axes.

As a workaround, it is possible to change the units of the axes using the AxesManager conver_units or the axes convert_to_units.

Thanks! I just used one more line and set Y=X for scale and units :)

ThorstenBASF commented 3 months ago

@ThorstenBASF This is related to how the data is loaded downstream:

I see, I just read a bit into the code.

The units used for calculations are only from the X axis anyway. https://github.com/hyperspy/rosettasciio/blob/0646b2c62e3b8a99a0905f2e4382de3bf5f7a18d/rsciio/emd/_emd_velox.py#L323

So, instead of convert scale_x and scale_y, only x would be necessary. And set Y=X.

In addition there is a bug with the offset calculation. Here also the offset units could differ from the scale unit, but scale unit is used for both. (As in my case, the X offset should be -730 nm, not -0.73). So offset needs an additional conversation to same units as the units from scale_x[1], before it is extended to 'axes'.

https://github.com/hyperspy/rosettasciio/blob/0646b2c62e3b8a99a0905f2e4382de3bf5f7a18d/rsciio/emd/_emd_velox.py#L358-L391

ericpre commented 3 months ago

@ThorstenBASF, would you be interested to make a pull request to do these changes?

If so, there are some guidance contributor guide (which mostly refer to the hyperspy contributor guide).

ThorstenBASF commented 3 months ago

I can't promise anything. I will first have to read me into all the "convert" commands and handling. Besides, this would be more or less private "joy". :)

But, I will see what I can do.

ThorstenBASF commented 3 months ago

@ThorstenBASF, would you be interested to make a pull request to do these changes?

https://github.com/hyperspy/rosettasciio/pull/248

Hope this can be closed. :)

ericpre commented 1 month ago

Done in #248.