ercius / openNCEM

A collection of packages and tools for electron microscopy data analysis supported by the National Center for Electron Microscopy facility of the Molecular Foundry
GNU General Public License v3.0

ser2EMD conversion issue and RingDiff dims issue #3

Closed ercius closed 7 years ago

ercius commented 7 years ago

I suggested to Karen that she use EMDviewer to convert DM3 files to EMD files. There is then an issue with the dims vectors in Ring Diffraction. @fniekiel probably already knows about this.

The issue is twofold. 1) ser2emd needs to be changed slightly so that single-image data is not loaded with ndarray.shape = (1, numX, numY); it should just be ndarray.shape = (numX, numY). At the very end of the SER image import, add an np.squeeze() to remove singleton dimensions. The EMD file will then only have dim1 and dim2, just like an EMD file created by EMDviewer. (Note: I changed the behavior of my serReader to take care of this same issue yesterday! Check out bitbucket/openNCEM to see the change.)

2) With 1) fixed, single SER images converted by ser2emd will have only 2 dimensions instead of 3, so the dims tuple will have len(dims) = 2 rather than 3. In Ring Diffraction, every dims[x][y] then needs to be changed to dims[x-1][y] to compensate.

This will make EMDviewer export and ser2emd export produce the same /data folders with the same number of dim HDF5 datasets. Both programs can be used to create EMD files and then used in Ring Diffraction.
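Both points can be sketched with plain NumPy. This is only an illustration under assumptions: the array contents, the layout of the `dims` entries as (values, name, units), and all variable names are hypothetical and are not taken from ser2emd or Ring Diffraction.

```python
import numpy as np

# Hypothetical single-image SER payload: a one-member series, shape (1, numX, numY).
series = np.zeros((1, 4, 6))

# Drop only the leading series axis when it has length 1, so that a genuine
# length-1 image axis is not removed by accident (plain np.squeeze() removes all).
image = np.squeeze(series, axis=0) if series.shape[0] == 1 else series

# Hypothetical dims tuple for the 3-D case: one (values, name, units) entry per axis.
dims3 = (
    (np.arange(1), "n", ""),
    (np.arange(4), "x", "nm"),
    (np.arange(6), "y", "nm"),
)

# After squeezing, the series entry disappears and every index shifts down by one.
dims2 = dims3[1:]

# Indexing from the end addresses the same image axis in both layouts.
last_name_3d = dims3[-1][1]
last_name_2d = dims2[-1][1]
```

Indexing the image axes from the end (`dims[-1]`, `dims[-2]`) is one way to avoid a separate `x-1` shift for the 2-D case; whether that suits Ring Diffraction depends on how it uses the series axis.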

fniekiel commented 7 years ago

Hi Peter,

yes, that was pretty much the issue I noticed shortly before leaving and could not fix quickly.

It started from the fact that any SER file saves its data as a series, so even single images are 1-member series. That is why I only became aware of this problem rather late. I agree with reducing single images in EMD to avoid unnecessary dimensions. It's just an extra case I need to implement in io.ser.

After changing this back and forth in RingDiff, I came to the conclusion that we have to decide on the following design question first, as it touches things beyond RingDiff: In which order do we want to save dimensions in EMD?

I think HDF5/h5py forces us to save series in (n, y, x) order, to have the fast-changing indices last. Otherwise we run into performance issues. (n, x, y) seems to be possible without losing too much, though.
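The memory-layout argument can be checked with NumPy alone. The sketch below assumes row-major (C) order, which is NumPy's default; the array sizes are arbitrary, and it only inspects layout, it does not measure read speed from an HDF5 file.

```python
import numpy as np

n, ny, nx = 3, 4, 5

# Two candidate layouts for the same stack of n images, both in C order.
stack_nyx = np.zeros((n, ny, nx), dtype=np.float32)
stack_xyn = np.zeros((nx, ny, n), dtype=np.float32)

# Strides are the byte steps per axis; in C order the last axis steps by one element.
strides_nyx = stack_nyx.strides

# One image taken from (n, y, x) is a single contiguous block of memory ...
frame_nyx = stack_nyx[0]
# ... while one image taken from (x, y, n) picks every n-th element.
frame_xyn = stack_xyn[:, :, 0]

nyx_contiguous = frame_nyx.flags["C_CONTIGUOUS"]
xyn_contiguous = frame_xyn.flags["C_CONTIGUOUS"]
```

With the series index first, reading one frame touches one contiguous block, which is the case the (n, y, x) ordering is meant to favor.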

Then the question is whether we want single images as (y, x) or (x, y). (y, x) seems more consistent with the series and puts the fast-changing index last. (x, y) seems more intuitive, even though images are usually stored in matrices the (y, x) way and the functions somehow adapt to that.

Another possibility for RingDiff would be to not only implement the two specific cases of single images and image series, but rather let the user choose which dimension in their EMD files should be interpreted as what.
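A user-selectable axis mapping could look roughly like the following. The helper `to_nyx` and its string-based axis labels are hypothetical and exist only for this sketch; nothing like it is confirmed to be in RingDiff or io.emd.

```python
import numpy as np

def to_nyx(data, axes):
    """Reorder `data` to (n, y, x) or (y, x) from user-supplied axis labels.

    `axes` is a string such as "xyn" or "xy" naming each axis of `data`.
    Hypothetical helper for illustration only.
    """
    valid = (
        len(axes) == data.ndim
        and len(set(axes)) == len(axes)
        and not set(axes) - set("nyx")
    )
    if not valid:
        raise ValueError(f"axes {axes!r} does not describe a {data.ndim}-D array")
    # Keep the target order n, y, x, skipping labels the data does not have.
    target = [label for label in "nyx" if label in axes]
    source = [axes.index(label) for label in target]
    # np.transpose returns a view; no pixel data is copied.
    return np.transpose(data, source)

stack = np.zeros((6, 4, 3))               # stored as (x, y, n)
reordered = to_nyx(stack, "xyn")          # now (n, y, x)
single = to_nyx(np.zeros((6, 4)), "xy")   # now (y, x)
```

Because the result is a view, the choice costs nothing until the data is actually read in the new order.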

What's your opinion on the order question?

ercius commented 7 years ago

> Then the question is whether we want single images as (y, x) or (x, y). (y, x) seems more consistent with the series and puts the fast-changing index last. (x, y) seems more intuitive, even though images are usually stored in matrices the (y, x) way and the functions somehow adapt to that.

It seems to me that the performance penalty of [y,x] vs [x,y] is negligible in most cases. If you operate on a full image (which is usually the case) then it won't matter at all. Most modern array-handling modules like NumPy are able to choose the fastest index if possible anyway. Check out this article about NumPy.
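As a small illustration of why the orientation of a single image is cheap to change in NumPy (a sketch with an arbitrary test array; it shows that a transpose is a view, not a timing benchmark):

```python
import numpy as np

image_yx = np.arange(12, dtype=np.float64).reshape(3, 4)

# Transposing swaps the strides and returns a view; no pixel data is copied.
image_xy = image_yx.T
shares = np.shares_memory(image_yx, image_xy)

# Whole-image reductions give the same result in either orientation.
same_sum = image_yx.sum() == image_xy.sum()
```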

I would choose [y,x] to be consistent with the naming convention (i.e. [num,y,x]). I hate to say this, but I've been following the [x,y,num] convention since I started Python programming, as a habit from Matlab. I'm planning to switch most of my code to [num,y,x].

Lastly, it's important that the image looks the same on screen as it did when taken on the microscope, to avoid confusion.

> Another possibility for RingDiff would be to not only implement the two specific cases of single images and image series, but rather let the user choose which dimension in their EMD files should be interpreted as what.

I think it's worthwhile to implement the choice option. (Says he who will likely not do the programming. Haha.) I'm happy to pitch in at some point though! I think that EMDviewer writes things out the wrong way, as [x,y,num], and I'm not sure how quickly we can get that changed. @cophus and I should discuss that.

fniekiel commented 7 years ago

I have modified io.ser to put series in (n,y,x) and single images in (y,x). RingDiff has been modified to work with both, and I have fixed a bug in center selection. It is all in branch 'iss3', if anyone wants to have a look at it. I think we could merge it into the development branch.
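One way code can accept both layouts is to treat a single image as a 1-member series internally. The helper `as_series` below is a hypothetical sketch and is not the code in branch 'iss3'.

```python
import numpy as np

def as_series(data):
    """Return `data` as an (n, y, x) array; a (y, x) image becomes a 1-member series."""
    data = np.asarray(data)
    if data.ndim == 2:
        return data[np.newaxis, ...]  # view with a leading length-1 axis
    if data.ndim == 3:
        return data
    raise ValueError(f"expected a 2-D or 3-D array, got {data.ndim}-D")

single = as_series(np.zeros((4, 6)))     # shape (1, 4, 6)
series = as_series(np.zeros((5, 4, 6)))  # shape (5, 4, 6)
```

The file on disk keeps the compact (y,x) form; only the in-memory view gains the extra axis.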

I have postponed the choice option for later, as that sounds like a rather rare use case. Maybe we can also get something like that into io.emd; however, I do not want any Qt stuff in the package. I think we should focus on implementing the DM3 reader and getting a converter that produces EMDs with dimensions in the above order. Right now it also works properly with the DM3s from emdviewer; only x and y are flipped for viewing.

ercius commented 7 years ago

I couldn't test this on the iss3 branch because of the time tag issue (#4). However, it seems that the iss4 branch includes this [n,y,x] array ordering. It worked on the time series data.

ercius commented 7 years ago

> I have postponed the choice option for later, as that sounds like a rather rare use case. Maybe we can also get something like that into io.emd; however, I do not want any Qt stuff in the package. I think we should focus on implementing the DM3 reader and getting a converter that produces EMDs with dimensions in the above order. Right now it also works properly with the DM3s from emdviewer; only x and y are flipped for viewing.

I forgot to reply to this. I agree about postponing the choice option. I have started writing the dm3Reader.py. I'll try to make it match the ser.py so we can keep the style of the packages the same. I'll probably need your help, @fniekiel, to make that happen. I'll add it to the repo once I have it working for at least 1 file.

I'll also try to pull the new version of the RingDiffraction program to the Titan for Karen once we merge the fixes to ser.py (#3 and #4).

fniekiel commented 7 years ago

I have merged the fixes to #4 and will close this issue. We should open a new one for the dm3Reader.