mlerotic / spectromicroscopy

MANTiS is a Multivariate ANalysis Tool for Spectromicroscopy developed in Python by 2nd Look Consulting. It uses principal component analysis and cluster analysis to classify pixels according to spectral similarity.
http://spectromicroscopy.com/
GNU General Public License v3.0

Feature request: File plugin for HDF5 files from Mistral, ALBA #26

Open liuchzzyy opened 6 days ago

liuchzzyy commented 6 days ago

Hi all, I recently obtained data from Mistral (ALBA, Spain) in HDF5 format. The data structure does not seem to match the one your HDF5 reader implements. Would it be possible to extend the HDF5 reader to support this kind of file? The file is quite large, but you can download it for testing if you would like to try.

[Image: hdf5_keys]

Many thanks!

Grumium commented 6 days ago

Hi. Yes, it certainly is possible to create a new file plugin. Since this is an open-source project, it thrives on contributions from the community. You're welcome to fork the code and implement the feature yourself. If you need help with the code, feel free to reach out, but unfortunately, I don’t have the bandwidth to implement every specific request myself.

It would be great if you decide to contribute and possibly submit a pull request!

liuchzzyy commented 5 days ago

Thanks for your reply. Unfortunately, I am still new to Python. I looked at file_dataexch_hdf5.py in MANTiS and experimented with it this morning. It seems we only need to add a new class, but I have not been able to get it working. I would be very grateful for coding help from you, since I cannot do this by myself at the moment. The HDF5 file from ALBA can be opened with h5py.

Thanks, Cheng
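
For reference, a minimal sketch of listing every dataset in an HDF5 file with h5py, which is one way to compare the ALBA layout with what an existing plugin expects. The helper name `list_hdf5_tree` and the file name in the usage note are illustrative assumptions and are not part of MANTiS.

```python
import h5py


def list_hdf5_tree(filename):
    """Return (path, shape, dtype) for each dataset in an HDF5 file.

    Hypothetical inspection helper; not part of MANTiS.
    """
    entries = []

    def visitor(name, obj):
        # visititems calls this for every group and dataset below the root
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(filename, "r") as f:
        f.visititems(visitor)
    return entries
```

It could be used as `for p, shape, dtype in list_hdf5_tree("my_file.hdf5"): print(p, shape, dtype)`, where `my_file.hdf5` is a placeholder name.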

Grumium commented 5 days ago

Hi Cheng, each file format in this program has a plugin, and the structure is fairly similar. You want to look at file_nexus_hdf5.py and adapt it according to your needs.

I suggest you start by forking the repository and either modifying the relevant plugin file or creating a new one. From there, you can aim for a first testable version. Another, potentially easier approach, especially for beginners, would be to write a stand-alone Python script that parses your data into a format compatible with MANTiS. The NeXus format is likely the closest match and is well-documented. Once you've got something, feel free to share it, and I'd be happy to help review and provide feedback on the script! Cheers, Jan-David

liuchzzyy commented 2 days ago

Hi Jan-David Backing, as you suggested, I tried converting the file to NeXus format. file_nexus_hdf5.py can now read it, but the data is not displayed correctly in Qt. I guess there are some attributes in the NeXus format that I don't know about. Here is my code, based on this:

## import packages
from IPython.display import display
from pathlib import Path as path
import numpy as np
import datetime
import h5py
import os
from dxchange import reader as reader

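## read file with dxchange
path_folder = path('.')  # assumption: folder containing the HDF5 file; not defined in the original post
path_file = path.joinpath(path_folder, r'20230701_F6Mn_245.2x1203.3y_specnorm.hdf5')

## Read the keys to find the datasets; data_keys is the list displayed below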
data_keys = reader.read_hdf_meta(path_file, add_shape=True)
display(data_keys)

# Read all needed dataset via keys
data = reader.read_hdf5(path_file, dataset=r'SpecNormalized/spectroscopy_normalized', shared=False)
energy = reader.read_hdf5(path_file, dataset=r'SpecNormalized/energy', shared=False)
x_pixel_size = reader.read_hdf5(path_file, dataset=r'SpecNormalized/x_pixel_size', shared=False)
y_pixel_size = reader.read_hdf5(path_file, dataset=r'SpecNormalized/y_pixel_size', shared=False)
currents = reader.read_hdf5(path_file, dataset=r'SpecNormalized/Currents', shared=False)
exptimes = reader.read_hdf5(path_file, dataset=r'SpecNormalized/ExpTimes', shared=False)
rotation_angle = reader.read_hdf5(path_file, dataset=r'SpecNormalized/rotation_angle', shared=False)

## rewrite it in NeXus format
## Note: this path is identical to path_file, so mode 'w' overwrites the input file

root = h5py.File(path.joinpath(path_folder, f'{path_file.stem}.hdf5'), 'w')

## Create the GROUPS 
root.create_group('entry')
root['/entry'].attrs['NX_class'] = 'NXentry'

root['/entry/'].create_group('sample')
root['/entry/sample'].attrs['NX_class'] = 'NXsample'

root['/entry/'].create_group('data')
root['/entry/data'].attrs['NX_class'] = 'NXdata'

## Valid enumeration values for root['/entry']['definition'] are: 

root['/entry'].create_dataset(name='definition', data=['NXstxm',], maxshape=None)
root['/entry/definition'].attrs['type'] = 'NX_CHAR'

root['/entry/sample'].create_dataset(name='rotation_angle', data=rotation_angle, maxshape=None)
root['/entry/sample/rotation_angle'].attrs['type'] = 'NX_FLOAT'

root['/entry/sample'].create_dataset(name='Currents', data=currents, maxshape=None)
root['/entry/sample/Currents'].attrs['type'] = 'NX_FLOAT'
root['/entry/sample/Currents'].attrs['EX_required'] = 'true'

root['/entry/sample'].create_dataset(name='ExpTimes', data=exptimes, maxshape=None)
root['/entry/sample/ExpTimes'].attrs['type'] = 'NX_FLOAT'

# Valid enumeration values for root['/entry/data']['stxm_scan_type'] are: 
# sample image stack

root['/entry/data'].create_dataset(name='stxm_scan_type', data=['sample image stack',], maxshape=None)
root['/entry/data/stxm_scan_type'].attrs['type'] = 'NX_CHAR'

root['/entry/data'].create_dataset(name='count_time', data=[2,], maxshape=None)
root['/entry/data/count_time'].attrs['type'] = 'NX_FLOAT'

root['/entry/data'].create_dataset(name='data', data=data, maxshape=None)
root['/entry/data/data'].attrs['type'] = 'NX_NUMBER'
root['/entry/data/data'].attrs['signal'] = '1'

root['/entry/data'].create_dataset(name='energy', data=energy, maxshape=None)
root['/entry/data/energy'].attrs['type'] = 'NX_FLOAT'
root['/entry/data/energy'].attrs['units'] = 'eV'
root['/entry/data/energy'].attrs['axis'] = 0

root['/entry/data'].create_dataset(name='sample_y', data=np.linspace(0, data.shape[1]*y_pixel_size, num=data.shape[1]), maxshape=None)
root['/entry/data/sample_y'].attrs['type'] = 'NX_FLOAT'
root['/entry/data/sample_y'].attrs['units'] = 'um'
root['/entry/data/sample_y'].attrs['axis'] = 1

root['/entry/data'].create_dataset(name='sample_x', data=np.linspace(0, data.shape[2]*x_pixel_size, num=data.shape[2]), maxshape=None)
root['/entry/data/sample_x'].attrs['type'] = 'NX_FLOAT'
root['/entry/data/sample_x'].attrs['units'] = 'um'
root['/entry/data/sample_x'].attrs['axis'] = 2

## Create the ATTRIBUTES 
root['/'].attrs['default'] = 'entry'
root['/entry'].attrs['default'] = 'data'
root['/entry/data'].attrs['signal'] = 'data'
root['/entry/data/data'].attrs['signal'] = '1'
root.attrs['file_name'] = os.path.abspath('NXstxm')
root.attrs['file_time'] = datetime.datetime.now().isoformat()
root.attrs['h5py_version'] = h5py.version.version
root.attrs['HDF5_Version'] = h5py.version.hdf5_version

## Close the file
root.close()

MANTiS then reported:

(mantis) C:\Users\chengliu>python -m mantis_xray
Loading file plugin: file_bim . Not a valid plugin - skipping.
Loading file plugin: file_csv . (text table) Success!
Loading file plugin: file_dataexch_hdf5 . (Exchange) Success!
Loading file plugin: file_json . (JSON) Success!
Loading file plugin: file_ncb . (Ncb) Success!
Loading file plugin: file_nexus_hdf5 . (NXstxm) Success!
Loading file plugin: file_sdf . (SDF) Success!
Loading file plugin: file_sm_netcdf . Not a valid plugin - skipping.
Loading file plugin: file_stk . (STK) Success!
Loading file plugin: file_tif . (Tiff) Success!
Loading file plugin: file_xrm . (XRM) Success!
=======================
Welcome to MANTiS 3.2.2
=======================
Latest package on PyPI is version 3.2.2
Current default (master) code is version 3.2.3
Current development code is version 3.2.1
PyQt version in use is 5.15.11
PyQtGraph version in use is 0.13.7

Please report issues to https://github.com/mlerotic/spectromicroscopy/issues

Identifying file: C:/Users/chengliu/Desktop/Figure/20230701_F6Mn_245.2x1203.3y_specnorm.hdf5 ... get info from C:/Users/chengliu/Desktop/Figure/20230701_F6Mn_245.2x1203.3y_specnorm.hdf5 with the NXstxm plugin.
load C:/Users/chengliu/Desktop/Figure/20230701_F6Mn_245.2x1203.3y_specnorm.hdf5 with the NXstxm plugin.
Traceback (most recent call last):
  File "C:\Users\chengliu\AppData\Local\miniconda3\envs\mantis\lib\site-packages\mantis_xray\mantis_qt.py", line 14412, in OnLoadMulti
    self.window().LoadStack()
  File "C:\Users\chengliu\AppData\Local\miniconda3\envs\mantis\lib\site-packages\mantis_xray\mantis_qt.py", line 16796, in LoadStack
    self.page1.absimgfig.loadNewImageWithROI()
  File "C:\Users\chengliu\AppData\Local\miniconda3\envs\mantis\lib\site-packages\mantis_xray\mantis_qt.py", line 16238, in loadNewImageWithROI
    self.parent.stk.calc_histogram()
  File "C:\Users\chengliu\AppData\Local\miniconda3\envs\mantis\lib\site-packages\mantis_xray\data_stack.py", line 300, in calc_histogram
    fluxmax_limit = np.mean(np.partition(np.ravel(self.averageflux), px)[
  File "C:\Users\chengliu\AppData\Local\miniconda3\envs\mantis\lib\site-packages\numpy\_core\fromnumeric.py", line 870, in partition
    a.partition(kth, axis=axis, kind=kind, order=order)
ValueError: kth(=935413) out of bounds (221616)
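
The last line of the traceback is the generic NumPy error raised when `np.partition` is asked for an index `kth` that lies beyond the end of the array. Here the numbers suggest that the flattened `averageflux` has 221616 elements while the requested index is 935413. How MANTiS computes that index is not visible in the traceback, so the snippet below only reproduces the same class of error on toy data; the array sizes are made up.

```python
import numpy as np

flat = np.arange(10)          # stand-in for np.ravel(self.averageflux)

# A valid kth must be smaller than the number of elements in the array
part = np.partition(flat, 7)
print(part[7])                # value that sits at index 7 in sorted order

try:
    np.partition(flat, 20)    # kth larger than the array, as in the traceback
except ValueError as err:
    message = str(err)
    print(message)
```

A mismatch of this kind is consistent with the stack dimensions being interpreted differently from how they were written, but that is an assumption to verify rather than a confirmed diagnosis.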

I don't know how to modify the NeXus file from here. In particular, I don't know what the attributes count_time, axis, and signal are used for in MANTiS.

Thank you in advance for any guidance. Best, Cheng
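
One check that can be run before writing the file, sketched under assumptions: each axis array should have as many points as the data dimension its `axis` attribute refers to. The function name `check_axes` is hypothetical, the (energy, y, x) ordering is taken from the `axis` values 0, 1, 2 in the script above, and whether MANTiS expects this ordering is not confirmed here.

```python
import numpy as np


def check_axes(data, energy, sample_y, sample_x):
    """Return a list of mismatches between axis lengths and data.shape.

    Assumes data is ordered (energy, y, x), matching axis = 0, 1, 2.
    """
    problems = []
    for name, axis, dim in (("energy", energy, 0),
                            ("sample_y", sample_y, 1),
                            ("sample_x", sample_x, 2)):
        n = np.asarray(axis).size
        if n != data.shape[dim]:
            problems.append(
                f"{name}: {n} points, data.shape[{dim}] = {data.shape[dim]}"
            )
    return problems
```

An empty list means the three axes are consistent with the stack shape. Any entry names the axis whose length disagrees, which may indicate that the stack needs transposing before it is written.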