Open mmarras opened 5 years ago
Hi.
Do you have any question?
I mean, if it is working with PIL, it's fine no? Why would you need to convert it to fabio?
Here is what we have with fabio.
There is not much, AFAIK.
In [4]: import fabio
In [5]: a = fabio.open("Dummy.tiff")
In [6]: a
Out[6]: <fabio.tifimage.TifImage at 0x7f27f7a540b8>
In [7]: a.nframes
Out[7]: 1
In [8]: a.header
Out[8]:
{'nRows': 619,
'nColumns': 487,
'nBits': 32,
'compression': False,
'compression_type': 1,
'imageDescription': '# Pixel_size 172e-6 m x 172e-6 m\r\n# Silicon sensor, thickness 0.000320 m\r\n# Exposure_time 900.000000 s\r\n# Exposure_period 906.000000 s\r\n# Tau = 383.8e-09 s\r\n# Count_cutoff 1077896 counts\r\n# Threshold_setting: 4024 eV\r\n# Gain_setting: high gain (vrf = -0.150)\r\n# N_excluded_pixels = 14\r\n# Excluded_pixels: badpix_mask.tif\r\n# Flat_field: FF_p300k0149_E8048_T4024_vrf_m0p15.tif\r\n# Trim_file: p300k0149_E8048_T4024_vrf_m0p15.bin\r\n# Image_path: /data/datatemp/\r\n# Ratecorr_lut_directory: (nil)\r\n# Retrigger_mode: 0\r\n',
'stripOffsets': [6060],
'rowsPerStrip': 619,
'stripByteCounts': [1205812],
'software': 'TVX TIFF v 1.3 ',
'date': '2018:04:12 12:33:59',
'colormap': None,
'sampleFormat': 2,
'photometricInterpretation': 1,
'model': ('PILATUS 300K-20Hz, S/N 3-0149-20Hz',),
'info': {}}
But if you want to hack the code, you could take a look at the way fabio manages the Pilatus TIFF images.
Here is a way to force the use of this reader.
In [22]: image = fabio.pilatusimage.PilatusImage()
In [23]: image.read("Dummy.tiff")
Out[23]: <fabio.pilatusimage.PilatusImage at 0x7f27f709c400>
In [24]: image.header
Out[24]:
{
"Pixel_size": "172e-6 m x 172e-6 m",
"Silicon": "sensor, thickness 0.000320 m",
"Exposure_time": "900.000000 s",
"Exposure_period": "906.000000 s",
"Tau": "383.8e-09 s",
"Count_cutoff": "1077896 counts",
"Threshold_setting": "4024 eV",
"Gain_setting": "high gain (vrf = -0.150)",
"N_excluded_pixels": "14",
"Excluded_pixels": "badpix_mask.tif",
"Flat_field": "FF_p300k0149_E8048_T4024_vrf_m0p15.tif",
"Trim_file": "p300k0149_E8048_T4024_vrf_m0p15.bin",
"Image_path": "/data/datatemp/",
"Ratecorr_lut_directory": "(nil)",
"Retrigger_mode": "0"
}
The code is inside fabio/codecs/pilatusimage.py
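To illustrate how the raw 'imageDescription' string relates to the dict shown above, here is a minimal sketch that turns one into the other. It is not the code from pilatusimage.py: the function name parse_pilatus_description is hypothetical, and the splitting rule (first token is the key, an optional ":" or "=" separator is dropped) is only inferred from the two outputs above.

```python
def parse_pilatus_description(description):
    """Split a Pilatus-style description string into a key/value dict.

    Hypothetical helper: the rule is inferred from the example output,
    it is not taken from fabio.
    """
    header = {}
    for line in description.splitlines():
        line = line.strip()
        # Metadata lines start with "#"; anything else is skipped.
        if not line.startswith("#"):
            continue
        content = line[1:].strip()
        if not content:
            continue
        # First token is the key, the rest of the line is the value.
        parts = content.split(None, 1)
        key = parts[0].rstrip(":=")
        value = parts[1] if len(parts) > 1 else ""
        # Drop a leading "=" or ":" separator left in front of the value.
        header[key] = value.lstrip("=:").strip()
    return header


description = (
    "# Pixel_size 172e-6 m x 172e-6 m\r\n"
    "# Silicon sensor, thickness 0.000320 m\r\n"
    "# Tau = 383.8e-09 s\r\n"
    "# Threshold_setting: 4024 eV\r\n"
    "# N_excluded_pixels = 14\r\n"
)
header = parse_pilatus_description(description)
print(header["Tau"])
```

All values stay strings, as in the header above; converting units or numbers would be a separate step.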
Indeed, I can make it work with PIL.
But I already have quite some code (batch processing etc.) that I used for edf files, which relies on the very convenient img = fabio.open() and the img.header it exposes. Thus, it would be nice if I could tweak fabio to expose the Ganesha meta-data in addition to the rest, say by adding a new key to the info dict next to imageDescription.
Oh, I have been looking at TiffIO.py for a while now, where I found the TAG_ID dictionary and the method self._readIFDEntry(TAG_name, tagIDList, fieldTypeList, nValuesList, valueOffsetList), which I assume retrieves the header entries. But I'll have a look at pilatusimage.py now.
No, you were right. Everything is inside TiffIO.py, because it only exposes some tag types.
I checked: you have to update the tag info (maybe not needed) and update _readInfo with
TAG_ARTIST = 315
artist = self._readIFDEntry(
    TAG_ARTIST, tagIDList, fieldTypeList, nValuesList, valueOffsetList
)
and
info["artist"] = artist
Yet it is not easy to customize. I think it would be great to create a kind of TAG_INFO that we could patch manually, if needed.
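One possible shape for such a patchable table, as a rough sketch only. Nothing like this exists in TiffIO.py as far as this thread shows: EXTRA_TAG_INFO, read_extra_tags and the read_entry callable (a stand-in for the self._readIFDEntry(...) call above) are all hypothetical names.

```python
# Hypothetical registry: TIFF tag number -> key to expose in the header.
EXTRA_TAG_INFO = {
    315: "artist",  # TIFF "Artist" tag
}


def read_extra_tags(tag_id_list, read_entry, tag_info=None):
    """Collect every registered tag that is actually present in the file.

    tag_id_list: tag numbers found in the image file directory.
    read_entry:  hypothetical callable returning the value for a tag number.
    """
    if tag_info is None:
        tag_info = EXTRA_TAG_INFO
    info = {}
    for tag_id, key in tag_info.items():
        if tag_id in tag_id_list:
            info[key] = read_entry(tag_id)
    return info


# A user could register one more tag without editing the reader itself.
EXTRA_TAG_INFO[33432] = "copyright"  # TIFF "Copyright" tag

fake_values = {315: ["<html>...</html>"], 33432: ["someone"]}
print(read_extra_tags([256, 257, 315], fake_values.get))
```

Tags that are registered but missing from the file are simply skipped, so plain TIFF files keep their usual header.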
This was helpful. Basically, the main issue was the name 'artist': I had just made something up there, which apparently didn't work. But now it works great. How did you inspect that it had to be TAG_ARTIST?
Added the following to TiffIO.py:
from html.parser import HTMLParser

TAG_ID.update({315: "Artist"})
TAG_ARTIST = 315

class MyHTMLParser(HTMLParser):
    store = {}

    def make_newlist(self):
        self.store.clear()

    def handle_starttag(self, tag, attrs):
        for attr in attrs:
            # each attribute value becomes a key; handle_data fills in its value
            self.store.update({attr[1]: None})

    def handle_data(self, data):
        # assign the text to the most recently added key
        self.store[list(self.store.keys())[-1]] = data

    def get_header(self):
        return self.store.copy()

parser = MyHTMLParser()
and added this to _readInfo(...) in TiffIO.py:
...
if TAG_ARTIST in tagIDList:
    artist = self._readIFDEntry(TAG_ARTIST,
                                tagIDList, fieldTypeList, nValuesList,
                                valueOffsetList)
    self.parser.reset()
    self.parser.feed(artist[0])
    artist = self.parser.get_header()
    self.parser.make_newlist()
...
info["artist"] = artist
It's probably cleaner to do the HTML parsing at another location, but now I can use my original import routine and image.header returns
{'nRows': 619, 'nColumns': 487, 'nBits': 32, 'compression': False, 'compression_type': 1, 'imageDescription': '# Pixel_size 172e-6 m x 172e-6 m\r\n# Silicon sensor, thickness 0.000320 m\r\n# Exposure_time 120.0 s\r\n# Exposure_period 124.0 s\r\n# Tau = 383.8e-09 s\r\n# Count_cutoff 1226757 counts\r\n# Threshold_setting: 4024 eV\r\n# Gain_setting: NA\r\n# N_excluded_pixels = 14\r\n# Excluded_pixels: badpix_mask.tif\r\n# Flat_field: (nil)\r\n# Trim_file: p300k0149_E8048_T4024_vrf_m0p15.bin\r\n# Image_path: NA\r\n', 'stripOffsets': [5890], 'rowsPerStrip': 619, 'stripByteCounts': [1205812], 'software': 'TVX TIFF v 1.3 ', 'date': '2019:05:17 17:52:45', 'colormap': None, 'artist': {'det_pixel_size': '172e-6 172e-6', 'det_thickness': '0.000320', 'det_exposure_time': '120.0', 'det_exposure_period': '124.0', 'det_tau': '383.8e-09', 'det_count_cutoff': '1226757', 'det_threshold_setting': '4024', 'det_n_excluded_pixels': '14', 'det_excluded_pixels': 'badpix_mask.tif', 'det_flat_field': '(nil)', 'det_trim_directory': 'p300k0149_E8048_T4024_vrf_m0p15.bin', 'datatype': 'tiff', 'detectortype': 'Pilatus', 'detector_function': 'saxs', 'detector_sn': 'dec427', 'meastype': None, 'start_timestamp': 'Fri May 17 17:57:43 2019', 'end_timestamp': None, 'save_timestamp': None, 'realtime': None, 'livetime': '120.00', 'pixelsize': '0.172 0.172', 'beamcenter_nominal': '364.80 213.50', 'beamcenter_actual': '364.76 213.78', 'WAXSdet_conf': None, 'data_mean': None, 'data_min': None, 'data_max': None, 'data_rms': None, 'data_p10': None, 'data_p90': None, 'calibrationtype': 'geom', 'kcal': None, 'pixelcal': None, 'koffset': None, 'wavelength': '1.5418', 'detector_dist': '120.4470', 'saxsconf_r1': '0.4500', 'saxsconf_r2': '2.0000', 'saxsconf_r3': '0.3500', 'saxsconf_l1': '725', 'saxsconf_l2': '400', 'saxsconf_l3': '200', 'saxsconf_wavelength': '1.5418', 'saxsconf_dwavelength': '0.004', 'saxsconf_Imon': None, 'saxsconf_Ieff': '1.12500', 'saxsconf_Izero': None, 'saxsconf_det_offx': '0', 
'saxsconf_det_offy': '0', 'saxsconf_det_rotx': '0', 'saxsconf_det_roty': '0', 'saxsconf_det_pixsizez': '0.172', 'saxsconf_det_pixsizey': '0.172', 'saxsconf_det_resx_0': None, 'saxsconf_det_resy_0': None, 'saxsconf_abs_int_fact': None, 'sample_transfact': None, 'sample_thickness': None, 'sample_ypos': '-2.900', 'sample_zpos': '-6.500', 'sample_angle1': '0.000', 'sample_angle2': None, 'sample_angle3': None, 'sample_temp': '25.00', 'sample_pressure': None, 'sample_strain': None, 'sample_stress': None, 'sample_shear_rate': None, 'sample_concentration': None, 'sample_buffer': None, 'sample_ID': None, 'hg1': '0.899987', 'hp1': '0.028067', 'vg1': '0.899987', 'vp1': '0.000000', 'hg2': '4.000000', 'hp2': '0.000006', 'vg2': '4.000000', 'vp2': '0.000000', 'hg3': '0.700000', 'hp3': '0.063373', 'vg3': '0.700000', 'vp3': '-0.053773', 'ysam': '-2.900000', 'zsam': '-6.500000', 'thsam': '0.000005', 'detx': '5.000000', 'dety': '-0.187500', 'detz': '-6.578988', 'bstop': '37.349925', 'pd': '30.000000', 'chi': '32.968594', 'phi': '9.700000', 'trans': '38.625760', 'source_type': 'GENIX3D', 'source_runningtime': None, 'source_kV': '49.93', 'source_ma': '0.60', 'xaxis': None, 'xaxisfull': None, 'yaxis': None, 'error_norm_fact': '1', 'xaxisbintype': 'lin', 'log': 'log', 'reduction_type': 's', 'reduction_state': None, 'raw_filename': None, 'bsmask_configuration': '0 364.80 213.50 28.0 205.00 14.0', 'mask_filename': None, 'flatfield_filename': None, 'empty_filename': None, 'solvent_filename': None, 'darkcurrent_filename': None, 'readoutnoise_filename': None, 'zinger_removal': '0', 'data_added_constant': '0', 'data_multiplied_constant': '1', 'Img.Class': None, 'Img.MonitorMethod': None, 'Img.ImgType': '2D', 'Img.Site': 'TUM', 'Img.Group': None, 'Img.Researcher': None, 'Img.Operator': None, 'Img.Administrator': None, 'Meas.Description': None}, 'sampleFormat': 2, 'photometricInterpretation': 1, 'model': ('PILATUS 300K-20Hz, S/N 3-0149-20Hz',), 'info': {}}
and obviously image.header['artist'] gives me the Ganesha meta-data only.
Just the question remains what happens if one opens a non-SAXSLAB/Ganesha Tiff now. Thanks a lot!
How did you inspect that it had to be TAG_ARTIST?
Because 315, from your code, is the Artist tag: https://www.awaresystems.be/imaging/tiff/tifftags/artist.html
Just the question remains what happens if one opens a non-SAXSLAB/Ganesha Tiff now.
In your case, it will raise an error, because info["artist"] = artist should be executed only if the artist tag is available. In the general case, info["artist"] should not be part of the dict.
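A minimal sketch of that guard, written as a standalone function so it can be run outside fabio. The names add_optional_artist and read_entry are hypothetical; read_entry stands in for the self._readIFDEntry(...) call in the patched _readInfo.

```python
TAG_ARTIST = 315  # TIFF "Artist" tag number


def add_optional_artist(info, tag_id_list, read_entry):
    """Add "artist" to info only when the file actually carries tag 315.

    read_entry is a hypothetical callable that returns the tag value.
    """
    if TAG_ARTIST in tag_id_list:
        # The assignment lives inside the "if", so a TIFF without the
        # tag never reaches an undefined "artist" variable.
        info["artist"] = read_entry(TAG_ARTIST)
    return info


# A plain TIFF without the Artist tag: no key is added, no error is raised.
plain = add_optional_artist({"nRows": 619}, [256, 257], lambda tag: None)

# A TIFF with the Artist tag: the key is added.
ganesha = add_optional_artist({"nRows": 619}, [256, 257, 315],
                              lambda tag: ["<html>...</html>"])
print(sorted(plain), sorted(ganesha))
```

Code that consumes the header can then use header.get("artist") to handle both kinds of file.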
If you want to provide a PR for the TiffIO patch, I can try to look at it. Obviously the HTML parsing has to stay on your side, as it has nothing to do with Pilatus, if I understand correctly.
So how to go about it? Although it's mainly a tweak to TiffIO, shouldn't a PR rather implement it in a new ganeshaimage.py à la templateimage.py? To clarify, this format comes from the SAXSLAB Ganesha system, which has a Pilatus detector installed internally.
Alternatively we don't care about where the tiff is coming from in this specific case and make it very broad: if for whatever reason the tiff has an "artist" tag it should also be exposed in the header? Maybe the latter was what you were thinking?
Actually, maybe both would be good: put everything needed to expose the artist tag in TiffIO.py, and then put the HTML parsing in Ganeshaimage.py.
The only question is then how to determine that Ganeshaimage.py, rather than TiffIO.py, is used as the entry point for the image parsing. How and where does the automatic file-type detection happen? I read that fabio tries to detect the type automatically and only falls back to the file extension if that fails (not that the extension would help in this case, because it's all .tiff).
I think exposing the artist tag is enough. Creating and maintaining a Ganeshaimage is not really part of our project. But maybe @kif has another opinion.
Hi there,
I don't agree with you, Valentin: creating a GaneshaImage class deriving from TiffImage (or PilatusImage) is probably the way to go (even if ESRF has no direct interest in supporting SAXSLAB hardware). FabIO has been built to support all kinds of X-ray detectors, to allow people to do better science by removing the burden of parsing the data files.
So, Matthias, could you please help us by submitting a pull request for this feature? I promise I will take some time to help you on the way.
Then comes the issue of how to distinguish this file from a "basic" tiff or from another one coming from a Pilatus detector ... I will have to think about it.
@kif so to clarify, you think I should follow my suggested approach:
- put everything to expose the artist tag in TiffIO.py, and then
- the html parsing in Ganeshaimage.py.
Yes, sounds good. Also can we reuse this dummy.tiff file for our unittests?
Next we could find a way to autodetect the Ganeshaimage from fabio.open. That's something missing. But we can take care of it on our side (because other derivative tiff formats need that too).
I recently revisited this. It turned out that SAXSLAB introduces duplicate entries (e.g. 'detector_dist'), which led to undesired behavior when the HTML parser retrieves the meta data as a dict (I don't know whether this is peculiar to our device/setup or standard).
I have now improved the HTML parser to deal with duplicate entries. If the duplicates are not mere repeats but actually carry different values, the user is now warned, and meta data extraction proceeds with the first value encountered.
import warnings
from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    _store = {}              # dict for storing metadata
    _doublestore = {}        # dict for storing duplicates
    _doubleentryflag = None  # set by handle_starttag; None until a tag is seen

    def cleardict(self):
        self._store.clear()
        self._doublestore.clear()

    def handle_starttag(self, tag, attrs):
        self._doubleentryflag = None
        for attr in attrs:
            # catch double entries
            if attr[1] not in self._store:
                self._doubleentryflag = False
                self._store.update({attr[1]: None})
            else:
                self._doubleentryflag = True
                self._doublestore.update({attr[1]: None})

    def handle_data(self, data):
        if self._doubleentryflag is False:
            # store data in dict
            self._store[list(self._store.keys())[-1]] = data
        if self._doubleentryflag is True:
            # store double entry data to raise with user
            self._doublestore[list(self._doublestore.keys())[-1]] = data

    def get_header(self):
        # compare values from _store and _doublestore for keys that are double
        different_entries = {k: self._store[k] for k in self._store
                             if k in self._doublestore
                             and self._store[k] != self._doublestore[k]}
        # warn the user if those double entries contain conflicting data
        if len(different_entries) != 0:
            warnings.warn(
                'Conflicting double entry in meta data {}. '
                'Proceeding with first value given, respectively.'.format(
                    [str(key) + ' = ' + str(self._store[key]) + '; '
                     + str(self._doublestore[key])
                     for key in different_entries.keys()]))
        # prepare output dict
        output = self._store.copy()
        # clean up dicts to prepare the next parsing
        self.cleardict()
        return output
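For comparison, here is a compact variant of the same "first value wins, warn on conflict" idea that keeps its state on the instance instead of in class-level dicts, so two parsers never share data. It is a sketch, not the code above: the class name MetadataParser is hypothetical, and the sample markup (a param element whose name attribute is the key) is only an assumption about what the Ganesha metadata looks like.

```python
import warnings
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect {first attribute value: element text}; the first value wins."""

    def __init__(self):
        super().__init__()
        self.metadata = {}    # first value seen for each key
        self.duplicates = {}  # value of a repeated key, for comparison
        self._key = None      # key of the element currently open
        self._target = self.metadata

    def handle_starttag(self, tag, attrs):
        # Assumption: the first attribute value names the metadata key.
        self._key = attrs[0][1] if attrs else None
        if self._key is None:
            return
        seen = self._key in self.metadata
        self._target = self.duplicates if seen else self.metadata
        self._target[self._key] = None

    def handle_data(self, data):
        if self._key is not None:
            self._target[self._key] = data

    def handle_endtag(self, tag):
        # Text between elements must not overwrite the last key.
        self._key = None

    def get_header(self):
        conflicts = {k: (self.metadata[k], v)
                     for k, v in self.duplicates.items()
                     if v != self.metadata[k]}
        if conflicts:
            warnings.warn("Conflicting duplicate metadata entries "
                          "(first value kept): {}".format(conflicts))
        return dict(self.metadata)


# Hypothetical sample markup with one harmless duplicate.
sample = ('<param name="wavelength">1.5418</param>'
          '<param name="detector_dist">120.4470</param>'
          '<param name="detector_dist">120.4470</param>')
parser = MetadataParser()
parser.feed(sample)
parser.close()
header = parser.get_header()
print(header)
```

Creating a fresh parser per file replaces the explicit cleardict() step, at the cost of one small object per image.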
Hi, I am now facing the pain of modifying every fabio installation of my colleagues to make the Ganesha patch work for them, too. Therefore, I'd like to work a bit on getting this into the main functionality of fabio. Is there a clearer picture by now of where I should put that functionality? I am currently trying a monkey patch, which already gives me some insight into the fabio package. But I think I would need some guidance on how to proceed with this, please.
Monkey-patching is great, but only for quick fixes. It hardly ever scales (gevent :þ). The best is always to have the code properly written somewhere so that it can be debugged when needed.
About the location of your code: create a new class deriving from TifImage or PilatusImage in a new file. PilatusImage derives from TifImage, so you can do the same.
Then write a test to ensure it works. This sounds obvious, but Python versions keep changing, so guarding against regressions is essential to be future-proof. It does not guarantee compatibility with the future, but at least we will be warned when it fails.
Once this is done, your file will not be "auto-magically" recognized by fabio.open, but that's another story.
I am using pyfai for my data reduction of SAXSLAB Ganesha / Pilatus tiff files and, thus, fabio for the image import.
The tiff file contains two sets of X-ray-relevant meta data tags: one for the Pilatus detector, saved in tag 270, and one for the SAXSLAB machine data, which is provided in an HTML format.
Using PIL I found that the SAXSLAB html meta data is saved in tag 315.
I also managed to parse the HTML meta data into a dict. However, I am not familiar enough with fabio (but I am willing to learn) to inject this so that I can expose a combined header, consisting of both the Pilatus meta data and the SAXSLAB meta data, in fabio.tifimage.TifImage.header.
Here is my code to parse the SAXSLAB meta data using PIL instead of fabio.
which returns:
Here is a link to a dummy SAXSLAB Ganesha file: Dummy.tiff