The problem is that img.energy is not strongly typed. Within progress_apply (just a souped-up apply), if you slice directly on energy it's a float (this may change with the xarray version); if you slice on a multi-index that includes energy, it's a DataArray.
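For concreteness, here is a toy illustration (plain xarray, not PyHyperScattering code) of how the same attribute can surface as different types depending on the selection; whether you get a bare float also depends on the xarray version and whether a multi-index is involved:

import numpy as np
import xarray as xr

img = xr.DataArray(np.zeros((2, 3)),
                   dims=('energy', 'q'),
                   coords={'energy': [270.0, 285.0], 'q': [0.01, 0.02, 0.03]})

print(type(img.sel(energy=270.0).energy))         # 0-d DataArray
print(type(img.sel(energy=270.0).energy.item()))  # plain Python float
print(type(img.isel(energy=[0]).energy))          # length-1 DataArray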
If you look at the analogous lines in the chunked-reduction-dev branch, I might have fixed this while doing the Dask prototyping? I don't see a problem with the .ravel()[0] solution per se; it's just worth keeping the if-else for edge cases. You can also get the value out of a 0-d array by casting to float, which might be simpler. Maybe a better version would be to try casting to float (which handles both the 0-d case and the case where it already is a float), catch the resulting exception, warn, and use the first value? That still breaks if it's a Dask array, where you have to .compute() it first, but it is logically cleaner.
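A rough sketch of that idea (the helper name is hypothetical and the exact exception handling is illustrative, not the branch's actual code):

import warnings

def _energy_scalar(img):
    # Illustrative helper: reduce img.energy to a single float.
    try:
        # Works for a bare float/np.float64 and for a 0-d DataArray alike
        return float(img.energy)
    except (TypeError, ValueError):
        # More than one energy present: warn and fall back to the first value
        warnings.warn(f'Using the first energy value of {img.energy}, check that this is correct.',
                      stacklevel=2)
        return img.energy.values.ravel()[0]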
I'd also suggest copying the warning/guard I added over on chunked-reduction-dev.
I'll have to try the chunked-reduction-dev branch, but it does look like you fixed this issue there. Is there a remaining set of to-dos on that branch before merging?
I also have an issue with this. Is there any temporary hack that I should know about or is it on its way to getting fixed?
Actually, I get this when the datatype is numpy.float64. I can change the energy attribute to be a float, though.
Try installing the latest pre-production release with 'pip install -i https://test.pypi.org/simple --pre --upgrade pyhyperscattering' and see if that fixes it? What loader is being used to generate the data -- SST1RSoXSDB or something else?
I now get a different error. And yes this is SST1 RSoXS.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/programming/work/scicat_beamline/env/lib/python3.8/site-packages/PyHyperScattering/PFEnergySeriesIntegrator.py:45, in PFEnergySeriesIntegrator.integrateSingleImage(self, img)
     44 try:
---> 45     en = img.energy.values[0]
     46     if len(img.energy)>1:

AttributeError: 'numpy.float64' object has no attribute 'values'

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
/home/j/programming/work/scicat_beamline/SST1_PyHyperScattering.ipynb Cell 12 in ()
----> 1 integrated_data = integ.integrateSingleImage(data)
      3 # the way that PyHyperScattering handles the energy dimension/axis is a pain, so we clean it up now
      4 integrated_data = integrated_data.unstack('system')

File ~/programming/work/scicat_beamline/env/lib/python3.8/site-packages/PyHyperScattering/PFEnergySeriesIntegrator.py:51, in PFEnergySeriesIntegrator.integrateSingleImage(self, img)
     49     en = float(img.energy)
     50 except AttributeError:
---> 51     en = img.energy[0]
     52     warnings.warn(f'Using the first energy value of {img.energy}, check that this is correct.',stacklevel=2)
     53 else:

IndexError: invalid index to scalar variable.
It seems like it is not handling the numpy.float64 type properly; it assumes the energy attribute is an xarray DataArray.
Gotcha, yes, I agree with the diagnosis. Probably a change in the SST1 file format but we ought to handle this case anyway. I'm hoping to find a minute today to get a patch up for this. Just adding another try/except to the existing tree of possible types of energy...
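For illustration only, such a try/except might look something like the sketch below (the actual patch may be organized differently):

import warnings

try:
    # energy is an xarray DataArray (0-d or longer)
    en = img.energy.values.ravel()[0]
    if img.energy.size > 1:
        warnings.warn(f'Using the first energy value of {img.energy}, check that this is correct.',
                      stacklevel=2)
except AttributeError:
    # energy is already a bare float / numpy.float64
    en = float(img.energy)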
Can you point me to a dataset (scan number is fine) that I can test with? A copy of how you're invoking the load and integrating steps would also be a big help. Thanks!
The scan id is 36950 I believe.
Here is the code for invoking the load and integrating:
# From a Jupyter notebook by Matt Landsman
import PyHyperScattering
import os
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from PyHyperScattering import __version__
print(f'You are now using PyHyperScattering, version: {__version__}')
# input scan info --> "data_path" represents the master SST1_RSoXS folder where you store everything
data_path = "/home/j/programming/work/datasets/RSoXS_SST1/ingest_round2/2022-02-07/"
scan_id = '36950/'
# turn flags on to save the .txt files and plots
flag_save = False
image_path = os.path.join(data_path, scan_id)
save_path = os.path.join(data_path, 'SST1_RSoXS_data', scan_id + '_reduced')
os.makedirs(save_path, exist_ok=True)
print(image_path)
file_loader = PyHyperScattering.load.SST1RSoXSLoader(corr_mode='none')
data = file_loader.loadSingleImage(image_path+"36950-bw30_snomNa-dark-Wide Angle CCD Detector_image-7.tiff")
if data.rsoxs_config == 'waxs':
maskmethod = 'nika'
mask = os.path.join(data_path, 'SST1_RSoXS_masks', 'SST1_WAXS_mask.hdf')
elif data.rsoxs_config == 'saxs':
maskmethod = 'nika'
mask = os.path.join(data_path, 'SST1_RSoXS_masks', 'SST1_SAXS_mask.hdf')
else:
    maskmethod = 'none'
    mask = None  # avoid a NameError below when neither detector config matches
# set up integration parameters, typically shouldn't need to change anything here
integ = PyHyperScattering.integrate.PFEnergySeriesIntegrator(maskmethod=maskmethod,
maskpath=mask,
geomethod='template_xr',
template_xr=data,
integration_method='csr_ocl')
integrated_data = integ.integrateSingleImage(data)
Additionally, I get this error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/home/j/programming/work/scicat_beamline/SST1_PyHyperScattering_altered.ipynb Cell 12 in ()
----> 1 integrated_data = integ.integrateSingleImage(data)
      3 # the way that PyHyperScattering handles the energy dimension/axis is a pain, so we clean it up now
      4 integrated_data = integrated_data.unstack('system')

File ~/programming/work/scicat_beamline/env/lib/python3.8/site-packages/PyHyperScattering/PFEnergySeriesIntegrator.py:61, in PFEnergySeriesIntegrator.integrateSingleImage(self, img)
     59 res = super().integrateSingleImage(img)
     60 try:
---> 61     if len(self.dest_q)>0:
     62         return res.interp(q=self.dest_q)
     63     else:

AttributeError: 'PFEnergySeriesIntegrator' object has no attribute 'dest_q'
Do you have any idea why this is? I am not giving the integrator my own mask, so maybe that explains it?
So, integrateSingleImage is really more of a worker function; honestly it probably should be renamed _integrateSingleImage, because it expects (many) outer features of the integrator to be set up first. The relevant bits are here, but you need to call integrator.setupIntegrators() and integrator.setupDestQ().
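In other words, something along these lines (using the integ object from the snippet above; whether these setup methods need arguments, e.g. the energy values, should be checked against the PFEnergySeriesIntegrator source):

integ.setupIntegrators()   # assumed call pattern, see note above
integ.setupDestQ()         # assumed call pattern, see note above
integrated_data = integ.integrateSingleImage(data)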
Alternately, you could just try .integrateImageStack() on your single image and see if that works -- it ought to, but there are likely to be many other instances where energy is expected to be an array, because in a true energy series measurement it is, by definition, an array. If it's a single-energy measurement, then the appropriate integration machinery is really a PFGeneralIntegrator.
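For a single-energy frame, that might look roughly like the following (mirroring the constructor arguments from your snippet above; not verified against the current API):

integ = PyHyperScattering.integrate.PFGeneralIntegrator(maskmethod=maskmethod,
                                                        maskpath=mask,
                                                        geomethod='template_xr',
                                                        template_xr=data)
integrated_data = integ.integrateImageStack(data)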
Basically, understanding your use case a little better might be helpful. That said, integrating freestanding images rather than just stacks is certainly within the API scope for the package, so I would like to get this working, but understanding the context might yield a faster workaround.
I am basically using loadSingleImage to load one image and then trying to integrate it. I tried using loadFileSeries with a regex to load one image and then integrate, but it throws an error if the regex only matches one file.
I think the best workaround is to load two images with a regex OR pattern, integrate both, and display the one I want.
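Just to make the OR-pattern idea concrete in plain Python (how the pattern gets handed to loadFileSeries is not shown here -- check its docstring for the exact argument name):

import re

pattern = re.compile(r'_image-(7|8)\.tiff$')
files = ['36950-bw30_snomNa-dark-Wide Angle CCD Detector_image-7.tiff',
         '36950-bw30_snomNa-dark-Wide Angle CCD Detector_image-8.tiff',
         '36950-bw30_snomNa-dark-Wide Angle CCD Detector_image-9.tiff']
print([f for f in files if pattern.search(f)])  # keeps images 7 and 8, drops 9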
I see! Good news: I think the 56-integratesingleimage-indexing-error branch has a complete fix for this. It was a matter of dealing with some of the scan stacking logic and adding a cut-out for single-image stacks with no indexes at all. Can you install this branch with, e.g., pip install git+https://github.com/usnistgov/PyHyperScattering@56-integratesingleimage-indexing-error and see if that gets it working? In my test, you can now do either an integ.integrateImageStack() or an integ.integrateSingleImage() on a single frame with either a PFEnergySeriesIntegrator or a PFGeneralIntegrator. I tried this with a single frame loaded from SST1 files via SST1RSoXSLoader and with data streamed from Tiled using SST1RSoXSDB. Give it a shot and let me know if it's working and I'll open a PR.
It seems to integrate a single image fine for my case.
One other issue I discovered with loadSingleImage: it does not assign coords automatically to the data like loadFileSeries does. loadFileSeries has this line:
if not output_qxy and not output_raw:
    out = out.assign_coords(pix_x=('pix_x',np.arange(0,len(out.pix_x))),pix_y=('pix_y',np.arange(0,len(out.pix_y))))
Actually, I'm not sure if this is an issue. It seems inconsistent to me, but there might be a reason for it.
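If the missing coords ever matter in practice, one stopgap (just a sketch mirroring the loadFileSeries line above, not the package's own fix) would be to assign them by hand after loadSingleImage:

import numpy as np

data = data.assign_coords(pix_x=('pix_x', np.arange(0, len(data.pix_x))),
                          pix_y=('pix_y', np.arange(0, len(data.pix_y))))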
I tried using loadFileSeries with a regex to load one image and then integrate, but it throws an error if the regex only matches one file.
This was wrong on my part as well.
This is an interesting one. The assignment of the pix_x and pix_y axes lives in loadFileSeries essentially for historical reasons. It might be painless to move it into loadSingleImage; I can give it a try.
Just to clarify: is regex matching working when it only matches one file, or failing? I tested this on my machine and it worked, but it could be sensitive to the directory structure.
It is working. I grabbed a file name at random to try and didn't realize that it excludes dark images. That's a few hours of my life I'll never get back lol.
When trying to call integrateSingleImage directly, I run into an IndexError here. img.energy is of type xarray.core.dataarray.DataArray, not float, so the expression on line 41 always evaluates to True. If you pass a single-energy DataArray (e.g. imgstack.unstack('system').sel(energy=270)), the energy array is 0-dimensional, and this is where the IndexError arises.
When PyHyper calls integrateSingleImage through integrateImageStack via groupby-progress_apply, img.energy has length 1 (perhaps because it still has the system multi-index?), so indexing its first value is no issue.
Long story short, I think we can get rid of the if-else statement on lines 41-44 and just replace it with
en = img.energy.values.ravel()[0]
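As a quick sanity check (standalone snippet, plain xarray rather than package code), that expression copes with both shapes seen in this thread:

import xarray as xr

scalar_energy = xr.DataArray(270.0)     # 0-d, like .sel(energy=270)
series_energy = xr.DataArray([270.0])   # length-1, like the groupby/progress_apply case

print(scalar_energy.values.ravel()[0])  # 270.0
print(series_energy.values.ravel()[0])  # 270.0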