Open LouConreux opened 4 months ago
The objective of this project is to eventually build a real-time Geometry Optimization that can be summarized in this big picture: This workflow relies on the development of several processes:
Geometry conversion: transforming either a .geom or .data file in a pyFAI-friendly geometry instance.
The pyFAI optimizer that takes in input keV calibrated images and the detector geometry to optimize.
Transforming back the final pyFAI-ed geometry in the LCLS reference frame coordinates.
Deploying the calibrated geometry.
is already working, 2. is in development, 3. needs to be implemented, and 4. is already working.
An example tutorial of how pyFAI can be used to calibrate X-ray powder diffraction can be found [here].(https://pyfai.readthedocs.io/en/v2023.1/usage/tutorial/Recalib/Recalib_notebook.html) This script relies on three major functions:
dist = 0.42
poni1 = 0
poni2 = 0
geom_initial = pyFAI.geometry.Geometry(dist=dist, poni1=poni1, poni2=poni2, detector=epix10k2M, wavelength=wavelength)
sg = SingleGeometry("initial", calib_max_flat, calibrant=behenate, detector=epix10k2M, geometry=geom_initial)
sg.extract_cp(max_rings=5, pts_per_deg=2, Imin=0*photon_energy)
From these control points, the score function to minimize is defined as:
where theta_e depends on the 6 geometry parameters and the control points, and the theta_i are defined with the calibrant and wavelength data.
From pyFAI:
def residu3(self, param, free, const, d1, d2, rings, weights=None):
"Preform the calculation of $sum_(2\theta_e-2\theta_i)²$"
param7 = self.calc_param7(param, free, const)
delta_theta = self.tth(d1, d2, param7[:6]) - self.calc_2th(rings, param7[6])
if weights is not None:
delta_theta *= weights
return numpy.dot(delta_theta, delta_theta)
The optimization relies on the scipy.optimize function called fmin_slsqp
source
which minimize the score function using Sequential Least Squares Programming
Starting from the guessing point dist=0.42, poni1=0pix, poni2=0pix, we extract control points and maps the score function over a 2D subspace. If we refine the mesh: Thus, the optimization will not lead directly to global minimum which is around (poni1=5pix, poni2=0.5pix) as seen previously. But at least the optimization is indicating the right direction. Hence, repeating the optimization until we reach convergence should work.
Here, we highlight the fact that in order to reach the global minimum, the control points have to be updated in between two iterations. We can see now the global minimum region if we map this new process: Repeating the optimization loop a finite number of times should lead to convergence.
Let's return to our first example with guessing point dist=0.28mm, poni1=0pix, poni2=0pix. From this, we plot the calibrant data over the powder data. We can see that the distance is way off the true value. In fact the optimal is more about dist=0.42. Hence, there is an offset of 0.14mm which turns out to be correponding to the camera offset of the experiment!
We can set it to 5 because after the 5th ring, we don't see that much the rings on the powder.
Hence for dist=0.42mm, poni1=0pix, poni2=0pix, if we zoom in one of the center panel:
We can see that the calibrant ring are not perfectly centered, so we extract points to perform the optimization. We can play on the pts_per_deg
hyperparameter to refine the extraction (here 0.5, 1, 2 respectively).
Thus, it's preferable to have pts_per_deg
large enough to have an accurate optimization, but not too large so the computation remains reasonably long. But what's problematic is that the extracted points are not fitting nicely the rings, the corresponding widths are too big. This could explain why only one iteration of the optimization is not immediately converging to the global minimum. So let's play now on Imin.
Plotting the same with Imin=0, 1, 2 and finally 3*photon_energy, we can filter out the background pixels. It's still not improving that much the extraction, the noisy component is too high. Maybe it will need require some preprocessing on the powder data because it should fit nicely the rings like in the example:
A first idea solution can be investigated through repeating those operations: https://pyfai.readthedocs.io/en/v2023.1/geometry_conversion.html#geometry-definition-of-pyfai
A quick experiment was performed to measure the influence of the quality of control points on convergence. We start from a guessed geometry way off the true one:
Detector Epix10k2M PixelSize= 1.000e-04, 1.000e-04 m BottomRight (3)
Wavelength= 1.290991e-10 m
SampleDetDist= 4.409220e-01 m PONI= 1.000000e-03, 1.000000e-03 m rot1=0.000000 rot2=0.000000 rot3=0.000000 rad
DirectBeamDist= 440.922 mm Center: x=10.000, y=10.000 pix Tilt= 0.000° tiltPlanRotation= 0.000° 𝛌= 1.291Å
We then extract the Control Points with:
sg.extract_cp(max_rings=5, pts_per_deg=1, Imin=0)
We get a lot of control points:
ControlPoints instance containing 5 group of point:
AgBh Calibrant with 49 reflections at wavelength 1.2909905595344899e-10
Containing 5 groups of points:
#at ring 0: 220 points
#au ring 1: 329 points
#av ring 2: 341 points
#aw ring 3: 347 points
#ax ring 4: 350 points
However, this results in poor quality control points which obviously lead to a suboptimal new geometry:
Detector Epix10k2M PixelSize= 1.000e-04, 1.000e-04 m BottomRight (3)
Wavelength= 1.290991e-10 m
SampleDetDist= 4.399303e-01 m PONI= 9.736744e-04, 9.835861e-04 m rot1=0.000000 rot2=0.000000 rot3=0.000000 rad
DirectBeamDist= 439.930 mm Center: x=9.836, y=9.737 pix Tilt= 0.000° tiltPlanRotation= 0.000° 𝛌= 1.291Å
Since the noise component on the powder is very predominant, we need to refine the selection of points based on their intensity:
sg.extract_cp(max_rings=5, pts_per_deg=1, Imin=4*photon_energy)
Then, we get obviously a lot less of control points, but they are of definitely much better quality.
ControlPoints instance containing 5 group of point:
AgBh Calibrant with 49 reflections at wavelength 1.2909905595344899e-10
Containing 5 groups of points:
#ay ring 0: 83 points
#az ring 1: 5 points
#ba ring 2: 3 points
#bb ring 3: 0 points
#bc ring 4: 0 points
Giving this to the optimizer, we are able to make a much better significant improvement in a single step of optimization:
Detector Epix10k2M PixelSize= 1.000e-04, 1.000e-04 m BottomRight (3)
Wavelength= 1.290991e-10 m
SampleDetDist= 4.193749e-01 m PONI= 4.912165e-04, 1.039317e-04 m rot1=0.000000 rot2=0.000000 rot3=0.000000 rad
DirectBeamDist= 419.375 mm Center: x=1.039, y=4.912 pix Tilt= 0.000° tiltPlanRotation= 0.000° 𝛌= 1.291Å
From this, we can deduce that the noise affects too much the extraction of high quality control points, thus leading to an ill-defined minimization problem.
A first idea solution can be investigated through repeating those operations: https://pyfai.readthedocs.io/en/v2023.1/geometry_conversion.html#geometry-definition-of-pyfai
The geometry of an experiment can be defined using different frames. The first one involved is the psana frame used in metrology: the different translations x, y, z and rotation angles are measured and entered in .data file. Then those metrology data can be converted in Crystel format .geom file. At this point, those data can be chosen to be formatted in the CrystFEL format while staying in the psana frame or they could be transformed in the CrystFEL reference frame and format. Finally, pyFAI uses another system of coordinates. This update describes all three of them
Following CrystFEL docs, CrystFEL's laboratory coordinate system defined as follows: +z is the beam direction, and points along the beam (i.e. away from the source) +y points towards the zenith (ceiling). +x completes the right-handed coordinate system. So, (z, y, x) is a direct reference frame, while traditional (x, y, z) is in fact indirect.
This is the coordinate system used in metrology measurements. According to the geometry_converter code, the psana coordinate system can be transformed into the CrystFEL coordinate system by doing those operations: Hence, in this system: +z is the opposite beam direction +y points from right to left horizontally +x points towards the ground
PyFAI's coordinate system is defined as such: Hence, in this system: +z is along the beam +y points towards the zenith +x points from left to right horizontally
To sum up, these are the three reference frames used and their respective translations:
From the psana .data file of exp mfxx49820, the beam center is shifted by X0=-600µm and Y0=-1900µm in psana reference frame.
Thus, we would expect that the PONI being initiated at the center of the geometry would then be in pyFAI coordinates: poni1 = 600µm = 6pix poni2 = 1900µm = 19pix
This is indeed what we found after appropriate conversion!
This frame definition took longer than expected because:
The geometry of an experiment can be defined using different frames. The first one involved is the psana frame used in metrology: the different translations x, y, z and rotation angles are measured and entered in .data file. Then those metrology data can be converted in Crystel format .geom file. At this point, those data can be chosen to be formatted in the CrystFEL format while staying in the psana frame or they could be transformed in the CrystFEL reference frame and format. Finally, pyFAI uses another system of coordinates. This update is correcting the previous one.
This is the coordinate system used in metrology measurements. According to the geometry_converter code, the psana coordinate system can be transformed into the CrystFEL coordinate system by doing those operations: Hence, in this system: +z is the opposite beam direction +y points from right to left horizontally +x points towards the ground
This coordinate system is convenient for calibrating the geometry since the pixel space in matplotlib definition has the same axis as psana. The top left corner of an image is the origin of the pixel space, the row-space being the vertical axis (i.e. the first dimension of an array) which in psana is the x-coordinate and the column-space being the horizontal axis (i.e. the second dimension of an array) which in psana is the y-coordinate.
Following CrystFEL docs, CrystFEL's laboratory coordinate system defined as follows: +z is the beam direction, and points along the beam (i.e. away from the source) +y points towards the zenith (ceiling). +x completes the right-handed coordinate system. So, (z, y, x) is a direct reference frame, while traditional (x, y, z) is in fact indirect.
However, LCLS has chosen another reference for the writing of CrystFEL geometry file .geom. In this system: +z is the beam direction, and points along the beam +y points towards the zenith (ceiling) +x is defined going from right to left So, (x, y, z) becomes a direct reference frame as well as (z, x, y)
PyFAI's coordinate system is defined as such: Hence, in this system: +z is along the beam +y points towards the zenith +x points from left to right horizontally
To sum up, these are the three reference frames used and their respective translations:
From the psana .data file of exp mfxx49820, the beam center is shifted by X0=-600µm and Y0=-1900µm in psana reference frame.
Thus, we would expect that the PONI being initiated at the center of the geometry would then be in pyFAI coordinates: poni1 = 600µm = 6pix poni2 = -1900µm = -19pix
This is indeed what we found after appropriate conversion both from psana frame or LAB frame!
To go from one coordinate system to another, we have already at our disposal some conversion scripts.
This function implemented by Mikhail Dubrovin is able to convert a psana .data file into a CrystFEL format either in psana coordinate system or in LAB coordinate system. This latter conversion option is dictated by the cframe argument: cframe=0 for to psana coordinates or cframe=1 for to LAB coordinates. This function can be called from the terminal as described here. Besides, this function relies mainly on one function from the PSCalib repository called _geometry_tocrystfel described here. The conversion of coordinate systems is coherent with the definition of the different reference frames above.
To be able to instantiate a pyFAI Detector object, one needs to provide it a flattened shape (Nmods dimslow scan, dimfast scan) alongside with a pixel corner array corresponding of the (z, y, x) (respectively z along the beam, y the fast scan dimension, x the slow scan dimension) coordinates of each 4 pixel corners. This pixel corner array is then of shape (Nmods dimslow scan, dimfast scan, 4, 3).
This script is particular taken from the _geomopt.py script on btx repository. From a given PSCalib.GeometryAccess object, it can generate both a .data file and a .geom file. For that, it relies on two functions:
Therefore, I would recommend not relying on this function and sticking to the well documented _geometryconvert function.
Following analysis #197, it seems that the true center of the rings is:
New center is (843.5532519182825, 831.0710902924419)
Detector distance inferred from powder rings: 280.47 mm
Then, when calling _deploygeometry:
def deploy_geometry(self, outdir, pv_camera_length=None):
"""
Write new geometry files (.geom and .data for CrystFEL and psana respectively)
with the optimized center and distance.
Parameters
----------
outdir : str
path to output directory
pv_camera_length : str
PV associated with camera length
"""
# retrieve original geometry
run = self.diagnostics.psi.run
geom = self.diagnostics.psi.det.geometry(run)
top = geom.get_top_geo()
children = top.get_list_of_children()[0]
pixel_size = self.diagnostics.psi.get_pixel_size() * 1e3 # from mm to microns
# determine and deploy shifts in x,y,z
cy, cx = self.diagnostics.psi.det.point_indexes(run, pxy_um=(0,0), fract=True)
dx = pixel_size * (self.center[0] - cx) # convert from pixels to microns
dy = pixel_size * (self.center[1] - cy) # convert from pixels to microns
dz = np.mean(-1*self.diagnostics.psi.det.coords_z(run)) - 1e3 * self.distance # convert from mm to microns
geom.move_geo(children.oname, 0, dx=-dy, dy=-dx, dz=dz)
We can see that a PSCalib.GeometryAccess object is instantiated through the PsanaInterface object. This allows an easy manipulation of the geometry through low-level functions like _geom.movegeo or _geom.tiltgeo. The shift to be applied on the uncalibrated geometry is then defined as the difference in (x, y) coordinates between the true center in pixel and the current center of the detector in pixel _(cy, cx) = psi.det.point_indexes(run, pxyum=(0,0), fract=True) (this returns the pixel indexes of the point (0,0) in psana or LAB coordinates).
So starting from a 0-end.data file as such:
IP 0 CAMERA 0 -600 -1900 -300000 90 0 0 0.00000 0.00000 0.00000
We get an optimized _r0008end.data file as such:
IP 0 CAMERA 0 237 -1183 -282388 90 0 0 0.00000 0.00000 0.00000
Now let's draw a figure to understand the difference between looking for the optimal center of the rings and the optima PONI pyFAI coordinates: Since the definition of PONI is relative to the geometry origin we are providing it, hence the optimal PONI found is actually the shift we need to perform on the uncalibrated geometry. Hence, starting from a 0-end.data file as such:
IP 0 CAMERA 0 -600 -1900 -300000 90 0 0 0.00000 0.00000 0.00000
We get an optimized _r0008end.data file as such:
IP 0 CAMERA 0 188 -1234 -282388 90 0 0 0.00000 0.00000 0.00000
!!!
The final endpoint of this analysis is to provide a central framework for dealing and handling geometry files for the calibration of the different LCLS expriments (mfx, mec, cxi...). This will help design a uniform geometry optimization workflow that aims to be applied for the calibration of all the different LCLS detectors (ePix10k2M, Jungfrau4M, ePix10ka...). This workflow will look like this:
Let's go into details:
A geometry calibration starts with a psana .data file and is supposed to end with an optimal psana .data file. But pyFAI cannot comprehend this format, hence we need to have uniform scripts that are able to convert any geometry file format into another one, without affecting the geometry data. This will be handled by built-in functions from the PSCalib repository (.data to .geom and vice-versa) as well as some hand-coded functions (.geom to pyFAI format and vice-versa).
PyFAI optimization will likely to need some preprocessing of the powder (as discussed in here: ) explaining why I am regrouping these two in one sub-category.
PyFAI provides a potential robust and fast geometry calibration workflow that was also described in here. PyFAI takes as an input a flattened pixel corner array (Nmods x dim(ss), dim(fs), 4, 3) where Nmods is the number of panels that the detector has, dim(ss) is the number of pixels in the slow-scan direction, dim(fs) the number of pixels in the fast-scan direction, 4 because a pixel has 4 corners, and 3 for the 3 coordinates. This pixel corner array is then used to instantiate a pyFAI Detector object where the optimization can be done given a powder data.
Once the optimization is done, pyFAI will give 6 optimal parameters: dist: distance IP-PONI where PONI is the Point of Normal Incidence. poni1: slow-scan coordinate of PONI poni2: fast-scan coordinate of PONI rot1: rotation over slow-scan direction rot2: rotation over fast-scan direction rot3: rotation over beam direction
Nota Bene These parameters in fact define the translations and rotations that need to be applied to correct the current detector frame.
According to pyFAI Geometry Definition, given those parameters, a pixel coordinate (x,y,z) in the detector frame can be translated into (X,Y,Z) coordinates in the chosen reference frame following this math:
So, by applying first the translations and then the rotations onto the current (x,y,z) coordinates, we can find the true (X,Y,Z) pixel coordinates. From that, we can write a script to write a CrystFEL format .geom file and finally convert that into an optimized psana .data file.
Following a small-science discussion, analysis of mecl1016522 run 8 was performed to determine the geometries of the 4 different ePix10ka detectors. LaB6 was the calibrant in this experiment.
By analyzing the unassembled powder:
A first geometry guess was:
# Estimating initial geometry
poni1 = p1.mean()+525*0.0001
poni2 = p2.mean()+250*0.0001
dist = 0.14
Giving a first guess decent radial integration:
We then optimize without and with rotations and compare the resulting radial integrations:
We see little difference between those two. Indeed, pyFAI finds only small tilts in the rotation case, indicating that Quad 1 is relatively orthogonal to the beam direction.
Looking at Quad 0, it seems to be slightly tilted, so we will focus on optimizing Quad 0 with rotations.
A first geometry guess was:
# Estimating initial geometry
poni1 = p1.mean()+510*0.0001
poni2 = p2.mean()+1180*0.0001
dist = 0.1329
Giving a first guess decent radial integration:
A first optimization shot was performed with the usual parameters for control points extraction:
# Extracting Control Points
sg0.extract_cp(max_rings=5, pts_per_deg=2, Imin=np.max(calib_max_flat)/100)
The resulting optimized geometry is really off as the radial integration indicates:
This can be explained by a poor quality control points extraction:
We can see that control points were only extracted on the lower left asic, leading to a bad fit:
Hence, we change the parameters of control point extraction to include more significant data in other asics:
# Extracting Control Points
sg0.extract_cp(max_rings=5, pts_per_deg=2, Imin=np.max(calib_max_flat)/400)
Optimization now leads to a way better radial integration:
Indeed, the control points were slightly more significant:
Thus, the fit is not perfect but still better than the previous one:
That's an example of why it is still unclear how to find an automated way of extracting control points. This could be done by some preprocessing of the powder beforehands (such as thresholding).
Following last update, a new strategy has been developed for geometry calibration of experiments. This new strategy has been tested on mfxx49820 run 8 as well as mecl1016522 run 8. Since geometries in those experiments were rather flat, verification of the geometry calibration workflow will need some test on real tilted geometries to confirm the robustness of the workflow in the case where rotations truly matter.
Some points will be first discussed and then the overall workflow will be demonstrated in the case of the two previously mentionned experiments.
When converting a psana .data file, the choice has been made to take into account the small tilt angles around x, y and z. This results in having a non-flat detector, i.e. the z-component is not a constant.
Then, when it comes to writing a corresponding crystFEL .geom file, the coffset parameter is therefore not a constant, the _geometryconvert function is able to write that accordingly.
To ensure the whole loop makes sense, when writing the pyFAI corner arrays, choice has been made to center the z-component around z=0. This ensures keeping the fact that the detector is not flat, while making sure pyFAI is not troubled by an off-set in the z-component, meaning pyFAI will find the true distance between the PONI and the origin.
This far, zeroing-out the mean of the z-component has worked fine and has provided a robust way of treating different detector geometries.
While trying to solve a way of a writing new .geom files from the PONI parameters when rotations are not fixed to 0, it came to me that I needed to do some trigonometry in order to express the center of rings from the PONI parameters. Indeed, when calibrating, the geometry is sent to the center of the rings, not the PONI. Hence, when there are no rotations, center of rings = PONI, so no need to bother with that. But, when rotations come in, the PONI can be way-off the center of rings in terms of pixels (depending on the distance and rotations as we will see).
So, here is a scheme to explain the importance of differentiating PONI and center of the rings when in presence of rotations:
On this image, we can clearly see a shift between the PONI and the center of the rings, a little trigonometry then helps us to find the formula to express this center in terms of the PONI parameters:
This correction has been implemented in the function PONI_to_center inside the PyFAI_to_CrystFEL class.
This is what pyFAI found after optimization:
1 - Panels
2 - Radial Integration
After writing the new geom file:
1 - Panels
2 - Radial Integration
This is what pyFAI found after optimization:
1 - Panels
2 - Radial Integration
After writing the new geom file:
1 - Panels
2 - Radial Integration
Analysis of PyFAI geometry optimization
Context
Following the geometry calibration on experiment mfxx49820 run 8 (see #197), we would like to translate the custom-made geometry optimization with a more-globally used library pyFAI. Using pyFAI would allow more robustness and rapidity in finding the optimal geometry of an experiment.
Geometry format
LCLS is using the CrystFEL format, so first one need to parse the CrystFEL geometry in a format that PyFAI can digest. Then optimizing through PyFAI reduces to finding the optimal 6 parameters defining the coordinates of the detector reference frame compared to the origin being the sample position (3 translations and 3 rotations). See https://www.silx.org/doc/pyFAI/dev/geometry.html#geometry for more details. In the following, we will then try to optimize the geometry by playing on the 3 translation parameters: poni1, poni2 and dist (PONI standing for "Point of Normal Incidence" = the orthogonal projection of the sample position onto the detector plane) (We make the assumption that we are in a transmission setting meaning all rotations are 0).
Optimizing the Geometry
Ground Truth
To set the ground truth for making sure PyFAI finds the right geometry, we refer to the analysis of #197 where the center (in pixels) of the geometry was found to be
with all rotations being 0.
In PyFAI reference frame, this coordinate corresponds to approximately (x=0.5, y=5)
Optimization
PyFAI struggles to find the optimal geometry with only one shot. So, we will apply the optimization recursively until convergence.
Initialization
Geometry Format Conversion
First, need to convert the CrystFEL format into a PyFAI-friendly format:
Load Data
Load the calibrated images:
Here, we need to reshape the _calibavg (in (Np, Nx, Ny)) to the shape supported by PyFAI *(NpNx, Ny)**
Set Experiment Initial Conditions
Define the guessed dist, poni1, poni2 and the wavelength used.
Calling the Geometry fitting
Here, we voluntarily defined parameters to be way off the true center. We get:
Instantiating the PyFAI Calibrant
Here, this was Silver Behenate:
Optimization Loop
Below is what the optimization looks like. First, it instantiated the guessed geometry. Then it passes through a SingleGeometry object that extracts control points between raw data and calibrant data. Using those control points, a new fitting is finally set and we repeat until convergence.
Results
For n=5 loops, we obtained the following convergence:
This pretty good for a start, not so far away from the ground truth center. However, PyFAI fails to find the best sample-detector distance which is dist=280mm. We can see that the more loops we perform, the more control points fit right the rings, which helps reaching convergence.
Next Steps
Finding why dist cannot be optimized ?
Finding best hyperparameters for _extractcp
In this optimization, the most crucial function is _extractcp which automatically extract control points on rings. We can play on the number of points we want on 3 hyperparameters: _maxrings selects the furthest rings where to extract control points _pts_perdeg selects the "spacing" or "resolution" between two extracted control points Imin specifies a minimum intensity for which points below that intensity cannot be extracted
Testing robustness
Playing with different initialization conditions, setting the guessed geometry way way too far from the ground truth may result in divergence (maybe too few control points).