Closed mjo22 closed 6 months ago
I feel like there are multiple factors missing. That said, it's a bit tricky to make sure we have them all without going into the math you are using for the detector model (I'm afraid I haven't been staying very caught up there). Moreover, I'm afraid the linked cisTEM code wasn't helpful for me (maybe I was looking in the wrong places). Do you have a mathematical reference for your model you could share? If it helps to have something to refer to, I've been looking at Chapter 4.3 of the IUCr International Tables for Crystallography, Volume C.
The error function is an interesting question. I think in the end it comes down to how finely we need to parameterize our volumes (in real space) due to other factors, e.g. to avoid spurious correlations as a function of rotation. I feel like we should get some numerical tests on the effect of resolution before we commit one way or the other.
The reference used for the cisTEM scattering potential seems to be the appendix of Vulovic et al 2013 (https://pubmed.ncbi.nlm.nih.gov/23711417/). I believe that the object we are after is $\sigma V_z(x, y, z)$, given in Eq. 22. We should carefully work this out...
Re: the error function and resolution. The reason it is important for cisTEM may be for the purpose of high-resolution 2D template matching. If this is true, it may become important for me as well. But as you say, I think it makes sense to make a small correction to the current code, run some tests, and then build from there!
Regardless, a parameter we may need to include is some kind of downsampling factor. Say we ask for a voxel size of 1.0 A but specify a downsampling factor of 2: we would then create a voxel grid at 0.5 A and downsample it to 1.0 A. People seem to do this regardless of whether they use the error function or a Gaussian.
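As a sketch of that downsampling step (plain NumPy, with block averaging as a hypothetical choice of downsampling scheme; cryojax might prefer Fourier cropping instead):

```python
import numpy as np

def downsample_by_block_average(grid, factor):
    """Downsample a 3D voxel grid by averaging non-overlapping blocks.

    E.g. a grid rasterized at 0.5 A with factor=2 yields a 1.0 A grid.
    Assumes each dimension is divisible by `factor`.
    """
    nz, ny, nx = grid.shape
    return grid.reshape(
        nz // factor, factor, ny // factor, factor, nx // factor, factor
    ).mean(axis=(1, 3, 5))

# Toy example: rasterize at 0.5 A on a 64^3 grid, then downsample to 1.0 A.
fine = np.random.default_rng(0).normal(size=(64, 64, 64))
coarse = downsample_by_block_average(fine, 2)
print(coarse.shape)  # (32, 32, 32)
```

Block averaging preserves the mean of the grid exactly, which seems like a reasonable property for a potential, but the right scheme presumably depends on the rest of the pipeline.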
Hello, Kawamura here.
Just looking into this.
Regarding using the error function(s) instead of a Gaussian, I have looked into this numerically.
It's related to how small the pixel size is versus the decay length of the Gaussian. Think area under the curve from integral calculus.
The Gaussian drops off to exp(-x^2/2) when we are x standard deviations away. That's exp(-4.5) ≈ 1% at 3 standard deviations!
So if the pixel size is too large, the Gaussian decays to near zero within a single pixel, and sampling at the pixel center misrepresents it. Hence we need to integrate the Gaussian density inside the pixel, and hence the (difference of) error functions. You get a difference in each of the x and y directions, i.e. (erf_1_x - erf_2_x) and (erf_1_y - erf_2_y), so 4 error function evaluations per pixel.
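A small 1D numerical illustration of this point (NumPy/SciPy, with a made-up sigma and pixel size): point-sampling the Gaussian at pixel centers mis-estimates the per-pixel mass once the pixel is large compared to the decay length, while the erf difference integrates it exactly.

```python
import numpy as np
from scipy.special import erf

def integrated_gaussian_1d(x_centers, sigma, pixel_size):
    """Mass of a unit-normalized Gaussian inside each pixel,
    computed as a difference of error functions."""
    a = (x_centers + pixel_size / 2) / (sigma * np.sqrt(2))
    b = (x_centers - pixel_size / 2) / (sigma * np.sqrt(2))
    return 0.5 * (erf(a) - erf(b))

sigma = 0.3   # Gaussian width in A (hypothetical)
pixel = 1.0   # pixel size in A, large compared to sigma
x = np.arange(-5.0, 5.0 + pixel, pixel)   # pixel centers

# Mass estimated by value-at-center times pixel width...
sampled = pixel * np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
# ...versus the exact integral over each pixel via erf differences.
integrated = integrated_gaussian_1d(x, sigma, pixel)

# integrated.sum() is ~1.0 (all the mass); sampled.sum() overshoots
# because the Gaussian decays well inside the central pixel.
print(sampled.sum(), integrated.sum())
```

In 2D the separable integral is just the product of the x and y erf differences, which is where the 4 evaluations come from.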
So either you have very tiny pixels and point sampling is accurate enough, or you need the error functions to integrate over each pixel.
Here are some notes on Cowley and Peng https://github.com/mjo22/cryojax/issues/179
@jkawamura8 @ehthiede and @mjo22 After a few hours of making notes on the refs in #179, I have a concrete procedure for getting the units of $a_i$ and $b_i$ right:

1. Multiply $b_i s^2$ by the_right_factors.
2. Sum up $f_{\text{atom}} = \sum_{i=1}^{5} a_i \exp(\text{the\_right\_factors} \cdot b_i s^2)$ to get $f^B$ for the right atom.
3. The potential distribution is defined from 2. via equation 4.3.1.32. This equation says the Fourier transform of the real-space potential is $\sum_{\text{atom}} f_{\text{atom}} \exp(2 \pi i \, h \cdot r_{\text{atom}})$.
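As a sketch, that sum can be evaluated directly for a handful of atoms (hypothetical positions and scalar form factors, and an assumed sign convention for the phase, which is exactly the kind of thing that needs to be pinned down):

```python
import numpy as np

def structure_factor(h, atom_positions, atom_form_factors):
    """F(h) = sum_atom f_atom * exp(2*pi*i * h . r_atom),
    as in equation 4.3.1.32 (sign convention assumed)."""
    phases = np.exp(2j * np.pi * atom_positions @ h)
    return np.sum(atom_form_factors * phases)

# Hypothetical example: two identical atoms with scalar form factor 1.0,
# separated by half a cell along x.
r = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0]])  # fractional coordinates
f = np.array([1.0, 1.0])
h = np.array([1.0, 0.0, 0.0])

F = structure_factor(h, r, f)
print(F)  # exp(0) + exp(i*pi) = 1 - 1 = 0 for this half-cell shift
```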
Consider the function `evaluate_3d_real_space_gaussian`: https://github.com/mjo22/cryojax/blob/df1407634d546574fdac242e596eb2077cd29cbb/src/cryojax/simulator/_potential/_voxel_potential.py#L564 https://github.com/mjo22/cryojax/blob/df1407634d546574fdac242e596eb2077cd29cbb/src/cryojax/simulator/_potential/_voxel_potential.py#L589-L593 It takes the real-space Fourier transform of 3. The factors need to match with
a. The convention of the Fourier transform used
b. How you Fourier transform a Gaussian.
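For (a), one way to pin down the convention is to verify a known Fourier pair numerically. Under the convention $F(k) = \int f(x) e^{-2\pi i k x} dx$, the function $e^{-\pi x^2}$ is its own transform; a NumPy sketch of that check (the scaling and shifts are what encode the convention):

```python
import numpy as np

# Numerically verify the Fourier pair exp(-pi x^2) <-> exp(-pi k^2)
# under the convention F(k) = integral f(x) exp(-2*pi*i*k*x) dx,
# approximated by a scaled, shifted DFT (numpy's FFT convention).
n, dx = 1024, 0.05
x = (np.arange(n) - n // 2) * dx
f = np.exp(-np.pi * x**2)

k = np.fft.fftshift(np.fft.fftfreq(n, d=dx))
# ifftshift puts x=0 at index 0 for the DFT; dx scales the sum
# into an approximation of the continuous integral.
F_num = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f))) * dx
F_exact = np.exp(-np.pi * k**2)

print(np.max(np.abs(F_num - F_exact)))  # near machine precision
```

The same check under a different sign or $2\pi$ convention would fail, which is the point: whatever convention cryojax uses has to be matched by the analytic factors.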
The absolute values of the $a_i$ factors can be determined by matching to some standard, as in Himes and Grigorieff (2021), Figures 2, 3, and 4 (see below).
A suggested test for `evaluate_3d_real_space_gaussian` is to take its Fourier transform (numerically, with the convention in cryojax) and then compare this to a hard-coded equation 4.3.1.32 for a few atoms, with values lifted from table 4.3.1.32 at a specific resolution bin (i.e. in sin(theta)), or to a Gaussian in Fourier space built from the gemmi $(a_i, b_i)$ values.
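A 1D sketch of such a test (with made-up $(a_i, b_i)$ values rather than ones from the tables, and the $e^{-2\pi i s x}$ convention assumed): rasterize the real-space counterpart of $\sum_i a_i e^{-b_i s^2}$, FFT it, and compare against the analytic Fourier-space sum of Gaussians.

```python
import numpy as np

# Under F(s) = integral f(x) exp(-2*pi*i*s*x) dx, the real-space
# counterpart of a*exp(-b*s^2) is a*sqrt(pi/b)*exp(-pi^2*x^2/b).
# The (a_i, b_i) below are hypothetical, not from table 4.3.1.32.
a_i = np.array([1.0, 0.5])
b_i = np.array([4.0, 10.0])   # A^2, made up

n, dx = 2048, 0.05
x = (np.arange(n) - n // 2) * dx
f = sum(a * np.sqrt(np.pi / b) * np.exp(-np.pi**2 * x**2 / b)
        for a, b in zip(a_i, b_i))

s = np.fft.fftshift(np.fft.fftfreq(n, d=dx))
F_num = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f))).real * dx
F_exact = sum(a * np.exp(-b * s**2) for a, b in zip(a_i, b_i))

print(np.max(np.abs(F_num - F_exact)))  # agreement to high precision
```

The real 3D version would do the same against the cryojax voxel grid, but the structure of the test is identical.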
I was reading more on this. To be clear, this issue is about implementing the right physical model of the object's phase shift distribution, given its potential. Section 69.2.3 in the Hawkes and Kasper book (see here) describes exactly this. Renaming this issue appropriately.
Usually in cryo-EM scattering theory, the phase shift distribution is taken to be proportional to the projected potential, up to dimensional factors. This is equation 69.34(b) in the book and is exactly what cisTEM and other cryo-EM simulators use. We should analyze whether this is indeed the right approximation to make.
There are two things to do, I think.
In PR #200 I've worked out a refactor that should allow us to implement a fix to this issue!
The big idea of it is that we need to make the wavelength a parameter visible to many parts of the modeling pipeline (see #196 for some discussion on this). This involved moving the wavelength upstream to the `Instrument`.
Need to make sure that rasterized voxel grid scattering potentials have the necessary factors out front, so that their 2D projection images represent phase shifts in the exit plane. See the function ReturnScatteringPotentialOfAVoxel here: https://github.com/timothygrant80/cisTEM/blob/master/src/core/scattering_potential.h
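A minimal sketch of what that proportionality looks like in code (NumPy, with a toy grid and a placeholder sigma; the function name is mine, and getting the actual dimensional factors in sigma right is exactly the open task here):

```python
import numpy as np

def phase_shift_from_projected_potential(projected_potential, sigma):
    """Weak-phase-object phase shift eta(x, y) = sigma * V_z(x, y),
    where V_z is the potential integrated along the beam axis and
    sigma is the wavelength-dependent interaction parameter.
    Sketch of the proportionality in eq. 69.34(b); the exact
    definition and units of sigma are what this issue must fix."""
    return sigma * projected_potential

# Hypothetical usage: project a rasterized potential along z
# (Riemann sum weighted by the voxel size), then scale by a
# placeholder sigma.
voxel_size = 1.0                    # A
potential = np.ones((8, 8, 8))      # toy voxel grid
projected = potential.sum(axis=0) * voxel_size
eta = phase_shift_from_projected_potential(projected, sigma=1e-3)
print(eta.shape)  # (8, 8)
```

With the wavelength moved onto the `Instrument` in #200, sigma would presumably be computed there and passed into the projection step.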
Also, there is a question in my mind of whether or not we should be using the error function to evaluate voxel grids at a given pixel. For reasons, see here: https://journals.iucr.org/m/issues/2021/06/00/rq5007/index.html