cryo-et-standards / projection-model

A standard cryo-ET projection model

Interchange format for projection / deformation models #1

Open dtegunov opened 7 months ago

dtegunov commented 7 months ago

Hey everyone!

I don't think there is much hope that any of us with skin in the game are going to agree to implement the same model, no matter what it is. Instead, I'd like to propose a way of communicating the effective behavior of any model in a model-agnostic way that is somewhat future-proof.

Each package should implement a command-line tool (and proactively generate the output of such a tool) that, for a set of 3D positions, writes out the corresponding absolute 2D positions in a defined set of tilt images, together with the defocus values and Euler angles for each position in each image.

Basically, given an evenly spaced 3D grid of positions, the outputs should be such that if you extract an image series for each position, put it through relion_reconstruct (popular enough to count as a standard here) with the reported defocus values and Euler angles, and combine the resulting reconstructions into a single volume, you get exactly the tomogram the original package would have reconstructed.

Any other package that wants to match the effective behavior of another package then takes these input/output pairs, and gradient-descends (or whatever it wants to do) the parameters of its own model to fit those pairs as well as it can. A perfect fit won't be possible between some models, but you'll know how imperfect it is based on the residuals. Each tool is free to request as many input/output pairs as it thinks are needed to fit its model well. If you can fit everyone else's model behavior – great. If you come up with a new idea that can't be fitted well by other models – they had better catch up, or be left reporting large residuals.
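The fitting step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the receiving package's model is taken to be a single affine 3D-to-2D map per tilt image (a 2x4 matrix), the function names `fit_affine_projection` and `residuals` are hypothetical, and the "other package" is simulated by a synthetic 40 degree tilt. A real package would fit its own parameters instead.

```python
import math

def fit_affine_projection(points_3d, points_2d, steps=500, lr=0.1):
    """Fit [u, v] = M @ [x, y, z, 1] (M is 2x4) by plain gradient descent."""
    # Normalise coordinates to [-1, 1] so one learning rate suits any extent.
    scale = max(abs(c) for p in points_3d for c in p) or 1.0
    rows = [[c / scale for c in p] + [1.0] for p in points_3d]
    params = [[0.0] * 4 for _ in range(2)]
    n = len(rows)
    for _ in range(steps):
        grads = [[0.0] * 4 for _ in range(2)]
        for p, target in zip(rows, points_2d):
            for axis in range(2):
                err = sum(w * c for w, c in zip(params[axis], p)) - target[axis]
                for k in range(4):
                    grads[axis][k] += 2.0 * err * p[k] / n  # d(MSE)/d(param)
        for axis in range(2):
            for k in range(4):
                params[axis][k] -= lr * grads[axis][k]
    # Undo the normalisation on the linear terms; the offset is unchanged.
    return [[w / scale for w in row[:3]] + [row[3]] for row in params]

def residuals(matrix, points_3d, points_2d):
    """Per-point distance between predicted and reported 2D positions."""
    out = []
    for p, target in zip(points_3d, points_2d):
        hp = list(p) + [1.0]
        u, v = (sum(w * c for w, c in zip(row, hp)) for row in matrix)
        out.append(math.hypot(u - target[0], v - target[1]))
    return out

# Synthetic stand-in for another package: 40 degree tilt about Y plus a shift.
theta = math.radians(40.0)
axis_values = (-100.0, 0.0, 100.0)
grid = [(x, y, z) for x in axis_values for y in axis_values for z in axis_values]
observed = [(x * math.cos(theta) + z * math.sin(theta) + 3.0, y - 2.0)
            for x, y, z in grid]

model = fit_affine_projection(grid, observed)
worst = max(residuals(model, grid, observed))
print(f"largest residual: {worst:.3g} Angstrom")
```

When the source model is richer than the fitted one (for example, it includes local deformation), the same `residuals` call is what quantifies how imperfect the match is.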

This isn't elegant, but I think this brute-force approach is the most effective and unrestrained way of communicating our models that we can realistically agree on.

The input positions should be defined in Angstrom, relative to the volume center, like the rlnCenteredCoordinate(X/Y/Z)Angst convention Sjors @scheres introduced in RELION recently. This avoids the usual scenario of everything breaking when the tomogram dimensions change. Let's rename the labels to centeredCoordinate(X/Y/Z)Angst to avoid any branding. I think the produced 2D positions should also be in a centered coordinate frame since there are additional things like cropping that some packages may be applying that change the image dimensions.
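To make the centered convention concrete, here is a minimal sketch of converting between pixel positions and centered Angstrom coordinates. The function names are hypothetical, and placing the center at exactly half the dimension is an assumption: packages may differ by half a pixel, so this should be checked against the package being matched.

```python
def to_centered_angstrom(position_px, dims_px, pixel_size_a):
    """Pixel position -> Angstrom offset from the center (assumed at dims / 2)."""
    return tuple((p - d / 2.0) * pixel_size_a for p, d in zip(position_px, dims_px))

def from_centered_angstrom(centered_a, dims_px, pixel_size_a):
    """Inverse of to_centered_angstrom, under the same center assumption."""
    return tuple(c / pixel_size_a + d / 2.0 for c, d in zip(centered_a, dims_px))

# The same physical point before and after padding the volume by 100 px per side
# in X and Y: the pixel position changes, the centered coordinate does not.
before = to_centered_angstrom((600, 500, 100), (1000, 1000, 200), 10.0)
after = to_centered_angstrom((700, 600, 100), (1200, 1200, 200), 10.0)
print(before, after)
```

This is the property that keeps positions valid when tomogram or image dimensions change, as long as the change is symmetric about the center.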

Hopefully, your raw data are tilt movies, so those need their own similar export to communicate the deformation model behavior.

Here is an example with 2 positions for brevity, but you'd usually want to sample a larger grid of positions:

inputs.star:

data_points
loop_
_centeredCoordinateXAngst #1
_centeredCoordinateYAngst #2
_centeredCoordinateZAngst #3
  -50  0  0
   50  0  0

Produces this output for one tilt series:

data_points
loop_
_centeredCoordinateXAngst #1
_centeredCoordinateYAngst #2
_centeredCoordinateZAngst #3
  -50  0  0
   50  0  0

data_tilts
loop_
_tiltID #1
_imageName #2
_voltage #3
_cs #4
_phaseShift #5
   1  2Dvs3D_53-1_00041_-40.0_Jul31_11.05.12.tif  300.00  2.6353  0.00
   2  2Dvs3D_53-1_00040_-38.0_Jul31_11.04.30.tif  300.00  2.6353  0.00
...
  40   2Dvs3D_53-1_00038_38.0_Jul31_11.03.02.tif  300.00  2.6353  0.00
  41   2Dvs3D_53-1_00039_40.0_Jul31_11.03.42.tif  300.00  2.6353  0.00

data_mappings
loop_
_pointID #1
_tiltID #2
_centeredCoordinateXAngst #3
_centeredCoordinateYAngst #4
_angleRot #5
_angleTilt #6
_anglePsi #7
_defocusU #8
_defocusV #9
_defocusAngle #10
  1   1    10.1650    -8.9607     1.0818  39.3966   84.7950  28759.5371  28711.2832  25.2541
  1   2     5.9653    13.5441     0.3504  37.6814   84.1006  27043.9766  27025.8047  24.2749
...
  2  40   -55.9080   -26.0011  -179.4332  38.5334  -95.2667  21816.5391  21858.2207  23.5494
  2  41    57.1521   -89.1005   179.9853  39.5104  -93.8842  23576.7344  23772.4277  21.6167
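A consumer of this output needs little more than a loop-table reader. The sketch below is a deliberately minimal parser for files shaped like the example above, not a general STAR implementation: it assumes one `loop_` per `data_` block, no quoted values or spaces inside image names, and it skips the `...` lines that only appear because the example is abridged. The function names are hypothetical.

```python
def parse_star_loops(text):
    """Parse `data_<name>` / `loop_` tables into {name: [row dicts of strings]}."""
    tables, name, labels, in_loop = {}, None, [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "...":
            continue  # blank lines, comments, and elisions from the abridged example
        if line.startswith("data_"):
            name, labels, in_loop = line[len("data_"):], [], False
            tables[name] = []
        elif line == "loop_":
            in_loop = True
        elif in_loop and line.startswith("_"):
            labels.append(line.split()[0].lstrip("_"))  # drop the trailing "#n"
        elif in_loop and name is not None:
            tables[name].append(dict(zip(labels, line.split())))
    return tables

def positions_by_tilt(tables):
    """Group 2D positions from the mappings table as {tiltID: [(pointID, x, y)]}."""
    grouped = {}
    for row in tables["mappings"]:
        grouped.setdefault(int(row["tiltID"]), []).append((
            int(row["pointID"]),
            float(row["centeredCoordinateXAngst"]),
            float(row["centeredCoordinateYAngst"]),
        ))
    return grouped
```

Values are kept as strings in the tables and converted only where needed, so labels that a given reader does not know about pass through untouched.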

dtegunov commented 7 months ago

I went ahead and implemented this as a tool in Warp for tilt series. In addition to accepting a STAR file with positions, it can generate an evenly spaced grid with a custom extent and spacing.
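For packages that do not have such a tool yet, an evenly spaced input grid is easy to produce. The following is a sketch only and says nothing about how Warp does it: the function names and the choice to center the samples within the requested extent are assumptions. It writes the same `data_points` table layout as the inputs.star example.

```python
def centered_grid(extent_a, spacing_a):
    """Evenly spaced (x, y, z) positions in Angstrom, symmetric about the center."""
    if spacing_a <= 0:
        raise ValueError("spacing must be positive")
    axes = []
    for extent in extent_a:
        n = int(extent // spacing_a) + 1       # samples that fit in this extent
        start = -0.5 * (n - 1) * spacing_a     # keeps the samples centered
        axes.append([start + i * spacing_a for i in range(n)])
    return [(x, y, z) for z in axes[2] for y in axes[1] for x in axes[0]]

def format_points_star(points):
    """Render positions as a `data_points` loop table."""
    lines = ["data_points", "loop_"]
    for column, axis in enumerate("XYZ", start=1):
        lines.append(f"_centeredCoordinate{axis}Angst #{column}")
    for x, y, z in points:
        lines.append(f"{x:>8g}{y:>8g}{z:>8g}")
    return "\n".join(lines) + "\n"

# An extent of 100 A along X with 100 A spacing gives the two example positions.
print(format_points_star(centered_grid((100.0, 0.0, 0.0), 100.0)))
```

A realistic call would use the full tomogram extent, for instance `centered_grid((200.0, 200.0, 100.0), 50.0)`, and write the returned text to a file.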

scheres commented 7 months ago

Dear Dimitry,

Thank you very much for thinking along with this! Would angleRot, angleTilt and anglePsi be the most logical basis for a common exchange mechanism? Although commonly used in SPA and sub-tomogram averaging, as far as I am aware, no one uses this angular convention in tomographic reconstruction. At the cryo-ET standards workshop there was some appetite for exchanging 3x4 transformation matrices, although others also argued against this. I'm not sure what route to take yet, but just thought to mention this discussion.
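For readers weighing the two options, the sketch below shows how one representation can be turned into the other. It assumes the ZYZ Euler convention usually associated with the rot/tilt/psi labels in RELION; that mapping, the function names, and the way a translation column is appended are assumptions for illustration, not something this thread has agreed on.

```python
import math

def euler_to_matrix(rot_deg, tilt_deg, psi_deg):
    """3x3 rotation matrix from rot/tilt/psi in degrees (ZYZ convention assumed)."""
    a, b, g = (math.radians(v) for v in (rot_deg, tilt_deg, psi_deg))
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [
        [cg * cb * ca - sg * sa, cg * cb * sa + sg * ca, -cg * sb],
        [-sg * cb * ca - cg * sa, -sg * cb * sa + cg * ca, sg * sb],
        [sb * ca, sb * sa, cb],
    ]

def with_translation(matrix, shift_xyz):
    """Append a translation column to get a 3x4 matrix (hypothetical packaging)."""
    return [row + [s] for row, s in zip(matrix, shift_xyz)]

# Angles and 2D position from the first mapping row of the example output.
rotation = euler_to_matrix(1.0818, 39.3966, 84.7950)
transform = with_translation(rotation, (10.1650, -8.9607, 0.0))
for row in transform:
    print(" ".join(f"{value:10.4f}" for value in row))
```

Whichever form is exchanged, the conversion itself is cheap; the part that needs agreement is the convention (axis order, handedness, and what the translation column means).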

Best wishes,

Sjors
