ml-struct-bio / cryodrgn

Neural networks for cryo-EM reconstruction
http://cryodrgn.cs.princeton.edu
GNU General Public License v3.0

Suggestions wanted for cryoDRGN map and plot visualization in ChimeraX #134

Open tomgoddard opened 1 year ago

tomgoddard commented 1 year ago

I'm making a cryoDRGN visualization tool in ChimeraX and am interested in any suggestions users have about what it should do. So far it shows the UMAP plot with the maps computed by "cryodrgn analyze" marked as points on that plot, and you can click on a point to see its map in the 3D view. You can cycle through the precomputed maps with a slider. I think it would be nice to allow computing new maps by clicking a point on the plot, morphing between pairs of precomputed maps, and maybe making movies along paths drawn on the plot. Please add comments if you have other suggestions. Thanks!
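To make the click-to-select and morphing ideas concrete, here is a minimal sketch. It assumes the plotted points are available as an (N, 2) array and that maps are same-shaped NumPy arrays. The function names `nearest_point_index` and `morph` are hypothetical, and plain linear blending of densities is only the simplest possible morph, not necessarily what the tool does.

```python
import numpy as np

def nearest_point_index(points_xy, click_xy):
    """Return the index of the plotted point closest to a click position."""
    points_xy = np.asarray(points_xy, dtype=float)
    click_xy = np.asarray(click_xy, dtype=float)
    # Squared Euclidean distance from the click to every plotted point.
    d2 = np.sum((points_xy - click_xy) ** 2, axis=1)
    return int(np.argmin(d2))

def morph(map_a, map_b, fraction):
    """Linearly blend two density maps; fraction=0 gives map_a, 1 gives map_b."""
    return (1.0 - fraction) * map_a + fraction * map_b

# Toy data (made up for illustration): three plot points, two tiny "maps".
umap_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
picked = nearest_point_index(umap_xy, (0.9, 0.2))
halfway = morph(np.zeros((2, 2, 2)), np.ones((2, 2, 2)), 0.5)
```

Stepping `fraction` from 0 to 1 over a series of frames would give the kind of slider or movie behavior described above.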

[Screenshot: cryodrgn_viewer]
priiteek commented 1 year ago

This looks awesome! I would also implement sliding/morphing along PCs, not just the 20 k-means models. That can be really helpful in some cases to make sense of the data. And also parsing the results from analyze_landscape, which is really useful for continuous heterogeneity.
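As a rough illustration of sliding along a PC, the sketch below builds evenly spaced latent points along one principal axis. It assumes the latent embeddings are an (N, D) NumPy array. The function name `pc_traversal`, the 5th to 95th percentile range, and the number of steps are illustrative assumptions, not cryoDRGN's documented behavior.

```python
import numpy as np

def pc_traversal(z, pc_index=0, n_points=10, lo=5.0, hi=95.0):
    """Return latent points spaced evenly along one principal component."""
    z = np.asarray(z, dtype=float)
    mean = z.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(z - mean, full_matrices=False)
    axis = vt[pc_index]
    # Coordinate of every particle along the chosen axis.
    coords = (z - mean) @ axis
    start, stop = np.percentile(coords, [lo, hi])
    steps = np.linspace(start, stop, n_points)
    # Walk along the axis through the data mean.
    return mean + steps[:, None] * axis

# Toy latent space (random numbers, for illustration only).
rng = np.random.default_rng(0)
z_toy = rng.normal(size=(500, 8))
trajectory = pc_traversal(z_toy, pc_index=0, n_points=10)
```

Each row of `trajectory` is a latent point that a decoder could turn into a map, so a slider over the rows would morph along that PC.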

zhonge commented 1 year ago

@tomgoddard this is fantastic! Thank you!

I agree with @priiteek; it would also be great to view the PC trajectories and even the cryodrgn analyze_landscape results, if possible.

For viewing the PCA trajectories, it might be helpful to view the PC embeddings in the right panel (instead of the UMAP). They are currently not saved, but I can modify cryodrgn analyze to save the PC embeddings, e.g. z_pca.pkl. Let me know if you would like me to add this (or happy to approve a pull request).
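A minimal sketch of what saving the PC embeddings could look like, assuming the latent embeddings are an (N, D) array. Only the file name `z_pca.pkl` comes from the comment above. The function name `save_pc_embeddings` and the dictionary layout (embeddings, components, mean) are hypothetical and not an existing cryoDRGN format.

```python
import os
import pickle
import tempfile

import numpy as np

def save_pc_embeddings(z, out_path, n_components=2):
    """Project latent embeddings onto their top principal axes and pickle them."""
    z = np.asarray(z, dtype=float)
    mean = z.mean(axis=0)
    centered = z - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    embeddings = centered @ components.T
    # Hypothetical layout: the axes and mean are stored so a viewer
    # does not have to recompute them.
    payload = {"embeddings": embeddings, "components": components, "mean": mean}
    with open(out_path, "wb") as f:
        pickle.dump(payload, f)
    return embeddings

# Round trip on toy data (random numbers, for illustration only).
rng = np.random.default_rng(0)
z_example = rng.normal(size=(100, 8))
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "z_pca.pkl")
    saved = save_pc_embeddings(z_example, path)
    with open(path, "rb") as f:
        loaded = pickle.load(f)
```

Storing the axes alongside the per-particle coordinates would let a viewer draw the PC plot and place new points on it without rerunning the decomposition.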

cryodrgn analyze_landscape is still an experimental feature (WIP documentation here), but it is similar to cryodrgn analyze in that it also generates a set of volumes (i.e. "conformational states"), smooth interpolations, and associated embeddings ("volume PCA"). It might be straightforward to support iterating through maps from cryodrgn analyze_landscape on the left with some sort of embedding visualization on the right. Though, there would (or could) be an additional layer of data visualization from the volume sketching/clustering. I'm happy to provide more documentation or clarify any of the conventions.

I can't tell from the screenshot, but it would be useful to have the latent space point associated with the viewed density map(s) highlighted in some way.

Thanks again!

tomgoddard commented 1 year ago

I will take a look at analyze_landscape output. I have not tried that yet.

Showing a principal component plot and allowing morphs along those axes could be done. It would probably be best if I read the PCA axes and/or particle coordinates from a file. I could recompute them, but the result might be different (for instance, the axis direction might be flipped).
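The flipping concern arises because a principal axis v and its negation -v describe the same direction, so two PCA runs can legitimately disagree on sign. If the axes ever did have to be recomputed, one possible workaround is to apply a fixed sign convention. The sketch below is an assumption for illustration (the function name `fix_axis_signs` is hypothetical and this is not something cryoDRGN is known to do). Reading the axes from a file remains the more robust option.

```python
import numpy as np

def fix_axis_signs(components):
    """Flip each axis so that its largest-magnitude entry is positive."""
    components = np.array(components, dtype=float)
    rows = np.arange(len(components))
    # Position of the largest-magnitude entry in each axis.
    largest = np.argmax(np.abs(components), axis=1)
    signs = np.sign(components[rows, largest])
    signs[signs == 0] = 1.0  # leave all-zero rows unchanged
    return components * signs[:, None]

# The same two axes with opposite signs map to one canonical form.
axes = np.array([[0.6, -0.8], [-1.0, 0.0]])
canonical = fix_axis_signs(axes)
canonical_from_flipped = fix_axis_signs(-axes)
```

Per-particle coordinates along an axis would need to be multiplied by the same sign to stay consistent with the flipped axis.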

I was thinking about highlighting the point on the umap plot (e.g. in bright green) that corresponds to the displayed map. Right now I have the maps all the same color (gray) -- that is a bit less jarring when flipping between them. But if they were different colors, the markers on the umap plot could match the color.

It may take me a few days to get back to looking at cryoDRGN. I had a hard time running it on the EMPIAR-10076 example data: NaN values showed up in the downsampled images and in the loss values, and it took half a day to discover that my Samsung 870 QVO 4TB SSD was sporadically returning wrong data on reads. So I am working on testing and probably replacing that drive on our main machine learning computer.
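A quick sanity check along these lines can flag corrupted images before training. This is a minimal sketch that assumes the downsampled images are already loaded as an (N, H, W) NumPy array. The function name `nonfinite_image_indices` is hypothetical, and loading the particle stack is left out.

```python
import numpy as np

def nonfinite_image_indices(images):
    """Return indices of images that contain any NaN or infinite pixel."""
    images = np.asarray(images)
    # Flatten each image to one row so the check runs per image.
    flat = images.reshape(len(images), -1)
    bad = ~np.isfinite(flat).all(axis=1)
    return np.flatnonzero(bad)

# Toy stack of four 3x3 images with two deliberately corrupted entries.
stack = np.zeros((4, 3, 3), dtype=np.float32)
stack[2, 1, 1] = np.nan
stack[3, 0, 0] = np.inf
bad_indices = nonfinite_image_indices(stack)
```

A check like this only catches values that became non-finite. Silent bit errors that leave finite but wrong pixel values would need a different test, such as comparing file checksums against a known-good copy.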

charbj commented 1 year ago

Hi Tom,

I have a similar project - it might be worth merging our efforts. There is plenty of room for improvement in my code (refactoring, optimisation, modularity, readability, etc). I am still actively testing and debugging...

https://github.com/charbj/wiggle

tristanic commented 1 year ago

This looks awesome! For a long time I've been interested in seeing how ISOLDE could interact with cryoDRGN maps (basically, starting from a single well-refined model against the highest resolution conformation, then having it follow the map reconstructions from sensible trajectories through the latent space). Two key pieces of functionality were needed to make that happen: (a) the ability to update map data in OpenMM simulations on the fly (available in the latest OpenMM version; a little fairly straightforward work is needed to make use of it in ISOLDE), and (b) the ability to actually generate reconstructions as needed in ChimeraX. Looks like that's well on the way now - exciting!

tomgoddard commented 1 year ago

Hi Charles, Wiggle looks nice, I will try it soon. I agree we should combine efforts if we have the same goals. I should have searched for cryoDRGN / ChimeraX visualization before I started. But I have only put in a half-day on my prototype code. I am the UCSF ChimeraX developer working on machine learning, cryoEM and virtual reality, so if there is anything you need in ChimeraX to make Wiggle work better I am happy to help. Tom

tomgoddard commented 1 year ago

Hi Charles, I watched your 4 Wiggle tutorial videos on YouTube. Fantastic! I'm going to create a few issues on the Wiggle GitHub to discuss how to get everyone using it. I work on the ChimeraX cryoEM tools as part of the UCSF ChimeraX team and will help however I can.