geohot / corona

Reverse engineering SARS-CoV-2

Multi-sequence compare tool and secondary structure prediction. #6

Open michaelSkaro opened 4 years ago

michaelSkaro commented 4 years ago

Work to be done section:

-Multi-sequence compare tools from the Broad Institute. IGV: https://software.broadinstitute.org/software/igv/download

Some good command-line tools you will inevitably run into (conda/pip installable): BamTools, SAMtools, Clustal Omega (clustalo), BLAST
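As a rough illustration of what a multi-sequence comparison reports, here is a minimal Python sketch that reads an already-aligned FASTA (for example, the kind of output clustalo can write) and computes pairwise percent identity. The toy sequences and the function names `parse_fasta` and `percent_identity` are made up for this example, and skipping columns where both sequences have a gap is just one of several possible conventions.

```python
from itertools import combinations


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def percent_identity(a, b):
    """Percent identity of two aligned sequences of equal length.

    Columns where both sequences have a gap are skipped; a gap
    against a residue counts as a mismatch (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment, not real SARS-CoV-2 data.
ALIGNED = """\
>seqA
AUGGC-UAA
>seqB
AUGGCAUAA
>seqC
AUGCC-UAA
"""

if __name__ == "__main__":
    seqs = parse_fasta(ALIGNED)
    for (n1, s1), (n2, s2) in combinations(seqs.items(), 2):
        print(f"{n1} vs {n2}: {percent_identity(s1, s2):.1f}% identity")
```

For real genomes you would run the aligner first and feed its output file to `parse_fasta`; this sketch does no alignment itself.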

-Secondary structure prediction: This should be done using the RNA transcript. Protein structure prediction is still in its infancy because no one has taken post-translational modifications (PTMs) into account. Some papers to get you started:

PTMs in coronavirus: https://www.futuremedicine.com/doi/full/10.2217/fvl-2018-0008

Ponti, R. D., et al. (2020). "CROSSalive: a web server for predicting the in vivo structure of RNA molecules." Bioinformatics 36(3): 940-941.

Wang, F. Q., et al. "Comparison of Pseudoknotted RNA Secondary Structures by Topological Centroid Identification and Tree Edit Distance." Journal of Computational Biology.

Zhang, Z., et al. (2020). "Accurate inference of the full base-pairing structure of RNA by deep mutational scanning and covariation-induced deviation of activity." Nucleic Acids Research 48(3): 1451-1465.
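To get a feel for what RNA secondary structure prediction computes before diving into those papers, here is a minimal Python sketch of Nussinov-style base-pair maximization. It is a textbook dynamic program, not the method of any paper listed above: it ignores thermodynamics, cannot produce pseudoknots, and the `min_loop` default of 3 unpaired bases in a hairpin is an assumed convention. The function name `nussinov` and the toy sequence are illustrative only.

```python
def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for an RNA sequence.

    Maximizes the number of nested base pairs (Watson-Crick plus
    G-U wobble), requiring at least `min_loop` unpaired bases
    between the two partners of any pair.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = max number of pairs within seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Trace back one optimal structure as dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain any pair
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    count, fold = nussinov("GGGAAAUCC")  # toy hairpin, not viral sequence
    print(count, fold)
```

The algorithm is cubic in sequence length, so it is fine for short fragments but not a practical choice for a full ~30 kb genome.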

If you are certain you want to stay in protein prediction: PDB and ExPASy PROSITE can be helpful databases. The PRATT tool on ExPASy is super useful.
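If you end up with motifs in PROSITE pattern notation (the notation PROSITE entries use), a small converter lets you scan sequences locally. This is a minimal sketch covering only the common syntax (`x`, `[...]`, `{...}`, `(n)` / `(n,m)` repeats, `<` and `>` anchors); it does not handle anchors inside brackets such as `[G>]`. The function names and the test peptide are made up for illustration. The example pattern is the N-glycosylation site pattern `N-{P}-[ST]-{P}`.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern into a Python regex string."""
    p = pattern.strip().rstrip(".")      # drop the optional trailing period
    p = p.replace("-", "")               # element separators
    p = p.replace("{", "[^").replace("}", "]")  # {P}  -> [^P]
    p = p.replace("(", "{").replace(")", "}")   # x(2,4) -> x{2,4}
    p = p.replace("x", ".")              # x = any residue
    p = p.replace("<", "^").replace(">", "$")   # N-/C-terminal anchors
    return p


def find_motif(pattern, sequence):
    """Return (start, matched_text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    print(prosite_to_regex("N-{P}-[ST]-{P}."))
    print(find_motif("N-{P}-[ST]-{P}.", "AANGSAA"))  # made-up peptide
```

Note that `finditer` reports non-overlapping matches only, so overlapping motif occurrences would need a lookahead-based regex instead.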

Have fun!!!! MS

eds000n commented 4 years ago

I'd add this book as a useful resource: Introduction to Computational Molecular Biology

gosuto-inzasheru commented 4 years ago

https://www.nature.com/articles/s41591-020-0820-9