marcellszi / rna3db

A dataset for training and benchmarking deep learning models for RNA structure prediction
MIT License

Secondary structure? #10

Closed erik-whiting closed 4 months ago

erik-whiting commented 4 months ago

Hello, and thank you for providing this database as an open-source repository. It's very helpful.

Is there a way to retrieve secondary structure information about a chain? I do see information like this in the .o file:

[image]

But I am not familiar with this notation, e.g.:

(((((((,,<<<<___.____>>>>,<<<<<_______>>>>>,,,.,<<<<<_______>>>>>))))))):

Is the two-dimensional structure recorded anywhere in the release?

marcellszi commented 4 months ago

Hi @erik-whiting,

Thanks for your kind words and interest in RNA3DB.

The notation is “WUSS notation” (Washington University Secondary Structure notation). For a detailed description, please see pg. 107 of the INFERNAL User’s Guide.
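As a rough illustration of how the notation reads, here is a minimal Python sketch that pulls base-pair indices out of a WUSS string and flattens it to plain dot-bracket form. It is not part of RNA3DB or Infernal. The function names `wuss_to_pairs` and `wuss_to_dot_bracket` and the example string are made up for this example. It assumes that `<>`, `()`, `[]`, and `{}` mark nested base pairs and that every other character is unpaired. Pseudoknot letters such as `Aa` are treated as unpaired here. The indices are 0-based positions in the annotation string itself, so they may need to be mapped back to residues if the alignment contains gaps or insertions.

```python
# Hypothetical helpers for illustration only; not an RNA3DB or Infernal API.
OPEN_TO_CLOSE = {"<": ">", "(": ")", "[": "]", "{": "}"}
CLOSE_TO_OPEN = {close: open_ for open_, close in OPEN_TO_CLOSE.items()}


def wuss_to_pairs(wuss):
    """Return sorted 0-based (i, j) pairs for the bracket characters in a WUSS string."""
    # One stack per bracket type, so "<" only ever matches ">" and so on.
    stacks = {open_: [] for open_ in OPEN_TO_CLOSE}
    pairs = []
    for j, char in enumerate(wuss):
        if char in OPEN_TO_CLOSE:
            stacks[char].append(j)
        elif char in CLOSE_TO_OPEN:
            stack = stacks[CLOSE_TO_OPEN[char]]
            if not stack:
                raise ValueError(f"unmatched {char!r} at position {j}")
            pairs.append((stack.pop(), j))
        # Anything else (e.g. "_", "-", ",", ":", ".", "~") is left unpaired.
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def wuss_to_dot_bracket(wuss):
    """Collapse a WUSS string to dot-bracket, discarding the loop-type detail."""
    out = []
    for char in wuss:
        if char in OPEN_TO_CLOSE:
            out.append("(")
        elif char in CLOSE_TO_OPEN:
            out.append(")")
        else:
            out.append(".")
    return "".join(out)


if __name__ == "__main__":
    example = "((,<<<___>>>,<<____>>,))::"  # made-up string, not taken from RNA3DB
    print(wuss_to_pairs(example))
    print(wuss_to_dot_bracket(example))
```

The dot-bracket output drops the distinction WUSS makes between hairpin loops, multiloops, and external residues, which is usually acceptable when only the pairing pattern matters.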

Please note that we do not explicitly include secondary structure annotation as part of RNA3DB, and we currently have no immediate plans to include it. The structures that you find in the .o files are Infernal's annotation found via structural alignment. These do not come from the 3D structure.

If you want to extract the base pairs from the 3D structure, you could use a tool such as RNAview, or R-scape (which uses RNAview internally but can output directly in WUSS notation).

erik-whiting commented 4 months ago

Thank you!