pombase / website

PomBase website v2

Rfam matches for RNAs #39

Open mah11 opened 8 years ago

mah11 commented 8 years ago

For non-coding RNA genes, search against Rfam and display and link matches (analogous to InterProScan for protein-coding genes).

kimrutherford commented 4 years ago

> For non-coding RNA genes, search against Rfam and display and link matches (analogous to InterProScan for protein-coding genes).

For the protein-coding genes we can download most of what we need from InterPro. Is there an equivalent for RNA?

ValWood commented 4 years ago

I'm sure @AntonPetrov will tell us ;)

AntonPetrov commented 4 years ago

Glad you are thinking about showing Rfam annotations!

If you would like to compute Rfam matches on your end, here are the docs: https://docs.rfam.org/en/latest/genome-annotation.html

Alternatively, you could download Rfam data for PomBase sequences from RNAcentral: ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/rfam/

For what it's worth, here is how we display the Rfam annotations on sequence pages:

[Screenshot 2020-04-21 at 09 54 03: Rfam annotations as displayed on a sequence page]

Let me know if you have any questions or run into any problems!

kimrutherford commented 4 years ago

Hi Anton. Thanks for replying.

> Alternatively, you could download Rfam data for PomBase sequences from RNAcentral:

Could you let me know how to associate the IDs in rfam_annotations.tsv.gz with PomBase IDs?

Thanks!

AntonPetrov commented 4 years ago

Hi Kim - sure thing! The mapping can be found in the following file:

ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/id_mapping/database_mappings/pombase.tsv

Hope it helps!
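
For readers following along, here is a minimal sketch of reading such a mapping file into a lookup from RNAcentral IDs to PomBase IDs. The column layout (URS ID, database name, external ID, taxon ID, RNA type, gene name) is an assumption based on the usual RNAcentral `id_mapping` format, so check it against the real file. The function name `load_pombase_mapping` and every ID in the sample rows are invented for illustration.

```python
from collections import defaultdict


def load_pombase_mapping(lines):
    """Return {RNAcentral URS ID: set of PomBase IDs}.

    Assumed tab-separated columns (not confirmed in this thread):
    URS ID, database name, external ID, taxon ID, RNA type, gene name.
    """
    urs_to_genes = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        urs_id, external_id = fields[0], fields[2]
        # One URS can map to more than one gene, so collect a set.
        urs_to_genes[urs_id].add(external_id)
    return dict(urs_to_genes)


# Made-up rows in the assumed layout; these are not real records.
SAMPLE_MAPPING = [
    "URS0000000001\tPOMBASE\tSPNCRNA.9001\t4896\tncRNA\t\n",
    "URS0000000002\tPOMBASE\tSPNCRNA.9002\t4896\tsnoRNA\t\n",
    "URS0000000002\tPOMBASE\tSPNCRNA.9003\t4896\tsnoRNA\t\n",
]

mapping = load_pombase_mapping(SAMPLE_MAPPING)
print(mapping)
```

With the real file, the same function could be fed an open file handle, for example `load_pombase_mapping(open("pombase.tsv"))`.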

kimrutherford commented 4 years ago

Perfect. Thanks Anton.

kimrutherford commented 4 years ago

> The mapping can be found in the following file: ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/id_mapping/database_mappings/pombase.tsv

Sorry it's taken me so long to work on this. As a first step, our database loading code now reads and stores the RNAcentral IDs from the database mapping file.

Next steps are to pull the pombe data out of the main RNAcentral data file (ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/rfam/rfam_annotations.tsv.gz) and then decide how to display things.
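
A rough sketch of that extraction step, for illustration only and not the actual PomBase loader: keep only the annotation rows whose RNAcentral ID is one of the IDs known for pombe. The nine-column layout (URS ID, Rfam model ID, score, E-value, sequence start, sequence stop, model start, model stop, Rfam model description) is an assumption to verify against the real file. The function name `pombe_rfam_hits` and all sample values are invented.

```python
def pombe_rfam_hits(annotation_lines, pombe_urs_ids):
    """Return Rfam hits for the given RNAcentral IDs as a list of dicts.

    Assumed tab-separated columns (not confirmed in this thread):
    URS ID, Rfam model ID, score, E-value, sequence start,
    sequence stop, model start, model stop, Rfam model description.
    """
    hits = []
    for line in annotation_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[0] not in pombe_urs_ids:
            continue  # malformed row, or not a pombe sequence
        hits.append({
            "urs_id": fields[0],
            "rfam_id": fields[1],
            "score": float(fields[2]),
            "evalue": float(fields[3]),
            "seq_start": int(fields[4]),
            "seq_stop": int(fields[5]),
            "description": fields[8],
        })
    return hits


# Made-up rows in the assumed layout; these are not real records.
SAMPLE_ANNOTATIONS = [
    "URS0000000001\tRF99999\t55.2\t1.5e-10\t3\t120\t1\t118\tExample family A\n",
    "URS00000000FF\tRF99998\t40.0\t2.0e-05\t1\t80\t1\t80\tExample family B\n",
]

hits = pombe_rfam_hits(SAMPLE_ANNOTATIONS, {"URS0000000001"})
print(hits)
```

The real file is gzip-compressed, so it could be streamed without unpacking by passing `gzip.open("rfam_annotations.tsv.gz", "rt")` as `annotation_lines`.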

kimrutherford commented 4 years ago

> Next steps are to pull the pombe data out of the main RNAcentral data file

That's done now, so it gets processed every night along with all the other input files. We still need to decide how to display the information.

ValWood commented 3 years ago

Let's discuss on Thursday. What data items do we need to display? Is it everything in the screenshot above?

ValWood commented 3 years ago

Header: RNA families

Columns: Name, Identifier, Start, End, plus a hyperlink, e.g. http://rfam.xfam.org/search?q=SAM%20riboswitch
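
As a possible starting point for the display discussion, here is a small sketch that builds one table row with the proposed fields and a search link in the same form as the URL above. The function names, the dict keys, and the identifier `RF99999` are placeholders invented for illustration. Only the URL pattern comes from this thread.

```python
from urllib.parse import quote

RFAM_SEARCH_BASE = "http://rfam.xfam.org/search?q="


def rfam_search_url(family_name):
    """Build a search link for a family name, percent-encoding spaces."""
    return RFAM_SEARCH_BASE + quote(family_name)


def rna_family_row(name, identifier, start, end):
    """Collect the fields proposed for the 'RNA families' section."""
    return {
        "name": name,
        "identifier": identifier,
        "start": start,
        "end": end,
        "link": rfam_search_url(name),
    }


# RF99999 and the coordinates are placeholders, not real values.
row = rna_family_row("SAM riboswitch", "RF99999", 3, 120)
print(row["link"])
```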

ValWood commented 1 year ago

This seems to be partly done, so if it can be completed quickly go ahead... (We probably need to discuss display first)