Open ValWood opened 1 year ago
In the data file it just has a "molecule name" for the RNAs. For your example it has:
I have emailed PDB and asked if they can include RNAcentral IDs. @blakesweeney @afg1
Hi @blakesweeney and @afg1
At the moment the data table we use is downloaded from the PDBe Advanced search after querying for genus "Schizosaccharomyces" and selecting all datatypes: https://github.com/pombase/pombase-chado/wiki/Updating-the-PDB-data-file
If you would just like the Rfam connection then you can fetch that using the file we generate for PDBe each week. There is a file with the match between Rfam families and pdb chains at:
https://ftp.ebi.ac.uk/pub/databases/Rfam/.preview/pdb_full_region.txt.gz
I'm happy to explain what the file means if needed. The RNAcentral mappings are currently only updated at release time and present here:
https://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/id_mapping/database_mappings/pdb.tsv
We are working on updating some RNAcentral data weekly, but it is not yet publicly available.
Thanks very much Blake. I think pdb.tsv is what we need because we have the URS IDs in PomBase. We'll give it a go.
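For anyone trying the same thing, here is a rough sketch of how pdb.tsv could be turned into a PDB entry to URS lookup. It assumes the usual RNAcentral id_mapping layout (tab-separated: URS ID, database, external ID, taxon ID, RNA type, gene name) and that the external ID has the form `<PDB ID>_<chain>`; both should be checked against the actual file. The function name `load_pdb_to_urs` and the sample rows are made up for illustration, and 4896 is used as the NCBI taxon ID for S. pombe.

```python
import csv
from collections import defaultdict


def load_pdb_to_urs(lines, taxon_id=None):
    """Build {PDB entry ID: {(chain, URS_taxid), ...}} from id_mapping rows.

    Assumed column order (verify against the real file):
    URS ID, database, external ID, taxon ID, RNA type, gene name.
    """
    mapping = defaultdict(set)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or malformed lines
        urs, database, external_id, taxon = row[0], row[1], row[2], row[3]
        if database.upper() != "PDB":
            continue
        if taxon_id is not None and taxon != str(taxon_id):
            continue
        # Assumption: external ID looks like "<PDB ID>_<chain>".
        pdb_id, _, chain = external_id.partition("_")
        mapping[pdb_id.upper()].add((chain, f"{urs}_{taxon}"))
    return dict(mapping)


if __name__ == "__main__":
    # Made-up rows, only to show the shape of the input and the result.
    sample = [
        "URS0000000001\tPDB\t1ABC_A\t4896\tsnRNA\t",
        "URS0000000002\tPDB\t1ABC_B\t4896\tsnRNA\t",
        "URS0000000003\tPDB\t2XYZ_C\t9606\trRNA\t",
    ]
    for pdb_id, chains in load_pdb_to_urs(sample, taxon_id=4896).items():
        print(pdb_id, sorted(chains))
```

With the real file, something like `with open("pdb.tsv", encoding="utf-8") as fh: load_pdb_to_urs(fh, taxon_id=4896)` should work. The resulting `URS..._4896` identifiers could then be matched against the URS IDs already stored in PomBase.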
Actually it isn't the Rfam connection we need, it's the actual RNAcentral ID, which we would then use to identify the equivalent PomBase entity.
We list the proteins present in a structure, but not the ncRNAs. For example, this structure has U2 snRNA. I presume these have RNAcentral IDs, so we might be able to get these IDs too from a mapping?