Open blakesweeney opened 1 year ago
You might get more reuse if you use a TSV:...
accession rfam_id seedBacteria seedEukaryota seedArchaea fullEukaryota fullBacteria fullArchaea
RF00001 5S_rRNA 48.6 45.51 5.9 87.59 12.0 0.4
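For what it's worth, here is a minimal sketch of how a consumer might read such a file with Python's standard `csv` module. It assumes the columns are tab-separated and that the header matches the sample above; the function name `read_taxonomy_tsv` is made up for illustration and is not part of any Rfam tooling.

```python
import csv
import io

# Sample data copied from the comment above, with tabs assumed as the separator.
TSV_TEXT = (
    "accession\trfam_id\tseedBacteria\tseedEukaryota\tseedArchaea\t"
    "fullEukaryota\tfullBacteria\tfullArchaea\n"
    "RF00001\t5S_rRNA\t48.6\t45.51\t5.9\t87.59\t12.0\t0.4\n"
)

# Columns that stay as strings; every other column is treated as a percentage.
ID_COLUMNS = ("accession", "rfam_id")


def read_taxonomy_tsv(handle):
    """Parse a taxonomy TSV into a list of dicts (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        parsed = {}
        for key, value in row.items():
            parsed[key] = value if key in ID_COLUMNS else float(value)
        rows.append(parsed)
    return rows


rows = read_taxonomy_tsv(io.StringIO(TSV_TEXT))
```

The flat layout works because each family has a fixed set of seed/full columns, one per domain.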
If you think more people would prefer a TSV we can make one. We have talked about having more precise taxonomic assignments and I don't think that would fit as cleanly into a TSV.
We should create exports of the taxonomic assignments for all Rfam families. This is basically what is done in the rfam-taxonomy repo, but the exports should be part of our FTP. I'm thinking we should use a JSON file, which has entries for each family like:
I'm not sure if it should be one object per line (jsonl) or if there should be a single object with all families. I'm open to suggestions.
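To make the two options concrete, here is a rough sketch of writing and reading both layouts with Python's standard `json` module. The entry structure (`seed` and `full` keyed by domain) is an assumption made up for illustration, not the proposed schema; the numbers are the RF00001 values quoted elsewhere in this thread.

```python
import io
import json

# Hypothetical entry layout; field names are assumptions, values are the
# RF00001 percentages quoted in this thread.
families = [
    {
        "accession": "RF00001",
        "rfam_id": "5S_rRNA",
        "seed": {"Bacteria": 48.6, "Eukaryota": 45.51, "Archaea": 5.9},
        "full": {"Eukaryota": 87.59, "Bacteria": 12.0, "Archaea": 0.4},
    },
]


def to_jsonl(entries):
    """Option 1: one JSON object per line."""
    return "".join(json.dumps(entry) + "\n" for entry in entries)


def to_single_object(entries):
    """Option 2: a single object keyed by accession."""
    return json.dumps({entry["accession"]: entry for entry in entries})


def iter_jsonl(handle):
    """Stream entries one at a time without loading the whole file."""
    for line in handle:
        if line.strip():
            yield json.loads(line)


jsonl_text = to_jsonl(families)
single_text = to_single_object(families)

streamed = list(iter_jsonl(io.StringIO(jsonl_text)))
loaded = json.loads(single_text)
```

JSONL can be streamed and filtered line by line with standard text tools, while the single object gives direct lookup by accession but has to be parsed in full before use.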