airr-community / ogrdb

Website and associated database for managing submissions of inferred alleles

Don't include orphon genes in statistics #65

Closed (williamdlees closed 4 years ago)

williamdlees commented 5 years ago

From Andrew Collins:

One feature that I would suggest changing is the inclusion of orphon genes (e.g. IGHV1/OR15-1*01). This is something I discussed with Gur today at length. VDJbase reports a surprising number of genotypes that include orphons, but I find this almost impossible to believe. These are sequences that are located on other chromosomes, and their incorporation into a VDJ rearrangement would require a process that has never been described. Given Lanzavecchia's talk at Genoa, I know that nothing can ever be ruled out, but I would say that until an analysis confirms the alignments as real/reasonable, they should not appear in either VDJbase or OGRDB.

Their inclusion puzzles me. It does not seem that pseudogenes or ORFs are included, but orphons are. If anything, I would include pseudogenes, though they would need to be associated with very low thresholds on the Statistics page. (I have seen them expressed at a frequency of about 1 in 10,000 sequences.)

williamdlees commented 5 years ago

And: "Certainly the inclusion of OR in the gene name identifies it as an orphon, but not an ORF."
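
For illustration only, here is a minimal sketch of how orphons could be filtered out by name before statistics are computed. It is not OGRDB's actual implementation. It assumes that orphon names contain "/OR" followed by a chromosome number, as in the example IGHV1/OR15-1*01 above; the function names and the `allele_counts` mapping are hypothetical.

```python
import re

# Assumption: an orphon is recognisable by "/OR" followed by a chromosome
# number in the gene name, e.g. IGHV1/OR15-1*01. Names in which the "/"
# has been replaced by another character are not handled here.
ORPHON_PATTERN = re.compile(r"/OR\d+")


def is_orphon(allele_name: str) -> bool:
    """Return True if the allele name looks like an orphon under the assumption above."""
    return ORPHON_PATTERN.search(allele_name) is not None


def exclude_orphons(allele_counts: dict) -> dict:
    """Return a copy of a hypothetical {allele name: count} mapping without orphons."""
    return {
        name: count
        for name, count in allele_counts.items()
        if not is_orphon(name)
    }
```

Matching on the name alone says nothing about functionality, so ORFs and pseudogenes would still need to be handled separately, for example from the germline set's own annotations.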

williamdlees commented 4 years ago

Done