kmayerb / tcrdist3

flexible CDR based distance metrics
MIT License

Some of the mouse TRBV genes are missing #44

Open marshelma opened 3 years ago

marshelma commented 3 years ago

I got these warnings with TRBV18 and TRBV25:

/Users/max2/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:466: UserWarning: TRBV18*01 gene was not recognized in reference db no cdr seq could be inferred
  f0 = lambda v : self._map_gene_to_reference_seq2(gene = v,
/Users/max2/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:466: UserWarning: TRBV25*01 gene was not recognized in reference db no cdr seq could be inferred
  f0 = lambda v : self._map_gene_to_reference_seq2(gene = v,
/Users/max2/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:470: UserWarning: TRBV18*01 gene was not recognized in reference db no cdr seq could be inferred
  f1 = lambda v : self._map_gene_to_reference_seq2(gene = v,
/Users/max2/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:470: UserWarning: TRBV25*01 gene was not recognized in reference db no cdr seq could be inferred
  f1 = lambda v : self._map_gene_to_reference_seq2(gene = v,
/Users/max2/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:474: UserWarning: TRBV18*01 gene was not recognized in reference db no cdr seq could be inferred
  f2 = lambda v : self._map_gene_to_reference_seq2(gene = v,
/Users/max2/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:474: UserWarning: TRBV25*01 gene was not recognized in reference db no cdr seq could be inferred
  f2 = lambda v : self._map_gene_to_reference_seq2(gene = v,
/Users/max2/anaconda3/envs/tcrdist3/lib/python3.8/site-packages/tcrdist/repertoire.py:190: UserWarning: Not all cells/sequences could be grouped into clones.2 of 1064 were not captured. This occurs when any of the values in the index columns are null or missing for a given sequence. To see entries with missing values use: tcrdist.repertoire.TCRrep._show_incomplete()
  self.deduplicate()

kmayerb commented 3 years ago

Currently, the only supported genes are those listed in: https://github.com/kmayerb/tcrdist3/blob/master/tcrdist/db/alphabeta_gammadelta_db.tsv

Pseudogenes (marked P in the IMGT gene table) may not currently be available: http://www.imgt.org/IMGTrepertoire/index.php?section=LocusGenes&repertoire=genetable&species=Mus_musculus&group=TRBV

TRBV18*01 - Fct: FUNCTIONALITY P: Pseudogene
TRBV25*01 - Fct: FUNCTIONALITY P: Pseudogene

http://www.imgt.org/genedb/multipleEntries.action;jsessionid=C8364C79DC4919D4E11A630D7AC4002F
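One way to catch unsupported genes before building a repertoire is to compare the V-gene names in the input against the gene identifiers in the reference file linked above. The following is only a minimal sketch, not part of tcrdist3: the helper names `load_supported_genes` and `find_unsupported_genes` are hypothetical, the assumption that the reference TSV keeps gene names in a column called `id` should be checked against the actual file, and the small in-memory table stands in for the real database.

```python
import csv
import io


def load_supported_genes(tsv_file, id_col="id"):
    """Collect gene names from a tab-separated reference table.

    `id_col` is an assumed column name; verify it against the real file.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return {row[id_col] for row in reader if row.get(id_col)}


def find_unsupported_genes(observed_genes, supported_genes):
    """Return the sorted gene names that are absent from the reference set.

    Missing values (None or empty strings) are ignored.
    """
    observed = {gene for gene in observed_genes if gene}
    return sorted(observed - set(supported_genes))


# Stand-in for the reference TSV (hypothetical two-row excerpt, not real data).
fake_db = io.StringIO(
    "id\torganism\n"
    "TRBV1*01\tmouse\n"
    "TRBV13-1*01\tmouse\n"
)
supported = load_supported_genes(fake_db)

# V-gene names as they might appear in an input table, in IMGT allele notation.
observed = ["TRBV1*01", "TRBV18*01", "TRBV25*01", None]
missing = find_unsupported_genes(observed, supported)
print(missing)
```

To run the same check against the real database, open the downloaded TSV with `open(path, newline="")` and pass the file object to `load_supported_genes`. Any names it reports can then be looked up in the IMGT gene table to see whether they are pseudogenes.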