Closed by Ahmedalaraby20 1 year ago
Hi, it seems that in many cases the sequences in the table do not cover the full V gene, and some include a complete or partial CDR3. To use these sequences, first align them to the genome and use the recombination signal sequences (RSS) to extract the complete V and J genes. Then create one FASTA file per gene segment type (V, J and, optionally, C), listing the genes like this:
>IGHV12-348
GATGCTGGAGTTATCCAGTCACCCCGCCATGAGGTGACAGAGATGGGACAAGAAGTGACTCTGAGATGTAAACCA
ATTTCAGGCCACAACTCCCTTTTCTGGTACAGACAGACCATGATGCGGGGACTGGAGTTGCTCATTTACTTTAAC
AACAACGTTCCGATAGATGATTCAGGGATGCCCGAGGATCGATTCTCAGCTAAGATGCCTAATGCATCATTCTCC
ACTCTGAAGATCCAGCCCTCAGAACCCAGGGACTCAGCTGTGTACTTCTGTGCCAGCAGTTTAGC
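Before building the library, it may be worth sanity-checking each FASTA file. The sketch below is only an illustration: check_fasta is a hypothetical helper, not part of MiXCR, and it assumes plain nucleotide FASTA (A, C, G, T, N) with one gene name per header line. It counts records, reports duplicated gene names, and flags sequence lines that contain unexpected characters.

```shell
# Hypothetical helper (not a MiXCR command): basic sanity checks on a gene FASTA.
# Assumes nucleotide sequences using only A, C, G, T or N.
check_fasta() {
  fasta="$1"
  [ -r "$fasta" ] || { echo "cannot read $fasta" >&2; return 1; }

  # Number of records = number of header lines.
  records=$(grep -c '^>' "$fasta" || true)

  # Header lines (gene names) that occur more than once.
  dups=$(grep '^>' "$fasta" | sort | uniq -d || true)

  # Sequence lines containing anything other than A, C, G, T or N.
  bad=$(grep -v '^>' "$fasta" | grep -c '[^ACGTNacgtn]' || true)

  echo "records=$records duplicates=${dups:-none} bad_lines=$bad"

  # Succeed only if there is at least one record, no duplicates, no bad lines.
  [ "$records" -gt 0 ] && [ -z "$dups" ] && [ "$bad" -eq 0 ]
}

# Example call (file name taken from the buildLibrary command in this thread):
#   check_fasta v-genes.TRA.fasta
```

A non-zero exit status means the file needs another look before it is passed to buildLibrary. Note that Windows line endings will also be reported as bad lines.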
To create a TRA library, use a command similar to the one below (the --c-genes-from-fasta line is optional):
mixcr buildLibrary \
--v-genes-from-fasta v-genes.TRA.fasta \
--v-gene-feature VRegion \
--j-genes-from-fasta j-genes.TRA.fasta \
--c-genes-from-fasta c-genes.TRA.fasta \
--chain TRA \
--taxon-id 9823 \
--species pig \
pig-TRA.json.gz
Next, merge this library with the IMGT reference for MiXCR:
mixcr mergeLibrary \
pig-TRA.json.gz \
imgt.202214-2.sv8.json.gz \
my-custom-imgt.json.gz
Then use the library like this:
mixcr analyze generic-amplicon \
--library my-custom-imgt \
--species pig \
--rna \
--rigid-left-alignment-boundary \
--floating-right-alignment-boundary C \
input_R1.fastq.gz \
input_R2.fastq.gz \
output
We also have a dedicated tutorial on this topic.
Hi MiXCR team, I am planning to run a bulk TCR sequencing experiment on pig T cells. I have noticed that the IMGT reference only contains TRBC, TRBV, and TRBJ. I also came across a paper that published some TRA sequencing data (attached: ji_1801171_supplemental_table_1 (1).xlsx).
How can I use this and add it to the reference?
Thanks a lot