milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

Pig TCRa chain ref #1434

Closed Ahmedalaraby20 closed 1 year ago

Ahmedalaraby20 commented 1 year ago

Hi MiXCR team, I am planning to run a bulk TCR sequencing experiment on pig T cells. I have noticed that the IMGT reference only contains TRBC, TRBV, and TRBJ. I also came across a paper that published some TRA sequences (attached: ji_1801171_supplemental_table_1 (1).xlsx).

How can I use these sequences and add them to the reference?

Thanks a lot

mizraelson commented 1 year ago

Hi, it seems that in many cases the sequences in the table do not fully cover the V gene, and some include either a complete or a partial CDR3. To use these sequences, you should first align them to the genome and then use the recombination signal sequences (RSS) to extract the complete V and J genes. Then create FASTA files with a list of genes for each gene segment, like this:

>IGHV12-348
GATGCTGGAGTTATCCAGTCACCCCGCCATGAGGTGACAGAGATGGGACAAGAAGTGACTCTGAGATGTAAACCA
ATTTCAGGCCACAACTCCCTTTTCTGGTACAGACAGACCATGATGCGGGGACTGGAGTTGCTCATTTACTTTAAC
AACAACGTTCCGATAGATGATTCAGGGATGCCCGAGGATCGATTCTCAGCTAAGATGCCTAATGCATCATTCTCC
ACTCTGAAGATCCAGCCCTCAGAACCCAGGGACTCAGCTGTGTACTTCTGTGCCAGCAGTTTAGC
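
Before building a library from hand-assembled FASTA files, it may be worth running a quick sanity check on them. The sketch below is not part of MiXCR; `check_fasta` is a hypothetical helper that assumes plain nucleotide FASTA (A, C, G, T, N only) and reports the number of records, repeated header lines, and sequence lines containing unexpected characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a MiXCR command): basic sanity checks on a gene FASTA file.
# Prints three numbers: <records> <repeated headers> <invalid sequence lines>.
check_fasta() {
  awk '
    /^>/ {                          # header line: count it and track repeated names
      records++
      if (seen[$0]++) dups++
      next
    }
    /^[[:space:]]*$/ { next }       # ignore blank lines
    !/^[ACGTNacgtn]+$/ { bad++ }    # sequence line with unexpected characters
    END { printf "%d %d %d\n", records, dups, bad }
  ' "$1"
}
```

For example, `check_fasta v-genes.TRA.fasta` should print the expected gene count followed by two zeros if the file is clean. Lines with Windows line endings are reported as invalid, which is usually a useful warning in itself.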

To create a TRA library, use a command similar to the one below:

# The --c-genes-from-fasta line is optional.
mixcr buildLibrary \
  --v-genes-from-fasta v-genes.TRA.fasta \
  --v-gene-feature VRegion \
  --j-genes-from-fasta j-genes.TRA.fasta \
  --c-genes-from-fasta c-genes.TRA.fasta \
  --chain TRA \
  --taxon-id 9823 \
  --species pig \
  pig-TRA.json.gz

Next, merge this library with the IMGT reference library for MiXCR:

mixcr mergeLibrary \
    pig-TRA.json.gz \
    imgt.202214-2.sv8.json.gz \
    my-custom-imgt.json.gz

Then use the library like this:

mixcr analyze generic-amplicon \
    --library my-custom-imgt \
    --species pig \
    --rna \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
    input_R1.fastq.gz \
    input_R2.fastq.gz \
    output

We also have a dedicated tutorial on this topic.