milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

Questions about preset and kAligner2 #1009

Closed jeongyeonkim95 closed 1 year ago

jeongyeonkim95 commented 1 year ago

Hello, thanks for developing such a wonderful tool.

I had a few questions and would really appreciate your insight.

1) I have BCR sequencing data generated on a MiSeq, which is not covered by any of the presets. In this case, would rnaseq-full-length be the most appropriate option?

2) I am trying to find somatic hypermutation in B cells, and the documentation recommends using 'kAligner2' for highly mutable sequences. However, kAligner2 seems to be deprecated in the newer version. What would be an alternative approach to capture highly mutated sequences?

Thanks in advance

mizraelson commented 1 year ago

Hi, can you please describe your library structure/protocol in more detail? Also, kAligner2 is still available and is the best choice for BCR data. I can help you create a specific preset (every preset includes an aligner option) if you can share the details of the library structure:

jeongyeonkim95 commented 1 year ago

Hi, thanks for the quick response.

I tried the following command with no luck:

mixcr align --species hs --preset kAligner2 test.fasta test2.vdjca

which returns:

No preset with name "kAligner2". Here are supported presets with similar names:

It is a commercially available kit: Illumina MiSeq, generating 2x300 bp paired-end reads. MiSeq amplifies genomic DNA and uses sequencing by synthesis (bridge amplification), the same mechanism as HiSeq, so I guess you could call it a typical Illumina DNA sequencing prep. The reads have adapters, which can be removed by adapter trimming (I am guessing this may be a necessary pre-step).

Thanks again for providing insight into solving my issues

mizraelson commented 1 year ago

Hi, I'm sorry, I meant the kit or the protocol you used for cDNA library generation (I'm guessing you amplify BCRs using 5'RACE or multiplex), not the sequencing kit.

jeongyeonkim95 commented 1 year ago

Oh yes, it was conducted with 5'RACE

mizraelson commented 1 year ago

And the reverse primer is located in the C gene? And you don't have any UMI barcodes in the structure?

jeongyeonkim95 commented 1 year ago

Universal 5′ RACE primer IIA (Clontech), 5′-AAGCAGTGGTATCAACGCAGAG-3′, and no UMI barcode.

mizraelson commented 1 year ago

Got it, then I would recommend using the following command:

mixcr analyze generic-bcr-amplicon \
    --species hsa \
    --rna \
    --rigid-left-alignment-boundary \
    --rigid-right-alignment-boundary C \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Note that you might have to change the name of the species.

This command will execute all of the necessary steps, including alignment, assembly, and export.

Also, if you want to run mixcr align separately, you can use the preset generic-bcr-amplicon, which includes kAligner2.
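
For illustration, a standalone alignment step might look like the sketch below. It assumes that `mixcr align` accepts the same mix-in options as the `analyze` command above; the `--preset` flag is taken from the earlier attempt in this thread. The file names, including `result.vdjca`, are placeholders. The script only prints the command unless `RUN=1` is set and `mixcr` is on the PATH.

```shell
#!/bin/sh
# Sketch only: run the alignment step alone with the generic-bcr-amplicon preset.
# Assumption: `mixcr align` accepts the same mix-in flags as `mixcr analyze`.
# File names are placeholders; adjust --species to your organism.
set -eu

# Build the command as positional parameters so it can be printed or executed.
set -- mixcr align \
    --preset generic-bcr-amplicon \
    --species hsa \
    --rna \
    --rigid-left-alignment-boundary \
    --rigid-right-alignment-boundary C \
    input_R1.fastq.gz \
    input_R2.fastq.gz \
    result.vdjca

# Always show what would be run.
printf '%s ' "$@"
echo

# Execute only when explicitly requested and when mixcr is installed.
if [ "${RUN:-0}" = "1" ] && command -v mixcr >/dev/null 2>&1; then
    "$@"
fi
```

The remaining steps (assemble and export) would then be run on the resulting alignment file.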

Let me know if it helps.

jeongyeonkim95 commented 1 year ago

Thanks so much! I will give that command a try.