milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

Artificial Diversity Correction #1608

Closed bshim181 closed 5 months ago

bshim181 commented 5 months ago

Hello,

I am running the generic-amplicon preset on my TCR data with a UMI length of 12. I am getting a high percentage of artificial diversity eliminated and was wondering why this might be.

Screenshot 2024-04-02 at 5 04 05 PM

I am wondering if I specified the parameters incorrectly. This is what my read architecture looks like.

Screenshot 2024-04-02 at 5 11 43 PM

This is the command that I ran.

tag_pattern='^(UMI:N{12})(R1:)\^(R2:)'
java -Xmx60g -jar $mixcr analyze generic-amplicon-with-umi \
    --species hsa \
    --assemble-clonotypes-by CDR3 \
    --export-productive-clones-only \
    --rna \
    -f \
    --tag-pattern $tag_pattern \
    -Massemble.cloneAssemblerParameters.addReadsCountOnClustering=true \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
    ${R1_file} \
    ${R2_file} \
    ${out_dir}/${filenamewoExt}${run}/${filename_woExt}
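
As a rough illustration of what the `^(UMI:N{12})` part of the tag pattern above expresses, the sketch below splits a read into its first 12 bases (the UMI) and the remainder. It assumes the UMI sits at the very start of read 1. The read sequence is invented for the example, and this is plain shell text slicing, not how MiXCR itself parses tags.

```shell
# Hypothetical read 1 sequence, invented for illustration only.
read1='ACGTACGTACGTTTGGCCAAGGTCTC'

# First 12 bases: the part the pattern would capture as the UMI (N{12}).
umi=$(printf '%s' "$read1" | cut -c1-12)

# Everything from base 13 onward: what would remain as the read 1 payload.
payload=$(printf '%s' "$read1" | cut -c13-)

echo "UMI:     $umi"
echo "payload: $payload"
```

Running the same slicing on a few real reads from the R1 file is a quick way to check that the first 12 bases look like random barcodes rather than primer or constant sequence.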

mizraelson commented 5 months ago

Hi, the results look fine. The important thing here is that the records weight accepted is 98.68%, meaning most of the reads were preserved.