liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data
MIT License

TRUST4 on 3' results in inference where no inference should be present #167

Open rohitarorayyc opened 1 year ago

rohitarorayyc commented 1 year ago

Running TRUST4 on 3' 10x Genomics RNA-seq data results in B cell inference where no B cells are present. Any advice on fixing this issue?

mourisl commented 1 year ago

Most reads in the 3' data cover the constant gene region, so it is expected to have very low sensitivity in finding VDJ information.

mourisl commented 1 year ago

Oh, sorry, I misunderstood your question. This usually happens when plasma B cells are present, and you can use the script barcoderep-filter.py in the script folder to filter out those calls. You can run the command "python3 barcoderep-filter.py -b trust_barcode_report.tsv -a trust_annot.fa". If the filtering is not satisfactory, you can use a lower value for "--highAbund" or a higher value for "--diffuseFrac".

The phenomenon is due to the leaked mRNAs as mentioned in https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/algorithms/cell-calling
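For anyone scripting this step across several samples, here is a minimal sketch that only assembles the command line quoted above. The helper name build_filter_command and its keyword arguments are hypothetical; the script name and the flags -b, -a, --highAbund and --diffuseFrac are the ones mentioned in this thread. The thread does not give default or recommended threshold values, so none are assumed here, and the threshold flags are added only when a value is passed.

```python
import shlex


def build_filter_command(barcode_report, annot_fasta,
                         script="barcoderep-filter.py",
                         high_abund=None, diffuse_frac=None):
    """Assemble the barcoderep-filter.py call as an argument list.

    Hypothetical helper: it does not run anything, it only builds the
    command described in the thread.
    """
    cmd = ["python3", script, "-b", barcode_report, "-a", annot_fasta]
    # The threshold flags are appended only when the caller supplies a value,
    # so the script's own defaults apply otherwise.
    if high_abund is not None:
        cmd += ["--highAbund", str(high_abund)]
    if diffuse_frac is not None:
        cmd += ["--diffuseFrac", str(diffuse_frac)]
    return cmd


if __name__ == "__main__":
    cmd = build_filter_command("trust_barcode_report.tsv", "trust_annot.fa")
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run. Where the script writes its filtered report is not stated in this thread, so check its help text before redirecting or capturing output.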

rohitarorayyc commented 1 year ago

Thank you! I think that the barcoderep-filter.py would help with this dataset!