UMI is known but primers are not

ssnn-airr commented 3 years ago

Original report by Anton Kulaga (Bitbucket: [Anton Kulaga](https://bitbucket.org/Anton Kulaga)).

We are using library preparation kits from https://irepertoire.com/. Unfortunately, they declined to give us their primer sequences, so we only know the UMI. The problem is that pRESTO assumes the primers are known and makes the primer parameters mandatory. How can we work around this and mask the UMI without masking the primers (which we simply do not know)?

ssnn-airr commented 3 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).

You can skip the primer identification steps. These are largely for QC and isotype annotation. You can pull the UMI out of the sequences using MaskPrimers extract by specifying the length (--len) and start position (--start) of the UMI. For example:

MaskPrimers.py extract -s in.fastq --start 0 --len 15 --pf UMI -o out.fastq

This will put the first 15 bp in the UMI field. Or:

MaskPrimers.py extract -s in.fastq --start 15 --len 25 --bf UMI --pf PRIMER --barcode -o out.fastq

This will put the first 15 bp in the UMI field and the next 25 bp in the PRIMER field, which is useful for dual-barcode setups or if you want to guess what their primers are.
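
To make the coordinates concrete, here is a minimal Python sketch of what fixed-position extraction amounts to conceptually. It is not pRESTO's implementation; the function name `extract_fixed` and the example read are hypothetical, and the only assumption carried over from the commands above is that `--start` is a 0-based offset and `--len` is the region length.

```python
def extract_fixed(seq, start, length):
    """Split a read at a fixed position.

    Returns (upstream, region, remainder), where `region` is the
    `length` bases beginning at 0-based offset `start`.
    """
    if start < 0 or length <= 0 or start + length > len(seq):
        raise ValueError("region falls outside the read")
    upstream = seq[:start]
    region = seq[start:start + length]
    remainder = seq[start + length:]
    return upstream, region, remainder


# Hypothetical read: 15 bp UMI, 25 bp putative primer, then the insert.
read = "ACGTACGTACGTACG" + "T" * 25 + "GGGCCCAAATTT"

# Analogous to --start 0 --len 15: the region itself is the UMI.
_, umi_only, _ = extract_fixed(read, start=0, length=15)

# Analogous to --start 15 --len 25 --barcode: the bases upstream of the
# region are kept as the UMI and the region is kept as the primer.
umi, primer, insert = extract_fixed(read, start=15, length=25)
```

In the second call, the upstream slice plays the role of the UMI field and the extracted region plays the role of the PRIMER field.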

For annotating the C-region without the official primers, you can just align against the C-region references:

https://presto.readthedocs.io/en/stable/examples/primers.html

More C-region substrings, based on the TakaraBio/ClonTech protocol, are compiled here:

https://bitbucket.org/kleinstein/immcantation/src/master/protocols/Universal/
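
As a rough illustration of the idea behind matching reads against C-region substrings, the sketch below slides each reference along a read and keeps the hit with the fewest mismatches. It is a simplified mismatch scan, not the alignment pRESTO performs. The function name `best_reference`, the `max_mismatch` threshold, and the reference names and sequences are all hypothetical placeholders, not real C-region sequences.

```python
def best_reference(seq, references, max_mismatch=2):
    """Return (name, mismatches, position) of the best ungapped hit, or None.

    `references` maps a label to a short reference substring. Each
    reference is compared against every window of the read, and the
    hit with the fewest mismatches (at most `max_mismatch`) wins.
    """
    best = None
    for name, ref in references.items():
        for i in range(len(seq) - len(ref) + 1):
            window = seq[i:i + len(ref)]
            mismatches = sum(a != b for a, b in zip(window, ref))
            if mismatches <= max_mismatch and (best is None or mismatches < best[1]):
                best = (name, mismatches, i)
    return best


# Placeholder references (artificial sequences, for illustration only).
refs = {
    "C1_placeholder": "AAAACCCCGGGG",
    "C2_placeholder": "TTTTGGGGCCCC",
}

# Hypothetical read containing C2_placeholder with one substitution.
read = "ACGT" + "TTTTGGGGCACC" + "ACGT"
hit = best_reference(read, refs)
```

The label of the winning reference would then serve as the C-region annotation for that read; reads with no hit under the threshold stay unannotated.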

ssnn-airr commented 2 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).

No news is good news.

immcantation / presto

UMI is known but primers are not #85