veltenlab / CloneTracer

This repository contains scripts to identify healthy and malignant cells from scRNAseq with CloneTracer and process data from Optimized 10x libraries
MIT License

Optimized 10x - smaller read2 length #4

Status: Closed (modalaigh closed 2 days ago)

modalaigh commented 1 month ago

Hi, thanks for developing such a useful tool!

Our lab is interested in applying the Optimized 10x method to AML scRNA-seq samples we already sequenced as part of a recent publication. I ran the design_primers.R script for one of our samples following the instructions in the primer_design/README.md but got the following error during the generation of the inner primers:

Designing inner primers
Error in validObject(.Object) : 
  invalid class “TsIO” object: product_size_range too narrow to allow min_primer_range
Calls: TAPseqInput ... <Anonymous> -> new -> initialize -> initialize -> validObject
Execution halted

I had a look at the relevant source code of TAPseq which was generating the error:

if (diff(object@product_size_range) < object@min_primer_region) {
  err <- c(err, "product_size_range too narrow to allow min_primer_range")
}

From this, it appears that the default value of 100 for min_primer_region is too large for the inner primer product_size_range generated when the read2 length is 91 bp, which is the read2 length we used to sequence our samples. I believe your samples had a read2 length of 120 bp.
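
The arithmetic behind this can be checked directly. Below is a minimal sketch in base R. It assumes the inner-primer product_size_range is c(90, 90 + read_length - 15), as in the TAPseqInput call quoted further down in this issue, and that min_primer_region defaults to 100 as stated above. The helper names inner_product_range and passes_check are hypothetical and are not part of TAPseq or CloneTracer.

```r
# Assumption: the inner-primer product size range follows the formula used in
# the TAPseqInput() call quoted in this issue: c(90, 90 + read_length - 15).
inner_product_range <- function(read_length) {
  c(90, 90 + read_length - 15)
}

# Mirrors the quoted TAPseq validity check: the object is rejected when
# diff(product_size_range) < min_primer_region.
passes_check <- function(read_length, min_primer_region = 100) {
  diff(inner_product_range(read_length)) >= min_primer_region
}

for (read_length in c(91, 120)) {
  width <- diff(inner_product_range(read_length))
  cat(sprintf("read2 = %d bp: range width = %d, passes = %s\n",
              read_length, width, passes_check(read_length)))
}
# read2 = 91 bp: range width = 76, passes = FALSE
# read2 = 120 bp: range width = 105, passes = TRUE
```

Under these assumptions the range width is read_length - 15, so the check only passes with the default of 100 when read2 is at least 115 bp.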

To get around this error, I lowered the min_primer_region value during the inner primer design step:

inner_primers <- TAPseqInput(sequences,
                             product_size_range = c(90, 90+opt$read_length-15),
                             min_primer_region = diff(c(90, 90+opt$read_length-15)), 
                             primer_num_return = 5,
                             target_annot = ranges_list)

I wanted to clarify with you if this would be the correct modification to use the script with smaller read2 lengths?

Also, I'm wondering if the -r parameter of the design_primers.R script refers specifically to the read2 length from the sequencing of the targeted genotyping libraries - that is to say if we were planning on sequencing our genotyping libraries with a read2 length of 120 bp (despite the original gene expression read2 length being 91 bp) would the -r argument be 120?

sergibeneyto commented 4 days ago

Hi,

apologies for the very late reply. Thanks for your question and your interest in Optimized 10x libraries.

The -r parameter in the workflow should indicate the length of read2 when sequencing the Optimized 10x libraries. This is needed to make sure that the inner primer is not located too far away from the mutation of interest. It corresponds to read2 of the Optimized 10x sequencing and is therefore unrelated to the read2 length of your gene expression libraries. If the length of your read2 is 91 bp, just set the -r parameter to this value. Hopefully this solves the problem.

Your solution also seems appropriate to me. You can manually reduce the minimum region used to design the inner primer and hope that Primer3 (the software used under the hood by the pipeline) can still design primers in a smaller region.
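
One way to generalise this workaround is to cap min_primer_region at the width of the product size range, so that it is only lowered when the range is narrower than the default. This is a sketch under stated assumptions and not part of the CloneTracer scripts. The helper inner_primer_settings is hypothetical, the range formula is the one from the call quoted earlier in this thread, and 100 is the default reported there.

```r
# Hypothetical helper: derive inner-primer settings from the read2 length.
# Assumes product_size_range = c(90, 90 + read_length - 15), as in the
# TAPseqInput() call quoted in this issue, and a default min_primer_region of 100.
inner_primer_settings <- function(read_length, default_min_region = 100) {
  product_size_range <- c(90, 90 + read_length - 15)
  width <- diff(product_size_range)
  list(
    product_size_range = product_size_range,
    # Never ask for a primer region wider than the product size range itself.
    min_primer_region = min(default_min_region, width)
  )
}

settings <- inner_primer_settings(91)
str(settings)

# The values could then be passed on in the same way as in the quoted call:
# TAPseqInput(sequences,
#             product_size_range = settings$product_size_range,
#             min_primer_region  = settings$min_primer_region,
#             primer_num_return  = 5,
#             target_annot       = ranges_list)
```

With a read2 length of 120 bp this keeps the default of 100. With 91 bp it gives 76, the same value as the modification posted above. Whether Primer3 finds suitable primers in the narrower region is not guaranteed.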

I hope this helps.

Best, Sergi

modalaigh commented 2 days ago

Hi Sergi,

That's great, thanks for clarifying.