Closed ssmadha closed 3 years ago
Hi Shariq, this is nice. Would it be better to make it optional? This also makes me wonder why R1 is longer than barcode + UMI. Is it 5' scRNA-seq, which contains the template-switching oligos? We can discuss in today's meeting.
I'll make it optional. I believe Novogene, where we usually sequence, simply sequences 150 bp for both R1 and R2 in case we want more of R1, even with the 3' chemistry we use.
@baigal628 can you help review and test this PR?
Hi,
I am facing this same issue with long RNA barcode reads. Could you suggest a fix?
Hello,
For now, you should be able to use the following command to trim your FASTQ file so it works in the program (it assumes standard four-line FASTQ records and trims the sequence and quality lines by position):
gunzip -c <original_barcode_file> | awk -v barlen=$((<barcode_length>+<umi_length>)) 'NR%4==2 || NR%4==0 {print substr($0,1,barlen); next} {print}' | gzip -c > <trimmed_barcode_file>
If you are using 10x v3 chemistry, your barcode_length and umi_length will likely be 16 and 12, respectively.
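To check the trimming before running it on a full file, here is a minimal sketch on a made-up FASTQ record. The helper name `trim_fastq` and the toy read are assumptions for illustration only. It selects the sequence and quality lines by position (lines 2 and 4 of each record) rather than by pattern, because a quality string can legitimately start with `@`, and it assumes standard four-line records.

```shell
# Hypothetical helper: trim the sequence and quality lines of a FASTQ
# stream (read from stdin) to the first N characters.
# Assumes standard four-line FASTQ records.
trim_fastq() {
  # $1 = number of bases to keep (barcode length + UMI length)
  awk -v barlen="$1" '
    NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, barlen); next }  # sequence / quality
    { print }                                                         # header and "+" lines
  '
}

# Toy 40 bp barcode read; the quality line starts with "@" on purpose.
# With 10x v3 lengths (16 + 12), the first 28 characters are kept.
printf '@read1\n%s\n+\n%s\n' \
  "ACGTACGTACGTACGTAAAACCCCGGGGTTTTTTTTTTTT" \
  "@IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII" | trim_fastq $((16 + 12))
```

On a real file the same function can sit between `gunzip -c` and `gzip -c`, as in the one-liner above.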
Hi,
I trimmed the barcode FASTQ file as suggested and tried running again, but I am now getting a different error: terminate called after throwing an instance of 'std::out_of_range' what(): basic_string::erase
I was able to run STARsolo without any error when I set the option --soloBarcodeReadLength 0, as mentioned in the STAR release notes: https://github.com/alexdobin/STAR/releases
For now, you can add --soloBarcodeReadLength 0 to https://github.com/liulab-dfci/MAESTRO/blob/master/MAESTRO/Snakemake/scRNA/Snakefile#L48 after you initialize the Snakefile. We will fix it in the next release.
This fixes the case where reads in the scRNA barcode FASTQ file are longer than barcode + UMI, which raises an error in STARsolo. The fix simply trims these reads to the length of barcode + UMI.
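For anyone running STAR directly rather than through MAESTRO, the workaround might look roughly like the sketch below. This is not the MAESTRO Snakefile rule. The wrapper name `run_starsolo` and all paths are hypothetical placeholders, and the barcode and UMI lengths assume 10x v3 chemistry. Setting `--soloBarcodeReadLength 0` tells STAR not to check the barcode read length.

```shell
# Hypothetical wrapper around a STARsolo call; every path is a placeholder.
# Assumes 10x v3 chemistry (16 bp cell barcode, 12 bp UMI).
run_starsolo() {
  # $1 = STAR genome index directory
  # $2 = cDNA read FASTQ (R2), $3 = barcode read FASTQ (R1)
  # $4 = cell barcode whitelist
  STAR --genomeDir "$1" \
       --readFilesIn "$2" "$3" \
       --readFilesCommand zcat \
       --soloType CB_UMI_Simple \
       --soloCBwhitelist "$4" \
       --soloCBstart 1 --soloCBlen 16 \
       --soloUMIstart 17 --soloUMIlen 12 \
       --soloBarcodeReadLength 0   # 0 = do not check the barcode read length
}
```

The function is only defined here; call it with your own index, FASTQ files, and whitelist once STAR is installed.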