Closed yeredh closed 4 years ago
My guess is that the read IDs do not match one-to-one between read 1 and read 2. You should check the FASTQ files.
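One way to run that check is a small script that walks both files in parallel and reports the first record whose IDs disagree. This is only a minimal sketch, not part of UMI-tools: the function names are made up for illustration, and it assumes standard four-line FASTQ records. The optional `ignore_suffix` flag assumes the only difference between mates is a trailing `/1` or `/2` on the read name.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID (first token of each header line, without '@')."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            # Assumes 4-line FASTQ records: header, sequence, '+', quality.
            if i % 4 == 0 and line.strip():
                yield line[1:].split()[0]


def strip_pair_suffix(read_id):
    """Drop a trailing '/1' or '/2' mate suffix, if present."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


def first_mismatch(path1, path2, ignore_suffix=False):
    """Return (record_number, id1, id2) for the first disagreement, else None.

    A missing ID (None) means one file has fewer records than the other.
    """
    pairs = zip_longest(read_ids(path1), read_ids(path2))
    for n, (id1, id2) in enumerate(pairs, start=1):
        if id1 is None or id2 is None:
            return (n, id1, id2)
        key1, key2 = id1, id2
        if ignore_suffix:
            key1, key2 = strip_pair_suffix(id1), strip_pair_suffix(id2)
        if key1 != key2:
            return (n, id1, id2)
    return None
```

For example, `first_mismatch("R1.fastq.gz", "R2.fastq.gz")` (placeholder file names) returns `None` when every pair of IDs agrees. If it reports a mismatch at record 1 that disappears with `ignore_suffix=True`, the reads are probably in the same order and only the mate suffix differs.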
Hi Yered,
So basically you are doing it right, and the problem comes from the files. UMI-tools was designed to process FASTQ files produced by Illumina devices. The files you mention were generated by a BGI machine, so the headers are a bit different. This is problematic but can be solved. First, you need to install a specific version of UMI-tools: https://github.com/CGATOxford/UMI-tools/tree/%7BTS%7D-IgnoreReadPairSuffix. You then need to modify the extract line as described here: https://github.com/CGATOxford/UMI-tools/issues/325, and it should do the job!
Hope this helps,
Best
Pierre
Thank you Pierre for your prompt reply!
Hello,
I downloaded the FASTQ files for sample GSM4339771 (SRR11181956) from SRA in the original format from https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR11181956
So I end up with two files
I was able to identify the cell barcodes with
umi_tools
However, when I tried the next step, extracting the barcodes and UMIs and adding them to the read names,
I get the following error message
What am I doing wrong?
Best,
Yered