t-neumann / slamdunk

Streamlining SLAM-seq analysis with ultra-high sensitivity
GNU Affero General Public License v3.0

My slam-seq sequencing data only has 60-70% reads mapped to the genome #150

Closed: bioinformatica closed this issue 5 months ago

bioinformatica commented 5 months ago

Dear Tobias:

Only 60-70% of the reads in my SLAM-seq sequencing data map to the genome. Is this normal, and how can I improve it? I saw that your published work reports 91%.

My command line and log are as follows:

slamdunk all -r /Data/hg38.fa -t 60 -n 100 -b /Data/3utr-ucsc-v45.bed -o /Data/dunk /Data/*_1_R1.fq.gz

[MAIN] Done (3275776 reads mapped (67.49%), 1577827 reads not mapped, 5727857 lines written)

Best
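For anyone comparing their own runs against the figures quoted in this thread, the percentage in the summary line can be recomputed from the two read counts it reports. The following is a minimal sketch, assuming the summary line always has the shape shown above; the function name `mapping_rate` and the regular expression are illustrative and not part of slamdunk.

```python
import re

# Summary line copied verbatim from the log above.
LOG_LINE = ("[MAIN] Done (3275776 reads mapped (67.49%), "
            "1577827 reads not mapped, 5727857 lines written)")


def mapping_rate(log_line):
    """Parse a '[MAIN] Done' summary line (hypothetical helper).

    Returns (mapped, unmapped, percent_mapped), where percent_mapped is
    recomputed as mapped / (mapped + unmapped) rather than read from the log.
    """
    match = re.search(
        r"(\d+) reads mapped \(([\d.]+)%\), (\d+) reads not mapped", log_line
    )
    if match is None:
        raise ValueError("line does not look like a mapping summary")
    mapped = int(match.group(1))
    unmapped = int(match.group(3))
    percent = 100.0 * mapped / (mapped + unmapped)
    return mapped, unmapped, percent


mapped, unmapped, percent = mapping_rate(LOG_LINE)
print(f"{mapped} of {mapped + unmapped} reads mapped ({percent:.2f}%)")
```

The recomputed value agrees with the 67.49% in the log, so the reported rate is simply mapped reads over mapped plus unmapped reads.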

t-neumann commented 5 months ago

No, we typically have >90% mapped, as you point out.

bioinformatica commented 5 months ago

Was the sample contaminated, or is this a data-processing issue? I'm using only one of the two FASTQ files from a paired-end sequencing run. I received the raw data from the sequencing company. Do I need to preprocess the input FASTQ files for "slamdunk all" myself, for example with fastp filtering or adapter trimming?

bioinformatica commented 5 months ago

I tried feeding the raw data directly into "slamdunk all", and the fraction of mapped reads increased to 80%, but that is still well below 90%.

t-neumann commented 5 months ago

Yes, we typically run cutadapt, but other than that you should be good to go. 80% doesn't sound too bad.
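The thread does not say which cutadapt settings were used, so the following is only a rough sketch of what an adapter-trimming step before "slamdunk all" might look like. The adapter sequence, the quality and length cutoffs, and the file names are all assumptions chosen for illustration; the right adapter depends on the library preparation kit. The helper only builds the command and does not run it.

```python
import shlex


def cutadapt_command(fastq_in, fastq_out, adapter,
                     quality_cutoff=20, min_length=20):
    """Build a single-end cutadapt command line (hypothetical helper).

    Uses standard cutadapt options:
      -a  3' adapter sequence to remove
      -q  quality cutoff for trimming low-quality ends
      -m  discard reads shorter than this after trimming
      -o  output file
    """
    return [
        "cutadapt",
        "-a", adapter,
        "-q", str(quality_cutoff),
        "-m", str(min_length),
        "-o", fastq_out,
        fastq_in,
    ]


# ASSUMPTION: placeholder adapter (common Illumina adapter prefix) and
# hypothetical file names; neither comes from this thread.
cmd = cutadapt_command(
    "sample_1_R1.fq.gz",
    "sample_1_R1.trimmed.fq.gz",
    adapter="AGATCGGAAGAGC",
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

The trimmed file would then be passed to "slamdunk all" in place of the raw FASTQ file.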