dieterich-lab / DCC

DCC uses output from the STAR read mapper to systematically detect back-splice junctions in next-generation sequencing data. DCC applies a series of filters and integrates data across replicate sets to arrive at a precise list of circRNA candidates.
https://dieterichlab.org/software/
GNU General Public License v3.0

Problem with BAM file #96

Closed m-kouhsar closed 2 years ago

m-kouhsar commented 2 years ago

Hi, I tried to run DCC on one paired-end sample. First I ran STAR to align both the R1 and R2 reads together with this command:

STAR --runThreadN 10 --genomeDir /genome_index --outSAMtype BAM SortedByCoordinate --readFilesIn /R1/S214_R1_001_fastp.fastq.gz /R2/S214_R2_001_fastp.fastq.gz --readFilesCommand zcat --outReadsUnmapped Fastx --outSJfilterOverhangMin 15 15 15 15 --alignSJoverhangMin 15 --alignSJDBoverhangMin 15 --seedSearchStartLmax 30 --outFilterMultimapNmax 20 --outFilterScoreMin 1 --outFilterMatchNmin 1 --outFilterMismatchNmax 2 --chimSegmentMin 15 --chimScoreMin 15 --chimScoreSeparation 10 --chimJunctionOverhangMin 15 --genomeLoad LoadAndKeep --limitBAMsortRAM 50000000000

Then I ran STAR separately for mate1 and mate2:

STAR --runThreadN 10 --genomeDir /genome_index --outSAMtype None --readFilesIn /R1/S214_R1_001_fastp.fastq.gz --readFilesCommand zcat --outReadsUnmapped Fastx --outSJfilterOverhangMin 15 15 15 15 --alignSJoverhangMin 15 --alignSJDBoverhangMin 15 --seedSearchStartLmax 30 --outFilterMultimapNmax 20 --outFilterScoreMin 1 --outFilterMatchNmin 1 --outFilterMismatchNmax 2 --chimSegmentMin 15 --chimScoreMin 15 --chimScoreSeparation 10 --chimJunctionOverhangMin 15 --genomeLoad LoadAndKeep --limitBAMsortRAM 50000000000

STAR --runThreadN 10 --genomeDir /genome_index --outSAMtype None --readFilesIn /R2/S214_R2_001_fastp.fastq.gz --readFilesCommand zcat --outReadsUnmapped Fastx --outSJfilterOverhangMin 15 15 15 15 --alignSJoverhangMin 15 --alignSJDBoverhangMin 15 --seedSearchStartLmax 30 --outFilterMultimapNmax 20 --outFilterScoreMin 1 --outFilterMatchNmin 1 --outFilterMismatchNmax 2 --chimSegmentMin 15 --chimScoreMin 15 --chimScoreSeparation 10 --chimJunctionOverhangMin 15 --genomeLoad LoadAndKeep --limitBAMsortRAM 50000000000

The alignment completed without any errors. Finally, I ran DCC with this command:

DCC @samplesheet -mt1 @mate1 -mt2 @mate2 -D -R /DCC/combine_repeat.gtf -an /genome_anno/gencode.v38.primary_assembly.annotation.gff3 -Pi -F -M -Nr 1 1 -fg -G -A /genome_fasta/GRCh38.primary_assembly.genome.fa -B @bam_files

@bam_files is a file that contains the path to Aligned.sortedByCoord.out.bam.

During the DCC run, the following error occurred:

BAM file Aligned.sortedByCoord.out.bam has no index (Aligned.sortedByCoord.out.bam.bai is missing)
The following BAM files seem to be not sorted by coordinate or are missing an index: Aligned.sortedByCoord.out.bam
Error: not all BAM files are sorted by coordinate or are missing indices

Can you please help me to solve the problem?

tjakobi commented 2 years ago

Hi @m-kouhsar,

Did you run samtools index filename.bam for each of the BAM files produced by STAR?

Tobias

m-kouhsar commented 2 years ago

No, I hadn't run samtools! I have now run samtools to create the .bai file and then ran DCC again. The problem is solved. Thank you @tjakobi
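
For anyone who hits the same error, the fix discussed above can be sketched as a small shell helper. It reads a BAM list file (one path per line, like the bam_files file passed to -B), reports which BAMs lack a .bam.bai index, and runs samtools index on those. The function names and the list file name are illustrative assumptions, not part of DCC or samtools; the only external command used is samtools index.

```shell
#!/usr/bin/env bash
# Sketch only: index every BAM named in a list file (one path per line).
# "missing_index" and "index_missing" are hypothetical helper names.

# Print each listed BAM that has no <file>.bam.bai next to it.
missing_index() {
    local list="$1" bam
    while IFS= read -r bam || [ -n "$bam" ]; do
        [ -z "$bam" ] && continue            # skip blank lines
        if [ ! -e "${bam}.bai" ]; then       # the index name DCC reported as missing
            printf '%s\n' "$bam"
        fi
    done < "$list"
}

# Run samtools index on every listed BAM that still lacks an index.
index_missing() {
    local list="$1" bam
    missing_index "$list" | while IFS= read -r bam; do
        samtools index "$bam"                # writes ${bam}.bai by default
    done
}
```

Calling index_missing bam_files before starting DCC should leave an Aligned.sortedByCoord.out.bam.bai next to each BAM. This assumes the BAMs are already coordinate-sorted, which is the case when STAR is run with --outSAMtype BAM SortedByCoordinate as in the first command above.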