Open sunshine1126 opened 3 years ago
Hello! I also ran
java -jar picard.jar MarkDuplicates
directly on the possorted_bam.bam file from cellranger but did not have this issue. Could you share the code you used to run MarkDuplicates? Thanks
Hello! @seasoncloud, thanks for your reply. My code is as follows.
# install gatk (version 4.2.2.0)
conda create -n GATK4 gatk4
conda activate GATK4
inpath=~/data/scATAC-seq/analysis/cellrangeratac_count_results/G1/outs
out_path=~/data/scATAC-seq/analysis/mutation/hg38
sample_name=G1
# Mark and remove duplicates
echo "start MarkDuplicates for ${sample_name}"
gatk --java-options "-Xmx128G" MarkDuplicates \
    -I "${inpath}/possorted_bam.bam" \
    -M "${out_path}/${sample_name}.possorted_rmdup_marked_dup_metics.txt" \
    --VALIDATION_STRINGENCY SILENT \
    --REMOVE_DUPLICATES true \
    -O "${out_path}/${sample_name}.possorted_rmdup.bam"
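One thing worth checking in a multi-line command like this is the line continuation. If a backslash directly follows a value with no space before it, and the next line starts at column 0, the shell joins the two tokens into one argument. The sketch below only illustrates that shell behavior and is not a diagnosis of the error reported here. `count_args` is a hypothetical helper and `out.bam` is a placeholder name.

```shell
# Hypothetical helper: print how many arguments the shell passed in.
count_args() { echo $#; }

# No space before the backslash: "true" and "-O" are glued into "true-O".
glued=$(count_args --REMOVE_DUPLICATES true\
-O out.bam)

# Space before the backslash: the tokens stay separate.
spaced=$(count_args --REMOVE_DUPLICATES true \
-O out.bam)

echo "glued=${glued} spaced=${spaced}"
```

If the two counts differ, the first form is passing a merged argument to the tool.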
In addition, I have another question about the FASTA file to use with gatk HaplotypeCaller. I get an error if I use the file from refdata-cellranger-arc-GRCh38-2020-A-2.0.0/fasta/genome.fa. Should I use the reference from the GATK website instead? Thanks again!
I'm not sure whether this is the cause of the MarkDuplicates issue, but I ran
java -jar picard.jar MarkDuplicates
rather than the gatk version.
You could check more details here: https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-
For HaplotypeCaller, it's best to use the same .fa file (the same reference version) that was used for the alignment that produced the BAM file. I think a different version of the reference genome (hg19) was used when you did the alignment.
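A rough way to check whether a BAM file and a FASTA file agree is to compare the contig names and lengths in the BAM header with those in the FASTA index. The sketch below assumes both have already been dumped to text with `samtools view -H` and `samtools faidx`. `compare_contigs` is a hypothetical helper and the file names are placeholders, none of which come from this thread.

```shell
# Assumed preparation (requires samtools, not run here):
#   samtools view -H possorted_bam.bam > bam_header.txt
#   samtools faidx genome.fa            # writes genome.fa.fai

# Hypothetical helper: compare @SQ lines in a BAM header dump against a .fai index.
# Prints OK if every contig matches, otherwise one line per problem.
compare_contigs() {
    header_file=$1   # text dump of the BAM header
    fai_file=$2      # FASTA index: name<TAB>length<TAB>...
    awk -F'\t' '
        NR == FNR { fai[$1] = $2; next }      # first file: load .fai lengths
        $1 == "@SQ" {
            n++
            name = ""; len = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len = substr($i, 4)
            }
            if (!(name in fai)) {
                print "MISSING\t" name; bad = 1
            } else if (fai[name] != len) {
                print "LENGTH_MISMATCH\t" name "\t" len "\t" fai[name]; bad = 1
            }
        }
        END {
            if (n == 0) { print "NO_SQ_LINES"; bad = 1 }
            if (!bad) print "OK"
            exit (bad ? 1 : 0)
        }
    ' "$fai_file" "$header_file"
}

# Usage: compare_contigs bam_header.txt genome.fa.fai
```

A `MISSING` line for names such as `1` versus `chr1`, or a `LENGTH_MISMATCH` line, would suggest the BAM was aligned against a different reference build than the FASTA passed to HaplotypeCaller.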
Hello, I get an error when I run gatk MarkDuplicates on the possorted_bam.bam produced by cellranger-atac count. The error still appears after sorting possorted_bam.bam with samtools. Can you help me resolve this problem? Thanks