single-cell-genetics / cellsnp-lite

Efficient genotyping bi-allelic SNPs on single cells
https://cellsnp-lite.readthedocs.io
Apache License 2.0

How to detect all scaffolds (instead of chromosomes) in a bam file #132

Open BrunaLuz opened 1 month ago

BrunaLuz commented 1 month ago

I am working with a non-model organism whose genome was assembled only to the scaffold level. How can I automatically pass all scaffold names from my BAM file to the --chrom option? Thank you!

hxj5 commented 1 month ago

Hi, the --chrom option is used to pile up reads from BAM files by matching the input "chrom" names against the RNAME field (the third column) of each BAM (alignment) record. It should work in your case as long as the scaffold names stored in the RNAME field of the BAM records are the same names you pass to the --chrom option.

For now, the --chrom option cannot automatically extract all chrom/scaffold names from BAM files; the names have to be specified manually. This should be straightforward: if your BAM file has been indexed (i.e., the ".bai" file exists), the list of chrom/scaffold names can be obtained with samtools idxstats <BAM file> | cut -f1 | tr '\n' ',' (this needs further checking).
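
As a rough sketch of the idea above (not something cellsnp-lite provides itself), the idxstats output can be cleaned up a little before it is passed to --chrom. Two details of the one-liner are worth handling: samtools idxstats ends with a "*" row that counts unmapped reads and is not a real scaffold, and tr leaves a trailing comma. The function name idxstats_to_chrom_list and the file name in.bam are made up for illustration, and the remaining cellsnp-lite options are left out.

```shell
# Turn `samtools idxstats` output (read from stdin) into a comma-separated
# list of reference names. The function name is hypothetical.
idxstats_to_chrom_list() {
    # Column 1 holds the reference (scaffold) name. The final "*" row counts
    # unmapped reads, so it is skipped. Names are joined with commas and no
    # trailing comma is written.
    awk -F'\t' '$1 != "*" { printf "%s%s", sep, $1; sep = "," }
                END { if (sep) print "" }'
}

# Usage (assumes samtools is installed and in.bam has a .bai index):
#   chroms=$(samtools idxstats in.bam | idxstats_to_chrom_list)
#   cellsnp-lite <other options> --chrom "$chroms"
```

With thousands of scaffolds the resulting string can get long, so it may be worth checking it with echo "$chroms" | tr ',' '\n' | wc -l before starting a run.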