single-cell-genetics / cellsnp-lite

Efficient genotyping bi-allelic SNPs on single cells
https://cellsnp-lite.readthedocs.io
Apache License 2.0

10x scATAC-seq like bam file, No reads captured with "--UMItag None" #104

Closed jiehuichen closed 9 months ago

jiehuichen commented 9 months ago

Dear cellsnp-lite team,

I have some scATAC-seq-like BAM files. As shown below, their structure differs from that of 10x BAM files.

VL00465:23:AACJJH3M5:1:2404:0:1723012_R1.038,R2.079,R3.035,P1.95 99 chr1 189040 23 51M = 189445 456 TGGTGCCATCCAGGGGGCCTCTACAAGGATAATCTGACCTGCAGGGTCGAG CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC AS:i:0 RG:Z:R1.038,R2.079,R3.035,P1.95

I used the commands below to detect SNPs, but no reads were captured, even with "--UMItag None" added.

BAM=ATAC-like.bam
BARCODE=filtered_barcodes.tsv
REGION_VCF=genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf

cellsnp-lite -s $BAM -b $BARCODE -O ./filtered_bam -R $REGION_VCF -p 20 --genotype --UMItag None --cellTAG RG

Did I do something wrong?

All the related files are here.

Thanks for your help.

Best,

hxj5 commented 9 months ago

Hi, I do not have access to the shared Google Drive files. The barcodes in the -b file and in the --cellTAG tag must match exactly, but the barcodes in the RG tag appear to be a combination of several group names. You may want to double-check whether the two sets of barcodes mismatch.
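
One way to run that check from the command line is sketched below. It is only an illustration, not part of cellsnp-lite: the helper names `sam_tag_values` and `missing_barcodes` are hypothetical, the tag is assumed to be stored as a string (`RG:Z:...`), and the commented usage assumes samtools is installed.

```shell
# Hypothetical helper: print the unique values of one string-typed SAM tag
# (e.g. RG), reading SAM-format text from stdin.
sam_tag_values() {
    awk -F'\t' -v prefix="$1:Z:" '
        {
            # Optional tags start at column 12 of a SAM record.
            for (i = 12; i <= NF; i++) {
                if (index($i, prefix) == 1) {
                    print substr($i, length(prefix) + 1)
                    break
                }
            }
        }' | sort -u
}

# Hypothetical helper: print every barcode in the list file ($1) that does
# not appear, as an exact whole-line match, in the tag-value file ($2).
missing_barcodes() {
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next } !($0 in seen)' "$2" "$1"
}

# Example usage (file names taken from the commands in this issue):
#   samtools view ATAC-like.bam | sam_tag_values RG > rg_values.txt
#   missing_barcodes filtered_barcodes.tsv rg_values.txt | head
```

If `missing_barcodes` prints nothing, every barcode in the list occurs at least once as a tag value, so a barcode mismatch is unlikely to be the cause.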

jiehuichen commented 9 months ago

Thanks Xianjie, I compared the barcodes with Venny 2.0 and found no mismatches.

I've re-uploaded the data; here is the link.

Thanks for your great help.

hxj5 commented 9 months ago

Hi, it seems the hg38 VCF file was truncated: it contains only 1072 SNPs, all on chr1. When I run cellsnp-lite (v1.2.3) on the whole SNP list (with default settings), around 100 SNPs are genotyped. You may want to check your VCF file or re-download it from this page.
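
A quick way to spot this kind of truncation is to count the records per chromosome. The sketch below is only an illustration: the helper name `vcf_records_per_chrom` is hypothetical, and it assumes uncompressed VCF text on stdin.

```shell
# Hypothetical helper: count variant records per chromosome in uncompressed
# VCF text read from stdin. Header lines (starting with '#') are skipped.
vcf_records_per_chrom() {
    awk -F'\t' '
        /^#/    { next }                 # skip meta-information and header lines
        NF == 0 { next }                 # skip blank lines
        {
            if (!($1 in n)) order[++k] = $1   # remember first-seen order
            n[$1]++
        }
        END {
            for (i = 1; i <= k; i++) printf "%s\t%d\n", order[i], n[order[i]]
        }'
}

# Example usage (file name taken from the commands in this issue):
#   vcf_records_per_chrom < genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf
# For a gzip-compressed file:
#   gzip -dc some_file.vcf.gz | vcf_records_per_chrom
```

A complete chr1-to-X SNP list should show a row for each chromosome; a single short chr1 row suggests an incomplete file. For a gzip-compressed download, `gzip -t` can additionally confirm that the archive itself is intact.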

jiehuichen commented 9 months ago

Thanks Xianjie, I have resolved this issue with your great help. I simply replaced the VCF file with 'genome1K.phase3.SNP_AF5e4.chr1toX.hg38.vcf.gz' from this page.