single-cell-genetics / cellsnp-lite

Efficient genotyping bi-allelic SNPs on single cells
https://cellsnp-lite.readthedocs.io
Apache License 2.0

Genotyping ATAC from WES vcf #86

Closed ollieeknight closed 1 year ago

ollieeknight commented 1 year ago

Heya,

I'm trying to genotype around 3.5k cells from a 10x multiome run using the ATAC portion of the experiment, with a paired filtered WES vcf file containing around 100 variants.

cellsnp-lite -s multi/data/s1/outs/atac_possorted_bam.bam -b multi/data/s1/outs/filtered_feature_bc_matrix/barcodes.tsv.gz -O genotype -R wes_outs/mutect2/annotation/mutect2/s452_vs_si452/output.vcf -p 32 --minMAF 0 --minCOUNT 1 --UMItag None --gzip

It completes almost instantly, the genotype folder is only 512 KB, and when I load the AD matrix into R it is all zeros. Do you have any advice to help me get this going?

Thanks a million

Ollie

hxj5 commented 1 year ago

Hi Ollie,

Are any SNPs output, i.e., does cellSNP.base.vcf.gz contain any SNPs, or does the DP matrix have any non-zero values? Could you also share more detailed logging information? I just want to check whether cellsnp-lite ran correctly.

Xianjie

ollieeknight commented 1 year ago

yep, apologies! genotype.zip

hxj5 commented 1 year ago

Thanks for sharing the folder. The VCF file contains 64 SNPs, but every matrix reports zero SNPs, so something went wrong during the run. Please change the output directory and re-run the command. It would also help if you could save the logging output and share it here.
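The check described above (SNP count in the VCF versus SNP count in each matrix) can be sketched in a few lines of Python. This is only a rough illustration, not part of cellsnp-lite: the helper names `count_vcf_records` and `summarize_mtx` are made up here, and it assumes the matrices are Matrix Market coordinate files with SNPs as rows and cells as columns.

```python
import gzip
from pathlib import Path


def _open_text(path):
    """Open a plain or gzip/BGZF-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_vcf_records(vcf_path):
    """Count the non-header, non-empty lines (variant records) in a VCF."""
    with _open_text(vcf_path) as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.startswith("#")
        )


def summarize_mtx(mtx_path):
    """Return (n_rows, n_cols, n_nonzero) for a Matrix Market coordinate file.

    n_nonzero is counted from the entry lines themselves (entries whose
    stored value is not zero) rather than taken from the size line.
    """
    n_rows = n_cols = None
    n_nonzero = 0
    with _open_text(mtx_path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("%"):
                continue  # skip the banner and comment lines
            fields = line.split()
            if n_rows is None:
                # First non-comment line is the size line: rows cols entries
                n_rows, n_cols = int(fields[0]), int(fields[1])
            elif float(fields[2]) != 0:
                n_nonzero += 1
    if n_rows is None:
        raise ValueError(f"no size line found in {mtx_path}")
    return n_rows, n_cols, n_nonzero
```

For a healthy run, `count_vcf_records("genotype/cellSNP.base.vcf.gz")` should equal the row count returned by `summarize_mtx` for each matrix, and the AD and DP matrices should have a non-zero entry count above zero. The matrix file names (for example `genotype/cellSNP.tag.AD.mtx`) are assumed here, so adjust them to whatever is actually in the output folder.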

ollieeknight commented 1 year ago

From running the same command I initially posted:

Out log
```
[I::main] start time: 2023-02-27 11:56:26
[W::check_args] Max depth set to maximum value (2147483647)
[I::main] loading the VCF file for given SNPs ...
[I::main] fetching 98 candidate variants ...
[I::main] mode 1a: fetch given SNPs in 3867 single cells.
[W::hts_idx_load3] The index file is older than the data file: multi/data/s1/outs/atac_possorted_bam.bam.bai
[I::csp_fetch_core][Thread-2] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-4] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-4] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-20] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-14] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-10] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-31] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-26] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-14] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-7] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-11] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-7] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-2] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-16] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-28] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-28] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-24] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-16] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-27] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-27] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-9] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-3] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-20] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-18] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-31] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-25] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-25] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-24] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-8] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-5] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-10] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-19] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-9] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-17] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-30] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-30] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-19] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-22] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-12] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-23] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-18] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-11] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-22] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-8] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-1] 25.00% SNPs processed.
[I::csp_fetch_core][Thread-1] 50.00% SNPs processed.
[I::csp_fetch_core][Thread-3] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-21] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-5] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-12] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-23] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-21] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-15] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-26] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-17] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-0] 25.00% SNPs processed.
[I::csp_fetch_core][Thread-29] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-13] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-6] 33.33% SNPs processed.
[I::csp_fetch_core][Thread-1] 75.00% SNPs processed.
[I::csp_fetch_core][Thread-15] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-29] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-13] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-0] 50.00% SNPs processed.
[I::csp_fetch_core][Thread-6] 66.67% SNPs processed.
[I::csp_fetch_core][Thread-0] 75.00% SNPs processed.
[I::main] All Done!
[I::main] end time: 2023-02-27 11:56:27
[I::main] time spent: 1 seconds.
```

genotype.zip

hxj5 commented 1 year ago

Thanks for the detailed information. The program seems to have worked this time: the VCF file and the three matrices all contain 64 SNPs, as expected, and the AD matrix has some non-zero values.