hxj5 / xcltk

Toolkit for XClone preprocessing to detect CNVs from scRNA-seq data
Apache License 2.0

VCF issue? #9

Open cnk113 opened 2 months ago

cnk113 commented 2 months ago

Hi, I'm running xcltk baf with the provided VCF, but it looks like it is failing to parse?

(venv) chang@nidus:/media/sdl/chang/jd/GW14_TS$ xcltk baf --label GW14_TS --sam trimmed_Aligned.sortedByCoord.out.bam --barcode pipseeker/filtered_matrix/sensitivity_5/barcodes.tsv --snpvcf ~/genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf.gz --region ~/annotate_genes_hg38_update_20230126.txt --outdir xctlk_baf --gmap ~/Eagle_v2.4.1/tables/genetic_map_hg38_withX.txt.gz --eagle ~/Eagle_v2.4.1/eagle --paneldir ~/1000G_hg38/ --ncores 24
[I::pipeline::pipeline_wrapper] xcltk BAF preprocessing starts ...
[I::pipeline::pipeline_wrapper] check args ...
[I::pipeline::pipeline_wrapper] run in 'droplet' mode (genome version 'hg38')
[I::pipeline::pipeline_wrapper] start pileup ...
[I::pipeline::pipeline_wrapper] pileup VCF is 'xctlk_baf/pileup/cellSNP.base.vcf.gz'.
[I::pipeline::pipeline_wrapper] prepare VCF files for phasing ...
[I::pipeline::pipeline_wrapper] add genotypes ...
Traceback (most recent call last):
  File "/home/chang/miniconda3/envs/venv/bin/xcltk", line 8, in <module>
    sys.exit(main())
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/xcltk/xcltk.py", line 49, in main
    elif command == "baf": baf_baf(sys.argv)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/xcltk/baf/pipeline.py", line 118, in pipeline_main
    ret = pipeline_wrapper(
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/xcltk/baf/pipeline.py", line 234, in pipeline_wrapper
    vcf_add_genotype(
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/xcltk/baf/genotype.py", line 365, in vcf_add_genotype
    variants, header = vcf_load(in_fn)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/xcltk/utils/vcf.py", line 62, in vcf_load
    variants = pd.read_csv(fn, sep = "\t", header = None, comment = "#",
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 605, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1442, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1753, in _make_engine
    return mapping[engine](f, **self.options)
  File "/home/chang/miniconda3/envs/venv/lib/python3.8/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 79, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 554, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
hxj5 commented 2 months ago

Hi, it seems that the pileup SNP VCF file, "xctlk_baf/pileup/cellSNP.base.vcf.gz", is empty. Please check that file. If it is indeed empty, please refer to the related cellsnp-lite question "Why the output files are empty, no SNPs genotyped?" and update the xcltk command line and inputs accordingly.

cnk113 commented 2 months ago

Well, I'm using genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf.gz, which I downloaded from the link you provide in your tutorial, and it has columns.

hxj5 commented 2 months ago

Hi, sorry for the confusion. I meant that you should check whether the file xctlk_baf/pileup/cellSNP.base.vcf.gz (which is also a VCF file) is empty or contains no SNPs. If it is, please check and update the command-line arguments and inputs based on the question "Why the output files are empty, no SNPs genotyped?"
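
One way to run that check is a minimal Python sketch along these lines. It is not part of xcltk; the helper name `count_vcf_records` is hypothetical, and it assumes the pileup output is a plain or gzip/bgzip-compressed text VCF in which header lines start with `#`. A result of 0 means the file has no SNP records, which would be consistent with the `EmptyDataError` in the traceback above.

```python
import gzip


def count_vcf_records(path, limit=None):
    """Count non-header, non-blank lines in a plain or gzipped VCF.

    Hypothetical helper for illustration only; not an xcltk function.
    If `limit` is given, stop counting once that many records are seen.
    """
    # Python's gzip module also reads bgzip output (concatenated gzip members).
    opener = gzip.open if path.endswith(".gz") else open
    n = 0
    with opener(path, "rt") as fh:
        for line in fh:
            # Skip VCF meta/header lines ("##..." and "#CHROM...") and blanks.
            if line.startswith("#") or not line.strip():
                continue
            n += 1
            if limit is not None and n >= limit:
                break
    return n
```

For example, `count_vcf_records("xctlk_baf/pileup/cellSNP.base.vcf.gz", limit=1)` returns 0 if the pileup produced no SNPs and 1 as soon as a single record is found, so it stays fast even on a large file.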