a-xavier / tapes

TAPES : a Tool for Assessment and Prioritisation in Exome Studies

Error when sorting VEP HG37 vcf file #25

Open GACGAMA opened 1 year ago

GACGAMA commented 1 year ago

Hi! I'm trying to use TAPES on a VEP-annotated multisample VCF file. I used VEP 109.

python3 /home/ggama1/programs/tapes/tapes.py sort -i /data/scratch/ggama1/vcf_3777_tapestest/M_Valle_20230522_3777samples.10klines.vep.vcf -o /data/scratch/ggama1/vcf_3777_tapestest/tapes/multisample_hg37/ --tab --by_gene --by_sample

Which gives:

No acmg_db path given and no db_config.json found
Default is: /home/ggama1/programs/tapes/acmg_db

        ***TAPES: SORT***

2023-08-31 16:08:23.....Output type: FOLDER
2023-08-31 16:08:23.....VEP vcf processing Traceback (most recent call last):
  File "/home/ggama1/programs/tapes/tapes.py", line 355, in <module>
  File "/home/ggama1/programs/tapes/tapes.py", line 205, in main
    full_stuff, soft_used = tf.open_csv_file(file_path, acmg_db_path)    # Load annotated csv in pandas
  File "/home/ggama1/programs/tapes/src/t_func.py", line 70, in open_csv_file
    dataframe = vp.vep_process_vcf(csv_file, acmg_db_path)
  File "/home/ggama1/programs/tapes/src/vep_process.py", line 43, in vep_process_vcf
    df['CHROM'] = df['CHROM'].str.replace('chr', '')
  File "/home/ggama1/.local/lib/python3.7/site-packages/pandas/core/generic.py", line 5458, in __getattr__
    return object.__getattribute__(self, name)
  File "/home/ggama1/.local/lib/python3.7/site-packages/pandas/core/accessor.py", line 180, in __get__
    accessor_obj = self._accessor(obj)
  File "/home/ggama1/.local/lib/python3.7/site-packages/pandas/core/strings/accessor.py", line 154, in __init__
    self._inferred_dtype = self._validate(data)
  File "/home/ggama1/.local/lib/python3.7/site-packages/pandas/core/strings/accessor.py", line 217, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!

Sorting a single-sample VEP-annotated VCF, based on hg38, gives no error.
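
A minimal sketch of what might be going on, assuming the hg37 file has chromosome names without the 'chr' prefix (1, 2, ... 22) in the rows that were loaded, so pandas infers the CHROM column as integers rather than strings. The toy DataFrame below is hypothetical and is not taken from the real VCF or from the TAPES code; only the `.str.replace('chr', '')` call matches the traceback. Casting the column to `str` first is one possible workaround, not a confirmed fix.

```python
import pandas as pd

# Hypothetical toy data: CHROM has no 'chr' prefix, so pandas stores it as int64.
df = pd.DataFrame({"CHROM": [1, 2, 22], "POS": [100, 200, 300]})

# Same call as in the traceback; on a numeric column the .str accessor
# is rejected with an AttributeError.
try:
    df["CHROM"].str.replace("chr", "")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Possible workaround (an assumption, not the TAPES implementation):
# cast to string before stripping the prefix.
chrom_fixed = df["CHROM"].astype(str).str.replace("chr", "", regex=False)

# The same cast is harmless for a column that already carries the prefix.
chrom_prefixed = pd.Series(["chr1", "chrX"]).astype(str).str.replace("chr", "", regex=False)

print(error_message)
print(list(chrom_fixed), list(chrom_prefixed))
```

If that assumption holds, it would also be consistent with the hg38 file working, since a 'chr'-prefixed column is read as strings.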