PacificBiosciences / trgt

Tandem repeat genotyping and visualization from PacBio HiFi data

TRGTdb #6

Closed ACEnglish closed 3 months ago

ACEnglish commented 1 year ago

Adding code for converting a TRGT output VCF into a database. See tdb_tutorial.md for usage details.

TODOs:

git clone <trgt>
cd trgt/
# Making an isolated virtualenv for trgt
python3 -m venv mtrgtpy
source mtrgtpy/bin/activate
# manually installing truvari v4.0-dev
cd ../truvari/
python3 -m pip install .
cd -
# installing trgt and its dependencies
python3 -m pip install .
python3 -m pip freeze | wc -l

zqfang commented 6 months ago

Hi @ACEnglish,

I ran into an issue running the following command:

trgt db create -o strains.tdb trgt_out/SJL.sorted.vcf.gz

The error message is: ValueError: cannot reindex on an axis with duplicate labels

Traceback (most recent call last):
  File "/home/fangzq/.conda/envs/trgt/bin/trgt", line 33, in <module>
    sys.exit(load_entry_point('TRGT', 'console_scripts', 'trgt')())
  File "/home/fangzq/github/trgt/trgt/__main__.py", line 44, in main
    CMDS[args.cmd](args.options)
  File "/home/fangzq/github/trgt/trgt/dbcmds.py", line 28, in db_main
    CMDS[args.cmd][1](args.options)
  File "/home/fangzq/github/trgt/trgt/database/create.py", line 74, in create_main
    n_data = trgt.load_tdb(i) if i.rstrip('/').endswith(".tdb") else trgt.vcf_to_tdb(i)
  File "/home/fangzq/github/trgt/trgt/database/dbutils.py", line 209, in vcf_to_tdb
    allele_df = pull_alleles(data)
  File "/home/fangzq/github/trgt/trgt/database/dbutils.py", line 156, in pull_alleles
    alleles["LocusID"] = data["LocusID"]
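
For context on this error: pandas raises it when a Series whose index contains repeated labels is assigned to a column of a DataFrame with a different index, because the assignment first reindexes the Series. The sketch below reproduces that with toy tables. The frames, their values, and the `allele_length` column are hypothetical stand-ins, not taken from `dbutils.py`, and this thread does not establish where duplicated labels would come from in the real `data` table. The exact message wording also varies between pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for the table parsed from the VCF.
# The repeated index label 0 is what triggers the error.
data = pd.DataFrame({"LocusID": [10, 11, 12]}, index=[0, 0, 1])

# Hypothetical stand-in for the per-allele table, with a different index.
alleles = pd.DataFrame({"allele_length": [30, 36]}, index=[0, 1])

try:
    # Column assignment aligns the Series to alleles.index by reindexing,
    # which is not possible when the Series index has repeated labels.
    alleles["LocusID"] = data["LocusID"]
    message = None
except ValueError as err:
    message = str(err)

print(message)

# Keeping only the first row per index label makes the assignment valid.
deduplicated = data[~data.index.duplicated(keep="first")]
alleles["LocusID"] = deduplicated["LocusID"]
print(alleles)
```

Dropping rows this way only illustrates the mechanism; whether discarding duplicated rows is appropriate for real genotype data depends on why they are duplicated.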

Is there a preprocessing step required before importing TRGT output into TRGTdb?

My TRGT command is:

./trgt-v0.8.0-linux_x86_64 --genome mm10.fa --repeats tr_catalog.adjusted.mm10.bed --reads SJL.aligned.sorted.bam --output-prefix trgt_out/SJL --threads 6
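
One check that could be run before importing is to look for records that appear more than once in the VCF. The sketch below is only that: the thread does not confirm that duplicated records cause the error, and the helper names `open_vcf` and `find_duplicate_records` are hypothetical. It also assumes the locus identifier is stored in a `TRID` INFO key; adjust the key if your file differs.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def find_duplicate_records(lines):
    """Return {(chrom, pos, trid): count} for records seen more than once.

    `lines` is any iterable of VCF text lines, such as an open file handle.
    `trid` is None when the INFO column has no TRID key (assumed key name).
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, info = fields[0], fields[1], fields[7]
        trid = None
        for entry in info.split(";"):
            if entry.startswith("TRID="):
                trid = entry[len("TRID="):]
        counts[(chrom, pos, trid)] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

For the file in the command above, usage would look like `with open_vcf("trgt_out/SJL.sorted.vcf.gz") as handle: print(find_duplicate_records(handle))`. An empty dictionary means no record key was repeated.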

ACEnglish commented 3 months ago

The database tool has been refactored and moved into its own repository at https://github.com/ACEnglish/tdb.

@zqfang - Please try again from that repository and, if the error still happens, open a ticket there.