Open mikmaksi opened 4 years ago
I need some more information to help you:
Of course
/storage/mikhail/062620_dumpSTR_chrX_tshoot
1_filter_with_dumpSTR.sh: launcher script
data/raw/chrX.vcf: input data
1_filter_with_dumpSTR.sh
I routinely get the same dumpSTR error message, using gangSTR v2.4.6 and dumpSTR v3.0.2 (as well as with earlier versions) to test 37 known pathogenic STR loci, 5 of which are on chrX. If I remove the chrX variants from the gangSTR output before providing it to dumpSTR, I do not get the error. I would also vote for handling of chrX by dumpSTR.
Now trying on gangSTR v2.5.0 and dumpSTR v4.0.0, running on 3 samples with 2 males and 1 female, and where gangSTR output includes calls on chrX, I get the following error from dumpSTR:
Traceback (most recent call last):
File "~/bin/dumpSTR", line 33, in <module>
sys.exit(load_entry_point('trtools==4.0.0', 'console_scripts', 'dumpSTR')())
File "~/lib/python3.6/site-packages/trtools-4.0.0-py3.6.egg/trtools/dumpSTR/dumpSTR.py", line 1245, in run
retcode = main(args)
File "~/lib/python3.6/site-packages/trtools-4.0.0-py3.6.egg/trtools/dumpSTR/dumpSTR.py", line 1183, in main
record = ApplyCallFilters(record, call_filters, sample_info, invcf.samples)
File "~/lib/python3.6/site-packages/trtools-4.0.0-py3.6.egg/trtools/dumpSTR/dumpSTR.py", line 569, in ApplyCallFilters
filt_output = filt(record)
File "~/lib/python3.6/site-packages/trtools-4.0.0-py3.6.egg/trtools/dumpSTR/filters.py", line 706, in __call__
ci = np.stack(ci)
File "<__array_function__ internals>", line 6, in stack
File "~/lib/python3.6/site-packages/numpy-1.18.1-py3.6-linux-x86_64.egg/numpy/core/shape_base.py", line 426, in stack
raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape
If I delete the chrX variants from the GangSTR output, then DumpSTR runs fine.
Incidentally, Fragile X Syndrome and Kennedy disease are 2 common X-linked repeat expansion disorders, so it would be nice to be able to find these by gangSTR & dumpSTR.
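For anyone who wants to script the "delete the chrX variants" step, here is a minimal sketch in plain Python. The function names (`open_text`, `drop_chrom`) and the chromosome labels `chrX`/`X` are assumptions made for illustration and are not part of GangSTR or dumpSTR. It only drops record lines and leaves the header untouched.

```python
import gzip


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed text file, chosen by file extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def drop_chrom(in_vcf, out_vcf, chroms=("chrX", "X")):
    """Copy in_vcf to out_vcf, skipping records on the given chromosomes.

    Header lines (starting with '#') are copied unchanged.
    Returns a (kept, dropped) tuple of record counts.
    """
    kept = dropped = 0
    with open_text(in_vcf) as src, open_text(out_vcf, "wt") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)
                continue
            chrom = line.split("\t", 1)[0]  # first column is CHROM
            if chrom in chroms:
                dropped += 1
                continue
            dst.write(line)
            kept += 1
    return kept, dropped
```

Usage would look like `drop_chrom("input.vcf", "input.noX.vcf")`, after which the second file is passed to `dumpSTR --vcf`. Writing an uncompressed `.vcf` is the safer choice here, since output written through Python's `gzip` module is not bgzip-compressed.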
I am getting a similar error with chrX. If I remove chrX from the VCF file, dumpSTR runs smoothly. If I include it, I get this error:
Traceback (most recent call last):
File "/project/jcreminslab/kenrc_projects/softwares.2/TRTools/venv.1/bin/dumpSTR", line 8, in <module>
sys.exit(run())
File "/project/jcreminslab/kenrc_projects/softwares.2/TRTools/venv.1/lib/python3.6/site-packages/trtools/dumpSTR/dumpSTR.py", line 1245, in run
retcode = main(args)
File "/project/jcreminslab/kenrc_projects/softwares.2/TRTools/venv.1/lib/python3.6/site-packages/trtools/dumpSTR/dumpSTR.py", line 1204, in main
record.vcfrecord.INFO['HWEP'] = utils.GetHardyWeinbergBinomialTest(allele_freqs, genotype_counts)
File "/project/jcreminslab/kenrc_projects/softwares.2/TRTools/venv.1/lib/python3.6/site-packages/trtools/utils/utils.py", line 312, in GetHardyWeinbergBinomialTest
if gt[1] not in allele_freqs.keys():
IndexError: tuple index out of range
I use these commands:
GangSTR --bam input.bam --ref hg38.fa --regions hg38_ver13.bed --out input --bam-samps input --samp-sex M
dumpSTR --vcf input.vcf --out out.5 --gangstr-min-call-DP 5 --gangstr-filter-spanbound-only --gangstr-filter-badCI --gangstr-max-call-DP 1000 --gangstr-min-call-Q 0.6
I'm getting the same error as @rckeerthivasan when male samples have variants on chrX.
My settings:
ganstr_command = "GangSTR"\
+ " --bam " + bamfile + " " + "--ref " + ref \
+ " --regions regions/my_panel.tsv"\
+ " --samp-sex " + patient_sex\
+ " --bam-samps " + patient\
+ " --out " + output_dir + "/" + file_name \
+ " --output-readinfo"\
+ " --nonuniform"\
+ " --include-ggl"
dumpstr_command = "dumpSTR" \
+ " --vcf " + gzvcf_name\
+ " --out " + filtered_dir + "/" + file_name\
+ " --gangstr-min-call-DP 20"\
+ " --gangstr-max-call-DP 1000"\
+ " --gangstr-filter-spanbound-only"\
+ " --gangstr-filter-badCI"\
+ " --zip"\
+ " --drop-filtered"
The error:
Traceback (most recent call last):
File "/home/oxana/.local/bin/dumpSTR", line 8, in <module>
sys.exit(run())
File "/home/oxana/.local/lib/python3.8/site-packages/trtools/dumpSTR/dumpSTR.py", line 1245, in run
retcode = main(args)
File "/home/oxana/.local/lib/python3.8/site-packages/trtools/dumpSTR/dumpSTR.py", line 1204, in main
record.vcfrecord.INFO['HWEP'] = utils.GetHardyWeinbergBinomialTest(allele_freqs, genotype_counts)
File "/home/oxana/.local/lib/python3.8/site-packages/trtools/utils/utils.py", line 312, in GetHardyWeinbergBinomialTest
if gt[1] not in allele_freqs.keys():
IndexError: tuple index out of range
Do you have any ideas for a workaround before it's fixed?
I'm running it on a tumor sample, so high variation is probable, including in copy number and ploidy.
Only workaround I have is to delete chrX variants from the gangSTR output before passing to dumpSTR. It's not ideal considering there are several X-linked repeat expansion disorders, but at least it allows dumpSTR to run on the rest of the genome, and also verifies that this is the correct problem. Interesting that we're all getting different errors:
TypeError: 'int' object is not iterable
ValueError: all input arrays must have the same shape
IndexError: tuple index out of range
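Two of these messages are what you would expect when code written for two alleles per call meets a one-allele (haploid) call. The sketch below reproduces only the error classes with made-up values. It is not dumpSTR's actual code, the array contents and helper names are hypothetical, and the `TypeError` is not reproduced here.

```python
import numpy as np

# Hypothetical per-sample arrays at one chrX locus: two diploid calls
# (one interval per allele, so two rows) and one haploid call (one row).
diploid_a = np.array([[20, 22], [30, 33]])
diploid_b = np.array([[18, 21], [25, 29]])
haploid = np.array([[40, 45]])


def try_stack(arrays):
    """Return the stacked array, or the error message if shapes differ."""
    try:
        return np.stack(arrays)
    except ValueError as err:
        return str(err)


def second_allele(gt):
    """Return gt[1], or the error message if the genotype has one allele."""
    try:
        return gt[1]
    except IndexError as err:
        return str(err)


# Mixing (2, 2) and (1, 2) arrays makes np.stack raise a ValueError.
stack_result = try_stack([diploid_a, diploid_b, haploid])

# A one-element genotype tuple has no index 1.
index_result = second_allele((12,))
```

Which of the errors appears presumably depends on which code path reaches the haploid call first, which would be consistent with different filter options giving different tracebacks.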
Hello,
I encountered the following error when running DumpSTR on a VCF produced by GangSTR that had only chrX calls for 3 samples. Two of the samples are female and have two comma-separated values in their REPCN field, while one sample is male and has a single value in its REPCN field.
Thanks so much!
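To check whether a given record mixes one-value and two-value REPCN fields, as described above, something like the following sketch could be run on each VCF record line. The function names and the example record are hypothetical, and it assumes REPCN is listed in the FORMAT column and that a no-call is written as ".".

```python
def repcn_ploidy(vcf_line, sample_names):
    """Map each sample name to its number of comma-separated REPCN values.

    Samples whose REPCN is missing or '.' are reported as 0.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")  # column 9 is FORMAT
    idx = format_keys.index("REPCN")
    ploidy = {}
    for name, sample_field in zip(sample_names, fields[9:]):
        values = sample_field.split(":")
        repcn = values[idx] if idx < len(values) else "."
        ploidy[name] = 0 if repcn == "." else len(repcn.split(","))
    return ploidy


def has_mixed_ploidy(ploidy):
    """True if called samples disagree on the number of REPCN values."""
    called = {n for n in ploidy.values() if n > 0}
    return len(called) > 1


# Made-up record: two female samples (two values) and one male (one value).
example = "chrX\t100\t.\tCAGCAG\t.\t.\t.\tEND=105\tGT:DP:REPCN\t0/1:30:20,23\t0/0:28:20,20\t0:15:20"
example_ploidy = repcn_ploidy(example, ["F1", "F2", "M1"])
```

Records for which `has_mixed_ploidy` returns `True` are the ones matching the situation described in this thread.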