Closed reedus-123 closed 3 years ago
Hi, could you send me your input files please? I would need the phenotype.pheno
file, the first few lines of the snp.merged.vcf
file (say, the first 5000 lines) and maybe whatever you were using as a distance matrix. This way I can more quickly inspect this error. Thanks!
Hi, thanks for your quick response!
I have attached the snp.merged.vcf header and .pheno file below.
Hi, sorry but I can't seem to be able to bgzip and index your VCF file:
$ cp merged.txt merged.vcf
$ bgzip -c merged.vcf > merged.vcf.gz
$ tabix -p vcf merged.vcf.gz
[E::get_intv] Failed to parse TBX_VCF, was wrong -p [type] used?
The offending line was: "AL111168.1 44 . G A 222 . VDB=0.150727;AF1=1;AC1=2;MQ=60;FQ=-282;RPB=-0.905927;PV4=1,0.38,1,0.26;DP=17679;DP4=1,0,15659,1434;AN=104;AC=104 GT:PL:GQ ./.:.:. ./.:.:. ./.:.:. ./.:.:. ./.:.:. ./.:.:. ./.:.:. ./.:.:. 1/1:255,255,0:99 ./.:.:. 1/1:255,255,0:99 ./.:.:. ./.:.:. 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99"
[E::get_intv] Failed to parse TBX_VCF, was wrong -p [type] used?
The offending line was: "AL111168.1 45 . T C 222 . VDB=0.515746;AF1=1;AC1=2;MQ=60;FQ=-282;RPB=1.7221;PV4=1,1,0.48,1;DP=21500;DP4=2,0,19212,1491;AN=126;AC=126 GT:PL:GQ 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99"
[E::hts_idx_push] Unsorted positions on sequence #1: 1 followed by 0
tbx_index_build failed: merged.vcf.gz
Does it work for you?
It does for me, yes. I use bcftools:
$ bcftools index merged.vcf
$ bgzip merged.vcf
The first command generates a .csi index file and then I zip the vcf file using bgzip.
Hi, sorry but that does not seem to work for me:
$ bcftools index merged.vcf
index: the file is not BGZF compressed, cannot index: merged.vcf
I am using bcftools 1.9.
At any rate, can you provide another example set that reproduces the error please?
Hi mgalardini, is the issue solved? I am trying the tool, and received exactly the same error. Thanks.
So my assumption is that the problem lays outside of pyseer. If you can provide an example dataset that triggers the error I can try to reproduce it again
So my assumption is that the problem lays outside of pyseer. If you can provide an example dataset that triggers the error I can try to reproduce it again
Thank you very much, mgalardini.
The attached please find the pheno and vcf (head 500) data I tested. The command is pyseer --phenotypes $phenotype --vcf $vcff --no-distances --lineage --print-samples > test_SNPs.txt
I don't have a distance matrix so I used --no-distance option. Actually, when I used your demo data snps.vcf.gz and resistances.pheno, the save error was reproduced. It looks a distance matrix must be included? does it accept a .nwk tree generated by hclust? or how can I run without it?
The whole error info is quoted below:
Read 73 phenotypes
Detected binary phenotype
Writing lineage effects to lineage_effects.txt
Traceback (most recent call last):
File "/miniconda3/envs/pyseer/bin/pyseer", line 10, in
So, if you omit the --lineage
option the command executes fine. To have the program find lineage effects you have to provide your own lineages using the --lineage-clusters
argument. By default the distance matrix is used to define lineages but in this case you are not providing such matrix. Please look into the tutorial to see how tot generate one.
So, if you omit the
--lineage
option the command executes fine. To have the program find lineage effects you have to provide your own lineages using the--lineage-clusters
argument. By default the distance matrix is used to define lineages but in this case you are not providing such matrix. Please look into the tutorial to see how tot generate one.
Perfect, thanks a lot for your time and support, mgalardini.
Hello,
I tried to run the following command to calculate results from a merged vcf file with the --no-distances parameter (as an alternative to --load-m, which yielded perfectly separable data). This returned the following error:
pyseer --phenotypes phenotype.pheno --vcf snp.merged.vcf --no-distances --lineage --print-samples > my_SNPs.txt Read 28 phenotypes Detected binary phenotype Writing lineage effects to lineage_effects.txt [W::bcf_hdr_check_sanity] GL should be declared as Number=G Traceback (most recent call last): File "/install/software/mincondas/3.7/pyseer/bin/pyseer", line 10, in <module> sys.exit(main()) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/pyseer/__main__.py", line 759, in main ret = fixed_effects_regression(*data) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/pyseer/model.py", line 350, in fixed_effects_regression max_lineage = fit_lineage_effect(lin, c, k) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/pyseer/model.py", line 172, in fit_lineage_effect lineage_mod = smf.Logit(k, X) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/discrete/discrete_model.py", line 456, in __init__ super().__init__(endog, exog, check_rank, **kwargs) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/discrete/discrete_model.py", line 175, in __init__ super().__init__(endog, exog, **kwargs) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/base/model.py", line 237, in __init__ super(LikelihoodModel, self).__init__(endog, exog, **kwargs) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/base/model.py", line 78, in __init__ **kwargs) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/base/model.py", line 101, in _handle_data data = handle_data(endog, exog, missing, hasconst, **kwargs) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/base/data.py", line 673, in handle_data **kwargs) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/base/data.py", line 87, in __init__ self._handle_constant(hasconst) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/statsmodels/base/data.py", line 131, in _handle_constant exog_max = np.max(self.exog, axis=0) File "<__array_function__ internals>", line 6, in amax File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2706, in amax keepdims=keepdims, initial=initial, where=where) File "/install/software/mincondas/3.7/pyseer/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 87, in _wrapreduction return ufunc.reduce(obj, axis, dtype, out, **passkwargs) ValueError: zero-size array to reduction operation maximum which has no identity
Any help would be appreciated. Thank you.