Closed Bio1nform closed 1 year ago
Hi, thanks for your interest in wgd v2! Could you please try with python=3.6/3.7/3.8 and see whether you get the same error? Besides, using pip install wgd is also an option.
hi heche-psb,
Thanks, I managed to install it with Python 3.8 and it is working now. I am confused by some steps in the example: 1) wgd dmd Aquilegia_coerulea (is Aquilegia_coerulea a folder or a fasta file? Including multiple fasta files did work for me). 2) I am not able to get the families for the downstream analysis; which output from wgd dmd contains the families?
Thank you.
Hi, Aquilegia_coerulea is the file name of the cds sequence file. If you provide only one cds file, it will calculate the whole paranome. With more than one cds file, it will calculate global MRBH, local MRBH, or just pairwise RBH, according to the other options you set. The family output file of wgd dmd Aquilegia_coerulea is Aquilegia_coerulea.tsv in the default output folder wgd_dmd, indicating the paralogous families. The workflow can be glanced at here.
Hi, thanks for the information. So I can use the MRBH as a family for the next steps. I ran wgd dmd Aquilegia_coerulea and got Aquilegia_coerulea.tsv.
I am getting an error with: wgd ksd wgd_dmd/Aquilegia_coerulea.tsv Aquilegia_coerulea
I think it might be due to the version of PAML. Are you using PAML v4.9j? This version should work fine.
I am using PAML v4.9j and still get the same error.
Could you please provide me with your input file? I will check whether I get the same error.
Here is the input fasta file. Aquilegia_coerulea.zip
I tried your input file with the command "wgd ksd wgd_dmd/Aquilegia_coerulea.txt.tsv Aquilegia_coerulea.txt" and it works. Maybe you could try it again? Thanks a lot.
Hi, I am getting an error running this too.
wgd ksd wgd_globalmrbh/global_MRBH.tsv --extraparanomeks wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv -sp speciestree.nw --reweight -o wgd_globalmrbh_ks --spair "Aquilegia_coerulea;Protea_cynaroides" --spair "Aquilegia_coerulea;Vitis_vinifera" --spair "Aquilegia_coerulea;Acorus_americanus" --spair "Aquilegia_coerulea;Aquilegia_coerulea" --plotkde --nthreads 90 Aquilegia_coerulea Protea_cynaroides Acorus_americanus Vitis_vinifera
Help Please, Thanks
Hi, I am stuck in here, any help would be really appreciated.
Kindly help Please. Thank you
Hi, please install the latest version from the github repository here. The error information shows that you're using an older version that might have bugs which I have since fixed.
Hi Thanks for the information.
I installed wgd==2.0.20 version in python3.6.
wgd dmd --globalmrbh Aquilegia_coerulea Protea_cynaroides Acorus_americanus Vitis_vinifera -o wgd_globalmrbh
Now I get these errors:
File "/home/.conda/envs/wgdV2/bin/wgd", line 10, in
Thank you
With python3.8:
wgd dmd --globalmrbh Aquilegia_coerulea Protea_cynaroides Acorus_americanus Vitis_vinifera -o wgd_globalmrbh Works fine
The rest does not work:
wgd ksd wgd_globalmrbh_CHECK/global_MRBH.tsv --extraparanomeks wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv -sp speciestree.nw --reweight -o wgd_globalmrbh_ks1 --spair "Aquilegia_coerulea;Protea_cynaroides" --spair "Aquilegia_coerulea;Vitis_vinifera" --spair "Aquilegia_coerulea;Acorus_americanus" --spair "Aquilegia_coerulea;Aquilegia_coerulea" --plotkde Aquilegia_coerulea Protea_cynaroides Acorus_americanus Vitis_vinifera
Thank you.
The error occurred at this step: kde = stats.gaussian_kde(y,weights=w,bw_method=0.1), which returned ValueError: array must not contain infs or NaNs.
The problem lies somewhere between the Ks file, the species pairs and the species tree. I guess it might be due to some incorrect inputs. Could you please share all your inputs and the full command you used? Thanks!
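To see whether a set of Ks values or weights would trigger this ValueError, the inputs can be screened for NaN and infinite entries before they reach the density estimate. The sketch below is only an illustration of that check, not how wgd handles it internally; the function name drop_non_finite and the example values are made up for the demonstration.

```python
import math


def drop_non_finite(ks_values, weights):
    """Keep only the (Ks, weight) pairs in which both numbers are finite."""
    kept = [
        (k, w)
        for k, w in zip(ks_values, weights)
        if math.isfinite(k) and math.isfinite(w)
    ]
    ks_clean = [k for k, _ in kept]
    w_clean = [w for _, w in kept]
    return ks_clean, w_clean


# Hypothetical values; a real run would take them from the Ks table.
ks = [0.12, float("nan"), 0.85, float("inf"), 1.4]
w = [1.0, 0.5, float("nan"), 0.5, 0.25]

ks_clean, w_clean = drop_non_finite(ks, w)
print(len(ks) - len(ks_clean), "pairs dropped")
print(ks_clean, w_clean)
```

If many pairs are dropped, that points to a problem upstream (for example missing weights for some species pairs) rather than in the plotting step itself.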
Hi, I ran the following steps. Here are the sequence files: Sequences1.tar.gz Sequences2.tar.gz
1) wgd ksd wgd_dmd/Aquilegia_coerulea.tsv Aquilegia_coerulea (attachment: wgd_ksd.zip)
2) wgd dmd --globalmrbh Aquilegia_coerulea Protea_cynaroides Acorus_americanus Vitis_vinifera -o wgd_globalmrbh
3) wgd ksd wgd_globalmrbh/global_MRBH.tsv --extraparanomeks wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv -sp speciestree.nw --reweight -o wgd_globalmrbh_ks1 --spair "Aquilegia_coerulea;Protea_cynaroides" --spair "Aquilegia_coerulea;Vitis_vinifera" --spair "Aquilegia_coerulea;Acorus_americanus" --spair "Aquilegia_coerulea;Aquilegia_coerulea" --plotkde Aquilegia_coerulea Protea_cynaroides Acorus_americanus Vitis_vinifera
I get the error message.
Thanks
Hi,
Were you able to take a look into it?
Thanks,
Hi, yes, there was a small bug concerning the node-averaged Ks processing. I just fixed it and pushed it as v2.0.21. Please try again and let me know if the same error occurs again. Thanks a lot!
Hi, I am still getting an error. job.error.txt
Hi, I think you're using wgd ksd to do the rate correction. Could you use wgd ksd to only infer Ks while using wgd viz to do the rate correction?
Thanks, it seems to work. I will let you know if there is any issue with this.
However, I think wgd dmd --globalmrbh cannot handle sequences larger than 4 kb. Is there any way I can increase the size limit? I am not sure what the reason is, though.
I get the following error.
Could you please share the sequence files that you used? It seems to be a format problem in the input cds files.
I figured out the error. It was caused by duplicated fasta IDs. Is there any way to fix this issue?
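Duplicated fasta IDs can be spotted before running wgd dmd with a few lines of Python. This is a minimal sketch and not part of wgd; it assumes the ID is the first whitespace-delimited token after ">", and the function name duplicated_fasta_ids is made up for the illustration.

```python
from collections import Counter


def duplicated_fasta_ids(lines):
    """Return the sorted record IDs that occur more than once in a fasta file.

    `lines` can be an open file handle or any iterable of text lines.
    """
    ids = [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]
    counts = Counter(ids)
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


# Small in-memory example; a real check would pass an open cds file instead.
example = [
    ">gene1 some description\n",
    "ATGAAATAA\n",
    ">gene2\n",
    "ATGCCCTAA\n",
    ">gene1\n",
    "ATGGGGTAA\n",
]
print(duplicated_fasta_ids(example))
```

For a file on disk, the same function can be called as duplicated_fasta_ids(open("Aquilegia_coerulea")). Any reported IDs would then need to be renamed or removed in the cds file.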
However, there is a new error: wgd ksd wgd_dmd/Aquilegia_coerulea.tsv Aquilegia_coerulea -o wgd_ksd gives me the following error. error2.txt
The problem occurred at the family GF00000004. Could you please share the cds file of only GF00000004?
Here you go. GF00000004 AQUCO_02000225v1_22779, AQUCO_00700295v1_9186, AQUCO_02000226v1_22780, AQUCO_44500001v1_40958, AQUCO_02000219v1_22773, AQUCO_00201007v1_3153, AQUCO_02200206v1_24148, AQUCO_04400082v1_33100, AQUCO_01000140v1_12868, AQUCO_01000578v1_13622, AQUCO_00700296v1_9187, AQUCO_00200174v1_1785, AQUCO_00700480v1_9470, AQUCO_01000288v1_13135, AQUCO_00200153v1_1743, AQUCO_00900240v1_11191, AQUCO_00200176v1_1787, AQUCO_01500005v1_18737, AQUCO_04700045v1_33731, AQUCO_01700324v1_20762, AQUCO_00700388v1_9339, AQUCO_01700327v1_20765, AQUCO_00700386v1_9337, AQUCO_02600114v1_25914, AQUCO_00500170v1_7139, AQUCO_00900126v1_11012, AQUCO_00200172v1_1783, AQUCO_03500197v1_30294, AQUCO_03900098v1_31723, AQUCO_02200275v1_24279, AQUCO_00300143v1_4243, AQUCO_03000303v1_28394, AQUCO_00200152v1_1742, AQUCO_00200155v1_1745, AQUCO_00400487v1_6293, AQUCO_29600001v1_40929, AQUCO_00300388v1_4628, AQUCO_00300391v1_4632, AQUCO_29600002v1_40930, AQUCO_04700046v1_33732, AQUCO_00200173v1_1784, AQUCO_00300142v1_4242, AQUCO_00200311v1_2025, AQUCO_11400002v1_40231, AQUCO_00200175v1_1786, AQUCO_04900044v1_33977, AQUCO_02300181v1_24692, AQUCO_00500129v1_7067, AQUCO_02300178v1_24689, AQUCO_01600057v1_19597, AQUCO_02300171v1_24678, AQUCO_02300176v1_24687, AQUCO_02300176v1_24686, AQUCO_02600115v1_25915, AQUCO_02300180v1_24691, AQUCO_00500071v1_6981, AQUCO_02800270v1_27588, AQUCO_00300386v1_4626, AQUCO_01600365v1_20054, AQUCO_00300389v1_4629, AQUCO_02500191v1_25430, AQUCO_00500071v1_6982, AQUCO_02900102v1_27824, AQUCO_02400108v1_24976, AQUCO_00100892v1_1485, AQUCO_01700027v1_20240, AQUCO_03700288v1_31162, AQUCO_07500004v1_37851, AQUCO_00500071v1_6983, AQUCO_00100478v1_811, AQUCO_00700290v1_9176, AQUCO_00300387v1_4627, AQUCO_07500004v1_37852, AQUCO_03700288v1_31161, AQUCO_00700293v1_9182, AQUCO_02300179v1_24690, AQUCO_00500071v1_6984, AQUCO_02300173v1_24683
I can actually run this family successfully with the command wgd ksd famback.tsv GF00000004.cds. Were you using PAML v4.9j, and did you add other parameters?
I am using PAML 4.9. I tried wgd ksd GF0000000.tsv GF00000004 -o TEST and I get the following error.
(WGDV2_38) geno@farm:~/WGD/Aquilegia$ which codeml
/home/software/GENOMETOOLS/PAML/paml4.9j/bin/codeml
(WGDV2_38) geno@farm:~/WGD/Aquilegia$ wgd ksd GF0000000.tsv GF00000004 -o TEST
09:40:34 INFO This is wgd v2.0.21 cli.py:32
09:40:36 INFO tmpdir = wgdtmp_f2922255-bc2a-445e-826b-0fe0ce138647 cli.py:483
Traceback (most recent call last):
File "/home/.conda/envs/WGDV2_38/bin/wgd", line 10, in
I used another dataset and reproduced this error. It is caused by this codeml error: "166 columns are converted into ??? because of stop codons. 21 out of 21 sequences do not have any resolved nucleotides. Giving up." The sequences are not strict cds and contain many in-frame stop codons, so although the stripped alignment length is not zero, no codeml result is returned. I just pushed a commit that fixes this, so it should be solved now.
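Sequences that are not strict cds can be flagged before they reach codeml. The sketch below is an illustration only and is not the fix that was pushed to wgd; it assumes the standard genetic code (stop codons TAA, TAG, TGA), and the function name cds_problems is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def cds_problems(seq):
    """List the reasons why a nucleotide sequence is not a strict cds."""
    seq = seq.upper()
    problems = []
    remainder = len(seq) % 3
    if remainder != 0:
        problems.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - remainder, 3)]
    # A stop codon is expected only at the very end of a complete cds.
    internal = codons[:-1] if remainder == 0 else codons
    n_stops = sum(codon in STOP_CODONS for codon in internal)
    if n_stops:
        problems.append(f"{n_stops} in-frame stop codon(s)")
    return problems


print(cds_problems("ATGAAATAA"))
print(cds_problems("ATGTAAAAATAA"))
print(cds_problems("ATGAA"))
```

Running this over every sequence of a failing family would show which members carry in-frame stop codons and are likely to be masked by codeml.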
Did you push it as a new version? I re-installed wgd2==2.0.21 and I am still getting the same error.
(wgdV2_38) geno@farm:~/WGD/Aquilegia$ wgd ksd test.tsv GF00000004 -o TEST gives me the same error.
Adding --cds gives me the following error:
(wgdV2_38) geno@farm:~/WGD/Aquilegia$ wgd ksd --cds test.tsv GF00000004 -o TEST
07:50:05 INFO This is wgd v2.0.21 cli.py:32
07:50:06 WARNING Translation error (First codon 'AAG' is not a start codon) in seq AQUCO_00100478v1_811 core.py:282
WARNING Translation error (First codon 'TTA' is not a start codon) in seq AQUCO_00200173v1_1784 core.py:282
WARNING Translation error (First codon 'TTA' is not a start codon) in seq AQUCO_00200175v1_1786 core.py:282
WARNING Translation error (Final codon 'GTT' is not a stop codon) in seq AQUCO_00300391v1_4632 core.py:282
WARNING Translation error (First codon 'GCC' is not a start codon) in seq AQUCO_04900044v1_33977 core.py:282
WARNING Translation error (Final codon 'ACT' is not a stop codon) in seq AQUCO_29600002v1_40930 core.py:282
INFO tmpdir = wgdtmp_aaa2e529-da43-4116-b682-7bb00d94135e cli.py:483
Traceback (most recent call last):
File "/home/.conda/envs/wgdV2_38/bin/wgd", line 10, in
Hi, you need to download and install the version from this github repository. I fixed some bugs after the PyPI version v2.0.21. Sorry for the confusion.
Hi,
The other steps are working now. I am still getting an error when I add --plotapgmm.
wgd viz -d wgd_globalmrbh_ks/global_MRBH.tsv.ks.tsv --extraparanomeks wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv -sp speciestree.nw --reweight -ap wgd_syn/iadhore-out/anchorpoints.txt -o wgd_viz_mixed_Ks_elmm2 --spair "Aquilegia_coerulea;Protea_cynaroides" --spair "Aquilegia_coerulea;Vitis_vinifera" --spair "Aquilegia_coerulea;Acorus_americanus" --spair "Aquilegia_coerulea;Aquilegia_coerulea" --gsmap wgd_globalmrbh_ks/gene_species.map --plotkde --plotelmm --plotapgmm
Traceback (most recent call last):
File "/home/.conda/envs/wgdV2_38/bin/wgd", line 10, in
Thanks
Hi, could you first check whether the gene ids in wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv match exactly the gene ids in wgd_syn/iadhore-out/anchorpoints.txt? It seems the anchor Ks data is not extracted as expected.
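One way to run this check is to collect the gene IDs from both tables and compare them as sets. The sketch below is not part of wgd. The column names gene1 and gene2 appear in the Ks table header shown in this thread, while gene_x and gene_y are an assumption about the anchorpoints.txt layout and should be adjusted to the actual header. The helper names column_values and compare_ids are hypothetical.

```python
import csv
import io


def column_values(handle, columns, delimiter="\t"):
    """Collect the non-empty values found in the given columns of a table."""
    values = set()
    for row in csv.DictReader(handle, delimiter=delimiter):
        for column in columns:
            value = row.get(column)
            if value:
                values.add(value)
    return values


def compare_ids(ks_ids, anchor_ids):
    """Report which anchor gene IDs are present in, or missing from, the Ks table."""
    return {
        "shared": sorted(ks_ids & anchor_ids),
        "only_in_anchors": sorted(anchor_ids - ks_ids),
    }


# In-memory stand-ins for the two files; real use would pass open file handles.
ks_table = io.StringIO("pair\tfamily\tgene1\tgene2\np1\tGF1\tgeneA\tgeneB\n")
anchor_table = io.StringIO("id\tgene_x\tgene_y\n1\tgeneA\tgeneB.1\n")

report = compare_ids(
    column_values(ks_table, ["gene1", "gene2"]),
    column_values(anchor_table, ["gene_x", "gene_y"]),
)
print(report)
```

A non-empty "only_in_anchors" list (here geneB.1 versus geneB) would suggest an ID mismatch, for example between the feature or attribute chosen from the gff3 file and the cds headers.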
Hi, I am getting this error with the .22 version.
wgd syn -f mRNA -a Name wgd_dmd/Aquilegia_coerulea.tsv Aquilegia_coerulea.gff3 -ks wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv -o wgd_sync
File "/home/.conda/envs/wgdV2_38/bin/wgd", line 10, in
Thanks
Could you show me the column names of the file wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv? Was it produced with v1 or v2?
pair family g1 g2 gene1 gene2
Aqcoe1G119100.1Aqcoe3G278800.1 GF00000001 Aquilegia_coerulea_40904 Aquilegia_coerulea_26689 Aqcoe3G278800.1 Aqcoe1G119100.1
Aqcoe2G086700.1Aqcoe3G278800.1 GF00000001 Aquilegia_coerulea_40904 Aquilegia_coerulea_04209 Aqcoe3G278800.1 Aqcoe2G086700.1
This was produced from wgd2==2.0.22
It seems that there are no Ks results in your wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv. Your run of the previous step using wgd ksd is problematic. May I have a look at the log file of the wgd ksd step that produced this file, wgd_ksd/Aquilegia_coerulea.tsv.ks.tsv?
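A quick way to confirm that a Ks table lacks the Ks estimates is to inspect its header line. The sketch below is not part of wgd; the column name dS used for the Ks values is an assumption and should be replaced by whatever name the installed version writes. The helper name missing_columns is hypothetical.

```python
def missing_columns(header_line, required, delimiter="\t"):
    """Return the required column names that are absent from a header line."""
    present = header_line.rstrip("\n").split(delimiter)
    return [name for name in required if name not in present]


# Header as reported in this thread, written here with tab separators.
header = "pair\tfamily\tg1\tg2\tgene1\tgene2\n"

# "dS" as the Ks column name is an assumption for this illustration.
print(missing_columns(header, ["gene1", "gene2", "dS"]))
```

For a real file, the header can be read with open(path).readline() and passed to the same function.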
wgd2==2.0.21 works; it's wgd2==2.0.22 that shows the error.
Wgd2==2.0.21 error. myjob. wgd2==2.0.21.txt
Wgd2==2.0.22 error. myjoberror wgd2==2.0.22.txt
Hi, it's a bug in 2.0.22. I have already fixed it in this repository. The fixed version is 2.0.23 now. Sorry for the confusion.
Thanks, I will try it. Has the conda version been updated too?
The conda update will take a bit longer. It's on PyPI now.
I have issues installing from PyPI on the cluster. Installing numpy and fastcluster fails.
pip install numpy==1.19.0
Collecting numpy==1.19.0
Using cached numpy-1.19.0.zip (7.3 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... error
error: subprocess-exited-with-error
× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [58 lines of output]
Running from numpy source directory.
Hi, please try with python3.8. python3.10 is not compatible for now.
This error:
python setup.py install Traceback (most recent call last):
File "setup.py", line 5, in
Hi, you may try with sudo apt-get install libffi-dev or update.
I am working on a cluster and do not have admin privileges. The previous error was with: python setup.py install
With python -m pip install -r requirements.txt I get the following error.
WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/attrs/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/attrs/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/attrs/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/attrs/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/attrs/
Could not fetch URL https://pypi.org/simple/attrs/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/attrs/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
ERROR: Could not find a version that satisfies the requirement attrs==20.3.0 (from -r requirements.txt (line 1)) (from versions: none)
ERROR: No matching distribution found for attrs==20.3.0 (from -r requirements.txt (line 1))
WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
Above you were installing from the cloned repository, right? Normally python3.8 works. Did you install in a virtual environment?
This error is from the virtual environment.
The error messages indicate a problem with the installed Python, which should have SSL support but apparently does not. I guess this is not a problem of wgd v2 itself.
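Whether a given Python interpreter was built with SSL support can be checked directly, independent of pip or wgd. This is a minimal sketch; the function name has_ssl is made up for the illustration, and it should be run with the same interpreter that the virtual environment uses.

```python
import importlib
import sys


def has_ssl():
    """Return (True, OpenSSL version) if the ssl module imports, else (False, None)."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError:
        return False, None
    return True, ssl.OPENSSL_VERSION


ok, version = has_ssl()
print("interpreter:", sys.executable)
print("ssl available:", ok)
print("OpenSSL:", version)
```

If this reports that ssl is not available, pip cannot reach PyPI over HTTPS from that interpreter, and a different Python build (for example the one shipped with conda) would be needed.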
No, it is not an issue with wgd v2. I am trying to install v2.0.23 now. Once it is installed, I will check the bug that you fixed. How long will the conda update take? The conda version works best for me.
I installed v2.0.23 from PyPI.
wgd ksd wgd_dmd/Aquilegia_coerulea.tsv Aquilegia_coerulea -o wgd_ksd
15:20:53 INFO This is wgd v2.0.23 cli.py:32
15:21:10 INFO tmpdir = wgdtmp_adab2f7f-fb6b-407e-8083-5643b9b4a9fc cli.py:483
15:21:14 INFO Analysing family GF00000001 core.py:2873
15:21:14 INFO Analysing family GF00000002 core.py:2873
15:21:14 INFO Analysing family GF00000003 core.py:2873
15:21:14 INFO Analysing family GF00000004 core.py:2873
15:21:15 INFO Analysing family GF00000005 core.py:2873
Now I get the following error. error.txt
Hi, this is a great tool. I have used version 1 and am now working with version 2. I managed to install it with conda; however, I am getting the following error:
wgd -h
Usage: wgd [OPTIONS] COMMAND [ARGS]...
wgd v2 - Copyright (C) 2023-2024 Hengchi Chen
Contact: heche@psb.vib-ugent.be
Options:
-v, --verbosity [info|debug] Verbosity level, default = info.
-h, --help Show this message and exit.
Commands:
dmd All-vs-all diamond blastp + MCL clustering.
focus Multiply species RBH or c-score defined orthologous family's gene...
ksd Paranome and one-to-one ortholog Ks distribution inference...
mix Mixture modeling of Ks distributions.
peak Infer peak and CI of Ks distribution.
syn Co-linearity and anchor inference using I-ADHoRe.
viz Visualization of Ks distribution or synteny
wgd dmd
09:04:59 INFO This is wgd v1.2 cli.py:32
Traceback (most recent call last):
File "/home/.conda/envs/WGD/bin/wgd", line 10, in
sys.exit(cli())
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/.conda/envs/WGD/lib/python3.6/site-packages/cli.py", line 113, in dmd
_dmd(**kwargs)
File "/home/.conda/envs/WGD/lib/python3.6/site-packages/cli.py", line 116, in _dmd
from wgd.core import SequenceData, read_MultiRBH_gene_families,mrbh,ortho_infer,genes2fams,endt,segmentsaps,bsog
ModuleNotFoundError: No module named 'wgd.core'
wgd viz
09:05:19 INFO This is wgd v1.2 cli.py:32
Traceback (most recent call last):
File "/home/.conda/envs/WGD/bin/wgd", line 10, in
sys.exit(cli())
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/.local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/.conda/envs/WGD/lib/python3.6/site-packages/cli.py", line 533, in viz
_viz(**kwargs)
File "/home/.conda/envs/WGD/lib/python3.6/site-packages/cli.py", line 536, in _viz
from wgd.viz import elmm_plot, apply_filters, multi_sp_plot, default_plot,all_dotplots,filter_by_minlength,dotplotunitgene,dotplotingene,filter_mingenumber
ImportError: cannot import name 'elmm_plot'
Any help would be great. Thanks
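The tracebacks above mix paths from /home/.local/lib/python3.6/site-packages and from the conda environment, and the banner reports v1.2 although the help text says v2. One possible explanation, which is only an assumption here, is that files from an earlier wgd installation are still on the import path. The sketch below prints where Python resolves each module so the locations can be compared; the helper name module_origin is hypothetical.

```python
import importlib.util
import site
import sys


def module_origin(name):
    """Return the file a module would be loaded from, or None if it cannot be found.

    Note: looking up a submodule such as "wgd.core" imports its parent package.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return None if spec is None else spec.origin


print("interpreter:", sys.executable)
print("user site-packages:", site.getusersitepackages())
print("user site enabled:", site.ENABLE_USER_SITE)
for name in ("click", "cli", "wgd", "wgd.core", "wgd.viz"):
    print(name, "->", module_origin(name))
```

If wgd or click resolve to the user site-packages directory rather than to the conda environment, setting the environment variable PYTHONNOUSERSITE=1 before calling wgd, or uninstalling the older copy, would be worth trying.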