Closed ilaydagulmez closed 5 months ago
Just a quick check: is there a gene id in your sample.cds.fasta.tsv that is exactly "g1"? And did you generate sample.cds.fasta.tsv using wgd v2 too? Can you make sure that each line in your gff3 file (except for the ones starting with "#") is separated by tabs and has exactly 9 columns?
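For anyone who wants to run the column check automatically, here is a minimal sketch. The function name `find_bad_gff3_lines` and the example lines are made up for illustration; it only counts tab-separated fields and does not validate the content of each column.

```python
def find_bad_gff3_lines(lines, expected_columns=9):
    """Return (line_number, column_count) for data lines that do not have
    the expected number of tab-separated columns."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines starting with "#"
        if not line.strip() or line.startswith("#"):
            continue
        count = len(line.split("\t"))
        if count != expected_columns:
            bad.append((number, count))
    return bad


# Hypothetical example: the third line uses spaces instead of tabs
example = [
    "##gff-version 3\n",
    "scaffold_1\tAUGUSTUS\tgene\t1\t900\t0.5\t+\t.\tID=g1\n",
    "scaffold_1 AUGUSTUS transcript 1 900 0.5 + . ID=g1.t1\n",
]
print(find_bad_gff3_lines(example))  # [(3, 1)]
```

With a real file, something like `find_bad_gff3_lines(open("sample.gff"))` should work, since the function only iterates over lines.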
Yes, I generated sample.cds.fasta.tsv using wgd v2, and my gff3 file has 9 columns, as I shared. The file generated with the dmd command doesn't start with "g1"; here is the file:
But as I wrote, those inputs are the same. I don't understand why there is a problem like this.
Thanks for your time
Hi, the gene ids in your family file need to be the same as those in your gff3 file, which is apparently not the case in your dataset. Your gene ids look like "g46456.t1" and "g23037.t1" instead of "g1". You have to make the gff3 file contain the same ids. Otherwise the program can't tell which gene is at which position.
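A rough way to see which ids do not match is to compare the two files directly. The sketch below is not part of wgd; the helper names are hypothetical, and it assumes the family file is tab-separated with a header row, a family name in the first column, and comma-separated gene ids in the remaining cells. The `feature` and `attribute` arguments are meant to mirror the -f and -a options of wgd syn mentioned elsewhere in this thread.

```python
def gff3_ids(lines, feature="gene", attribute="ID"):
    """Collect the values of `attribute` from GFF3 lines whose type column equals `feature`."""
    ids = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature:
            continue
        # Column 9 holds key=value pairs separated by ";"
        for field in cols[8].split(";"):
            key, sep, value = field.strip().partition("=")
            if sep and key == attribute:
                ids.add(value)
    return ids


def family_ids(lines):
    """Collect gene ids from a family table (assumed layout: header row,
    family name in column 1, comma-separated gene ids in the other cells)."""
    ids = set()
    for row in list(lines)[1:]:  # skip the header row
        for cell in row.rstrip("\n").split("\t")[1:]:
            ids.update(g.strip() for g in cell.split(",") if g.strip())
    return ids


# Hypothetical miniature inputs
gff = [
    "scaffold_1\tAUGUSTUS\tgene\t1\t900\t.\t+\t.\tID=g1\n",
    "scaffold_1\tAUGUSTUS\ttranscript\t1\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n",
]
fam = ["\tsample.cds.fasta\n", "GF00000001\tg1.t1, g2.t1\n"]

# Ids in the family file that the chosen feature/attribute cannot locate
print(sorted(family_ids(fam) - gff3_ids(gff, "gene", "ID")))        # ['g1.t1', 'g2.t1']
print(sorted(family_ids(fam) - gff3_ids(gff, "transcript", "ID")))  # ['g2.t1']
```

An empty result for a given feature and attribute would suggest that this combination is the one that lines up with the family file.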
Yes, I know and understand, but the cds file that is the input for dmd was itself generated by Augustus, so I mean it's the same input file.
Thanks
I don't get it. You mean that the cds file and the gff3 file were both generated by Augustus but have different gene ids?
Yes, the cds file and the gff3 file were both generated by Augustus, even though the dmd output was as I shared.
I see. Could you please share the correct family file and gff3 file with me? I will try to reproduce your error.
Okay, thank you. Here is my gff3 file:
https://transfer.adttemp.com.br/EDzFs/20628-8.gff
My family file from dmd:
20628_8.cds.fasta.tsv.txt
Thanks for your help!
Hi, you just need to add the options -f transcript and -a ID to make it run successfully. Could you please try again?
What great news! Thanks for your help, I will try and (hopefully) close the issue! 😊
Hi again, do I need to install i-adhore separately? I think these parameters (-f transcript and -a ID) work, but now I get the error FileNotFoundError: [Errno 2] No such file or directory: 'i-adhore'. Thanks!
Hi, you need to install i-adhore beforehand, for which you can refer to https://github.com/VIB-PSB/i-ADHoRe.
Can I pass the path to i-adhore as a parameter? I installed it but still had the same error. When I added the path, I got IndexError: list index out of range.
Thanks
If you type i-adhore and press Enter, do you get the message "Usage: i-adhore [configuration file]" in return? That means you have successfully installed the software.
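The same check can be done from Python, which is closer to what a Python program sees when it tries to launch an external tool. This is a minimal sketch using the standard library's `shutil.which`; the wrapper name `find_executable` is made up here, and how wgd itself locates i-adhore is not shown in this thread.

```python
import shutil


def find_executable(name="i-adhore", search_path=None):
    """Return the absolute path of `name` if it is found on PATH, else None."""
    return shutil.which(name, path=search_path)


location = find_executable()
if location is None:
    # Consistent with the FileNotFoundError: [Errno 2] reported in this thread
    print("i-adhore is not on PATH for this Python process")
else:
    print(f"i-adhore found at {location}")
```

Running this inside the same environment (and on the same node) where wgd syn is launched matters, because PATH can differ between a login shell and a submitted job.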
Yes I get this.
This is what I got using your data. You just need to make sure i-adhore v3 is in your environment path and can be properly called.
(ENV_wgd) (base) heche@HengchiChen$ wgd syn -f transcript -a ID 20628_8.cds.fasta.tsv.txt 20628-8.gff -o debug_syn
2024-05-29 14:47:29 INFO This is wgd v2.0.38 cli.py:34
INFO Checking cores and threads... core.py:35
INFO The number of logical CPUs/Hyper Threading in the system: 8 core.py:36
INFO The number of physical cores in the system: 4 core.py:37
INFO The number of actually usable CPUs in the system: 8 core.py:38
INFO Checking memory... core.py:40
INFO Total physical memory: 7.6480 GB core.py:41
INFO Available memory: 1.1672 GB core.py:42
INFO Free memory: 0.9874 GB core.py:43
2024-05-29 14:47:32 INFO Configuring I-ADHoRe co-linearity search cli.py:703
INFO Writing families file syn.py:96
INFO Writing gene lists syn.py:98
2024-05-29 14:47:39 INFO Writing config file syn.py:100
2024-05-29 14:49:08 INFO Running I-ADHoRe cli.py:707
2024-05-29 14:49:10 WARNING WARNING: Maximum allowed number of gaps in the alignment not specified. Setting to cluster_gap. syn.py:188
WARNING: Tandem gap size not correct in settings file. Using default (gap_size / 2)
INFO syn.py:189
This is i-ADHoRe v3.0.
Copyright (c) 2002-2010, Flanders Interuniversity Institute for Biotechnology, VIB.
Algorithm designed by Klaas Vandepoele, Cedric Simillion, Jan Fostier, Dieter De Witte,
Koen Janssens, Sebastian Proost, Yvan Saeys and Yves Van de Peer.
Process 1/1 is alive on HengchiChen.
************* i-ADHoRe parameters *************
Number of genelists = 2823
Blast table =
/mnt/c/Users/hengc/wgdating_package/ud_wgd/wgd_debug/debug_0524/debug_syn/families.tsv
Output path =
/mnt/c/Users/hengc/wgdating_package/ud_wgd/wgd_debug/debug_0524/debug_syn/iadhore-out/
Gap size = 30
Cluster gap size = 35
Cloud gap size = 0
Cloud cluster gap size = 0
Max gaps in alignment = 35
Tandem gap = 15
Flush output = 1000
Q-value = 0.75
Anchorpoints = 3
Probability cutoff = 0.01
Cloud filtering method = Binomial
Level 2 only = false
Use family = true
Write statistics = false
Alignment method = GreedyGraphbased4
Multiple hypothesis correction = FDR
Number of threads = 4
Compare aligners = false
Collinear searches only
Visualize GHM.png = false
Visualize Alignment = false
Verbose output = true
************ END i-AdDHoRe parameters *********
Creating dataset... done. (time: 0.434661s)
Mapping gene families... done. (time: 0.0517709s)
Remapping tandem duplicates... done. (time: 0.00573397s)
Writing genelists file... done. (time: 0.041543s)
Collinear Search
Level 2 multiplicon detection... done. (time: 0.929245s)
Profile detection...
1 multiplicons to evaluate - evaluating level 2 multiplicon... 0 new multiplicons found.
Flushing output files...done.
Time for Higher Level Detection: 0.003685s.
All Done! Bye...
INFO Processing I-ADHoRe output cli.py:711
INFO `minlen` not set, taking 10% of longest scaffold (7869.900000000001) for 20628_8.cds.fasta viz.py:2714
INFO Dropped 80189 scaffolds in 20628_8.cds.fasta because they are on scaffolds shorter than viz.py:2716
7869.900000000001
2024-05-29 14:49:27 INFO Making Syndepth plot viz.py:2753
ERROR No eligible multiplicon discovered in terms of segment length and/or gene number! viz.py:1357
INFO Total run time: 1.97 minutes core.py:1643
INFO Done core.py:1644
Hi again, thanks for your help! As I said, i-adhore is installed and I get Usage: i-adhore [configuration file]. But it only runs inside the i-adhore/build/src folder, so wgd syn does not find i-adhore. That's why I asked whether there is a command or parameter to add the i-adhore path. So is it possible to set the path in the .py file or with a command?
Thanks!
Hi, did you try a command like export PATH="$PATH:/absolute/path/to/i-adhore/build/src" to add the folder containing the binary to your PATH environment variable? Note that the entry has to be the folder, not the i-adhore binary itself, and an absolute path is safer than a relative one. So far there is no option for users to give the path to the i-adhore binary, but it might be a good suggestion, given that i-adhore has repeatedly drawn complaints about being hard to install.
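One detail worth checking with the export approach: each PATH entry is treated as a directory to search, so pointing an entry at the binary file itself will not make it discoverable. Here is a small sketch that illustrates this on a POSIX system with a throwaway fake executable; `path_with` and `fake-tool` are hypothetical names used only for the demonstration.

```python
import os
import shutil
import stat
import tempfile


def path_with(directory, current=None):
    """Return a PATH string with `directory` appended to `current`
    (defaults to the PATH of the running process)."""
    current = os.environ.get("PATH", "") if current is None else current
    return os.pathsep.join(p for p in (current, directory) if p)


with tempfile.TemporaryDirectory() as bin_dir:
    # Create a tiny executable script standing in for the real binary
    fake = os.path.join(bin_dir, "fake-tool")
    with open(fake, "w") as handle:
        handle.write("#!/bin/sh\necho ok\n")
    os.chmod(fake, os.stat(fake).st_mode | stat.S_IXUSR)

    # Entry is the containing directory: the tool can be looked up
    found_via_dir = shutil.which("fake-tool", path=path_with(bin_dir, current=""))
    # Entry is the file itself: the lookup finds nothing
    found_via_file = shutil.which("fake-tool", path=path_with(fake, current=""))

print("lookup with directory entry:", found_via_dir)
print("lookup with file entry:", found_via_file)
```

For a batch job, the export line would also need to be in the job script (or a shell startup file that the job reads), since a PATH change made in one interactive shell is not necessarily inherited elsewhere.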
Hi, thanks for your advice. I tried it and I think it works:
But I still got the same error:
Hi, apparently the node on which your job ran didn't have i-adhore properly installed, but your local node did. I think it has something to do with your HPC system. Did you try to run wgd syn locally, by which I mean without submitting it to a compute node?
Yes, I got the same error again :(
Hi, I have just added a parameter for the path to the i-adhore executable in wgd syn. You can install the latest commit from this repository and try the command below.
$ wgd syn -f transcript -a ID 20628_8.cds.fasta.tsv.txt 20628-8.gff -o test_path-iadhore --pathiadhore /i-adhore-3.0.01/build/i-adhore/bin/i-adhore
Hi, what great news! Many thanks for your solution! I reinstalled 2.0.38; did I do something wrong?
Hi, you can install wgd from source using the commands below.
git clone https://github.com/heche-psb/wgd
cd wgd
virtualenv -p=python3 ENV  # or: python3 -m venv ENV
source ENV/bin/activate
pip install numpy==1.19.0
pip install -r requirements.txt
pip install .
It's done! Thank you so much for everything and forgive me if I tired you :))
Hi, I wrote before about another step. I have successfully finished the others, so thank you for your help!
A small question about the syn step: I have a gff3 file from Augustus, but when I run the command:
wgd syn /wgd_dmd/sample.cds.fasta.tsv /Augustus/sample.gff
My gff3 file looks like:
I got an error like this:
Actually, the same cds.fasta was the input for both the gff and wgd. I don't understand it, so do you have any suggestions for getting the right gff3 file?
Many thanks for your time and help!
İlayda