Closed ilaydagulmez closed 3 weeks ago
Hi, there seems to be something unusual in your sequence file 20628_9.cds.fasta. Could you check the temporary directory for the diamond results?
Did you mean that I should run diamond on each fasta file separately before running dmd?
Hi, I mean could you check the temporary directory for 20628_9.cds.fasta, given its error message? There might be something unusual there.
Every tmp directory has the same files for each fasta file.
The fact that your run didn't call diamond properly is probably due to the same cause as your last issue with i-adhore. Could you run wgd dmd 20628_9.cds.fasta successfully?
Yes, wgd dmd 20628_9.cds.fasta completed successfully before.
Hi, could you try your original command with --globalmrbh but without 20628_9.cds.fasta?
The same error occurred with another fasta file, even though it had previously worked fine on its own.
I see. Could you share your data with me? I will try to reproduce your error.
Thanks for your kindness and help. Here are the cds data for my five individual species. There is also another unsolved question: wgd syn succeeded before this pipeline, but its output still did not work with wgd peak. Again, without the syn output files, peak does work.
https://transfer.adttemp.com.br/rGm9q/transfersh-58772.zip
Hi, yes, wgd peak works with the whole paranome too. I guess your failed run with globalmrbh is due to the large size of the genome files being used. A quick suggestion: could you increase the memory for the globalmrbh job and run it again?
Hi, should I increase the memory by adding a parameter to dmd? I ask because I didn't see such an option.
Hi, it should be set in your job script when you submit the job to a compute node of your HPC. wgd doesn't have options to set the memory for jobs.
Hi, I tried with a high core count and got the same error.
How much memory did you give?
The partition has 28 cores and 128 GB memory. I intended to run globalmrbh because wgd peak couldn't be generated with the syn output. Perhaps there's no need to run a global analysis after all if the peak problem could be solved. (https://github.com/heche-psb/wgd/issues/37)
This is what I got using your data, using 20Gb memory and 1 thread.
2024-06-08 20:44:15 INFO This is wgd v2.0.37 cli.py:34
INFO Checking cores and threads... core.py:35
INFO The number of logical CPUs/Hyper core.py:36
Threading in the system: 24
INFO The number of physical cores in the core.py:37
system: 6
INFO The number of actually usable CPUs in core.py:38
the system: 2
INFO Checking memory... core.py:40
INFO Total physical memory: 251.3222 GB core.py:41
INFO Available memory: 221.4534 GB core.py:42
INFO Free memory: 40.6646 GB core.py:43
2024-06-08 20:52:52 INFO tmpdir = cli.py:125
wgdtmp_0e5b0d7b-a558-4f9c-9f65-177b6da58
07f for 20628-1.cds.fasta
INFO tmpdir = cli.py:125
wgdtmp_92941af3-a3ca-485d-b9a5-03e0c2c4a
e11 for 20628-8.cds.fasta
INFO tmpdir = cli.py:125
wgdtmp_cd78a9ae-b303-4d37-9da8-b2be38bec
235 for 20628-9.cds.fasta
INFO tmpdir = cli.py:125
wgdtmp_707bf0ec-6937-42b9-9afa-2de3139d6
2b9 for 20896-1.cds.fasta
INFO tmpdir = cli.py:125
wgdtmp_950f4072-9cd2-4465-84cd-34e558e1d
787 for 20896-2.cds.fasta
INFO Multiple cds files: will compute core.py:875
globalMRBH orthologs or cscore-defined
homologs regardless of focal species
INFO Note that setting the number of threads core.py:879
as 10 is the most efficient
INFO 20628-1.cds.fasta vs. 20628-8.cds.fasta core.py:848
2024-06-08 21:14:27 INFO Normalization between 20628-1.cds.fasta core.py:406
& 20628-8.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-08 21:37:38 INFO 20628-1.cds.fasta vs. 20628-9.cds.fasta core.py:848
2024-06-08 21:56:40 INFO Normalization between 20628-1.cds.fasta core.py:406
& 20628-9.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-08 22:19:20 INFO 20628-1.cds.fasta vs. 20896-1.cds.fasta core.py:848
2024-06-08 22:49:05 INFO Normalization between 20628-1.cds.fasta core.py:406
& 20896-1.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-08 23:41:54 INFO 20628-1.cds.fasta vs. 20896-2.cds.fasta core.py:848
2024-06-09 00:08:09 INFO Normalization between 20628-1.cds.fasta core.py:406
& 20896-2.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-09 00:37:19 INFO 20628-8.cds.fasta vs. 20628-9.cds.fasta core.py:848
2024-06-09 00:55:26 INFO Normalization between 20628-8.cds.fasta core.py:406
& 20628-9.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-09 01:17:44 INFO 20628-8.cds.fasta vs. 20896-1.cds.fasta core.py:848
2024-06-09 01:47:32 INFO Normalization between 20628-8.cds.fasta core.py:406
& 20896-1.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-09 02:43:37 INFO 20628-8.cds.fasta vs. 20896-2.cds.fasta core.py:848
2024-06-09 03:10:32 INFO Normalization between 20628-8.cds.fasta core.py:406
& 20896-2.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-09 03:42:17 INFO 20628-9.cds.fasta vs. 20896-1.cds.fasta core.py:848
2024-06-09 04:09:34 INFO Normalization between 20628-9.cds.fasta core.py:406
& 20896-1.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-09 05:03:57 INFO 20628-9.cds.fasta vs. 20896-2.cds.fasta core.py:848
2024-06-09 05:28:34 INFO Normalization between 20628-9.cds.fasta core.py:406
& 20896-2.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-09 05:58:17 INFO 20896-1.cds.fasta vs. 20896-2.cds.fasta core.py:848
2024-06-09 06:32:42 INFO Normalization between 20896-1.cds.fasta core.py:406
& 20896-2.cds.fasta
INFO 100 bins & upper 5% hits in linear core.py:407
regression
2024-06-09 07:05:10 INFO Total run time: 620.91 minutes core.py:1637
INFO Done core.py:1638
The command is:
wgd dmd --globalmrbh -o wgd_globalmrbh_2 20628-1.cds.fasta 20628-8.cds.fasta 20628-9.cds.fasta 20896-1.cds.fasta 20896-2.cds.fasta -n 1
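The log above lists ten pairwise comparisons for the five cds files, which matches the note that 10 threads is the most efficient setting. As a minimal sketch of where that number comes from, the snippet below enumerates all unordered pairs of the five input files with Python's standard library. The function name pairwise_comparisons is hypothetical and is not part of wgd; the idea that each pair could occupy one thread is an assumption based on the log message, not on the wgd source.

```python
from itertools import combinations


def pairwise_comparisons(files):
    """Return every unordered pair of input files (hypothetical helper, not wgd code)."""
    return list(combinations(files, 2))


# File names taken from the command above.
files = [
    "20628-1.cds.fasta",
    "20628-8.cds.fasta",
    "20628-9.cds.fasta",
    "20896-1.cds.fasta",
    "20896-2.cds.fasta",
]

pairs = pairwise_comparisons(files)
for a, b in pairs:
    print(f"{a} vs. {b}")

# n files give n * (n - 1) / 2 pairs.
print(f"{len(pairs)} pairwise comparisons")
```

With five files this gives 5 * 4 / 2 = 10 pairs, in the same order as the "vs." lines in the log.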
Hi, thanks for your help. I tried to get a more effective gene prediction and CDS file, and ran dmd from the beginning for just one file, but got the error:
File: helixer.fasta.txt
Thanks.
Okay, I got it: it happens when headers are the same. Solved!
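For anyone hitting the same problem, here is a minimal sketch for spotting repeated headers in a FASTA file before running dmd. It assumes the sequence ID is the first whitespace-delimited word after ">", which may not be exactly how wgd or diamond reads headers; the function name duplicate_fasta_ids is hypothetical and not part of wgd.

```python
import sys
from collections import Counter


def duplicate_fasta_ids(lines):
    """Return the sorted sequence IDs that appear in more than one header line.

    Assumption: the ID is the first whitespace-delimited word after '>'.
    """
    ids = (
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    )
    counts = Counter(ids)
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Usage: python check_headers.py file1.fasta [file2.fasta ...]
    for path in sys.argv[1:]:
        with open(path) as handle:
            dups = duplicate_fasta_ids(handle)
        if dups:
            print(f"{path}: {len(dups)} duplicated IDs, e.g. {dups[:5]}")
        else:
            print(f"{path}: no duplicated IDs")
```

If duplicated IDs are reported, renaming the headers so that each one is unique should avoid the clash described above.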
Hi, I tried wgd dmd --globalmrbh for my 5 individual species. I got the error like this:
My commands are like this:
wgd dmd --globalmrbh 20628_8.cds.fasta 20628_9.cds.fasta 20628_1.cds.fasta 20896_1.cds.fasta 20896_2.cds.fasta -o global
Any suggestions for this error? Thanks for your time.