Closed. ZhaoHang-bio closed this issue 1 month ago.
Hi, how many genes are there in your two genome files, and did you try running it with only 1 thread? Just in case memory or multiple threads got stuck somewhere on your HPC.
Thank you for your reply. I haven't tried using only one thread yet. Will the program run faster that way? I can give it a try. The program does indeed seem to be stuck; the output file has not been updated for a long time.
I tried to run this command with one thread, but it got stuck at the same step. Here is the last part of the output content:
INFO tmpdir = wgdtmp_94666321-3546-4b24-8089-966efd788bd7 for F1.longestcds.fasta cli.py:125
INFO tmpdir = wgdtmp_e5383a39-e5e5-4782-a5f9-8c8fbc57b151 for F2.longestcds.fasta cli.py:125
INFO Multiple cds files: will compute RBH orthologs cli.py:152
INFO Note that setting the number of threads as 1 is the most efficient cli.py:153
2024-09-26 16:02:21 INFO F1.longestcds.fasta vs. F2.longestcds.fasta core.py:869
Hi, how many genes are there in your two genome files?
F1 file has 47k genes, and F2 file has 37k genes.
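(As a quick sanity check, the gene counts can be verified by counting FASTA header lines; the filenames below are the ones from the command in this thread:)

```shell
# Each FASTA record starts with a '>' header line, so counting
# headers gives the number of sequences (genes) in each file.
grep -c '^>' F1.longestcds.fasta
grep -c '^>' F2.longestcds.fasta
```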
Hi, have you tried running it without the "--cds" flag? The issue is likely with your cds files. Please attach your full log.
Okay, I have removed the --cds option. The current command line is "wgd dmd F-GA-02.correctcds.fasta F-GA-20.correctcds.fasta -e 1e-8 -o 01-F-GA-02_F-GA-20_out". Here is the running log:
[2024-09-26 17:54:42][INFO][4837193.login3] In prologue, JOB_ID is 4837193.login3
[2024-09-26 17:54:42][INFO][4837193.login3] In prologue, the pbs_server_name is login3.
login3
[2024-09-26 17:54:59][INFO][4837193.login3] ######################## Job 4837193.login3 start to execute pre.sh! ##########################
[2024-09-26 17:54:59][INFO][4837193.login3] Goldenable & storageenable both are false, job ran without making reserve!
Running on host comput107
PID 171353
Start Time is Thu Sep 26 17:54:44 CST 2024
Directory is /vol3/agis/chengshifeng_group/zhaohang/0--C4/000--Chr/F-GA/08--nucmer/12-wgd/wf2-SARFL
-------------------------------------------------------------------
2024-09-26 17:54:51 INFO This is wgd v2.0.38 cli.py:34
INFO Checking cores and threads... core.py:35
INFO The number of logical CPUs/Hyper Threading in the system: 40 core.py:36
INFO The number of physical cores in the system: 20 core.py:37
INFO The number of actually usable CPUs in the system: 40 core.py:38
INFO Checking memory... core.py:40
INFO Total physical memory: 502.3862 GB core.py:41
INFO Available memory: 440.6701 GB core.py:42
INFO Free memory: 441.5800 GB core.py:43
2024-09-26 17:55:18 INFO tmpdir = wgdtmp_f39cd9f8-9302-4da5-b1f9-b5b67059808b for F-GA-02.correctcds.fasta cli.py:125
INFO tmpdir = wgdtmp_55379bf0-32a4-4bac-8ae8-4449947724d5 for F-GA-20.correctcds.fasta cli.py:125
INFO Multiple cds files: will compute RBH orthologs cli.py:152
INFO Note that setting the number of threads as 1 is the most efficient cli.py:153
2024-09-26 17:55:22 INFO F-GA-02.correctcds.fasta vs. F-GA-20.correctcds.fasta core.py:869
And I found that there is an output file F-GA-02.correctcds.fasta_F-GA-20.correctcds.fasta.tsv in the temporary directory, but the sequence IDs have been changed and cannot be used directly. Why does it get stuck at the step of producing the output file?
Hi, could you do one more test using the egu1000.fasta and ugi1000.fasta files in the test folder: wgd dmd egu1000.fasta ugi1000.fasta? That way we know whether the issue is with the sequence files or with the system.
Hi, I have run the test files and obtained the output file egu1000.fasta_ugi1000.fasta.rbh.tsv:
ugi1000.fasta egu1000.fasta
UGI.ctg09892.24687.1 Migut.A00145.1
UGI.ctg10665.24762.1 Migut.A00013.1
UGI.ctg09962.24699.1 Migut.A00047.1
UGI.ctg10656.24760.1 Migut.A00180.1
UGI.ctg10758.24771.1 Migut.A00053.1
UGI.ctg12673.24916.1 Migut.A00185.1
UGI.ctg12243.24886.1 Migut.A00392.1
UGI.ctg13622.24985.1 Migut.A00497.1
UGI.ctg17671.25173.1 Migut.A00002.1
UGI.ctg16042.25099.1 Migut.A00302.1
UGI.ctg15695.25079.1 Migut.A00039.1
UGI.ctg12602.24907.1 Migut.A00215.1
UGI.ctg10656.24761.1 Migut.A00253.1
UGI.ctg14150.25014.1 Migut.A00290.1
UGI.ctg16172.25102.1 Migut.A00268.1
However, I also got an error:
/opt/gridview//pbs/dispatcher/mom_priv/jobs/4837977.login3.SC: line 27: 210766 Segmentation fault (core dumped) wgd dmd egu1000.fasta ugi1000.fasta -o 01-test_out
The run log:
-------------------------------------------------------------------
2024-09-27 10:25:34 INFO This is wgd v2.0.38 cli.py:34
INFO Checking cores and threads... core.py:35
INFO The number of logical CPUs/Hyper Threading in the system: 40 core.py:36
INFO The number of physical cores in the system: 20 core.py:37
INFO The number of actually usable CPUs in the system: 40 core.py:38
INFO Checking memory... core.py:40
INFO Total physical memory: 187.4059 GB core.py:41
INFO Available memory: 91.3262 GB core.py:42
INFO Free memory: 87.7809 GB core.py:43
INFO tmpdir = wgdtmp_d62bc600-37e6-423b-ac55-f95ec318edb4 for egu1000.fasta cli.py:125
INFO tmpdir = wgdtmp_fa1a5e76-73a0-4988-85b5-f1e0a4c5d00f for ugi1000.fasta cli.py:125
INFO Multiple cds files: will compute RBH orthologs cli.py:152
INFO Note that setting the number of threads as 1 is the most efficient cli.py:153
2024-09-27 10:25:35 INFO egu1000.fasta vs. ugi1000.fasta core.py:869
2024-09-27 10:25:37 INFO Normalization between egu1000.fasta & ugi1000.fasta core.py:406
INFO 100 bins & upper 5% hits in linear regression core.py:407
INFO The number of hits is less than 100, will aggregate all the hits in one bin core.py:176
-------------------------------------------------------------------
Hi, it's uncommon to get a core-dumped error on such small test data. It likely has something to do with your HPC.
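(One way to narrow this down, as a sketch rather than anything wgd itself provides: inspect the resource limits in effect on the compute node, since HPC schedulers often cap memory, stack, or core-dump size, and hitting a hard cap can surface as a segmentation fault:)

```shell
# Print all resource limits in effect for the job's shell; look for
# low values of 'max memory size', 'stack size', or 'virtual memory'
# imposed by the scheduler.
ulimit -a

# Optionally allow core dumps so the crash can be inspected later
# (only takes effect if the hard limit permits it).
ulimit -c unlimited
```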
Is it normal that my command
wgd dmd --cds F1.longestcds.fasta F2.longestcds.fasta -e 1e-8 -o 01-F1_F2_out -n 20
has been running for more than 23 hours?