Xinglab / espresso

strange error ESPRESSO_S.pl #52

Closed Yumy526 closed 4 months ago

Yumy526 commented 5 months ago

The command and error are:

perl ESPRESSO_S.pl -L samples.tsv -F mm10.fa -A mm10.ncbiRefSeq.gtf -O result
[Wed Apr 24 22:07:14 2024] Loading reference
Out of memory!
Perl exited with active threads:
        5 running and unjoined
        0 finished and unjoined
        0 running and detached

My samples.tsv:

twocellsingle/diymatrix/sam/2cell_10A.sam 2cell_10
twocellsingle/diymatrix/sam/2cell_10ABsam 2cell_10

The SAM files were generated from sorted BAM files.

Everything looks OK, but the error still occurs. I look forward to your reply.

EricKutschera commented 5 months ago

I think the Out of memory! error can happen if an old version of the Storable package is used. Here's a similar issue: https://github.com/Xinglab/espresso/issues/30#issuecomment-1822918509

Yumy526 commented 5 months ago

Then why was there no problem when I tried it with the reference data you provided? If it succeeds with the reference data, does that rule out the Storable version being too old? Or could this error still be caused by a version that is too old?

Yumy526 commented 5 months ago

I have now updated Storable to the required version, but the out of memory error still happens. The error is:

[Thu Apr 25 19:26:45 2024] Calculating how to assign 30 files into 5 threads
[Thu Apr 25 19:26:45 2024] Loading reference
Worker 0 begins to scan:
 twocellsingle/diymatrix/sam/2cell_5A.sam
 twocellsingle/diymatrix/sam/2cell_4B.sam
 twocellsingle/diymatrix/sam/2cell_15B.sam
 twocellsingle/diymatrix/sam/2cell_15A.sam
 twocellsingle/diymatrix/sam/2cell_8A.sam
 twocellsingle/diymatrix/sam/2cell_6A.sam
Worker 1 begins to scan:
 twocellsingle/diymatrix/sam/2cell_1A.sam
 twocellsingle/diymatrix/sam/2cell_10B.sam
 twocellsingle/diymatrix/sam/2cell_16B.sam
 twocellsingle/diymatrix/sam/2cell_17B.sam
 twocellsingle/diymatrix/sam/2cell_13A.sam
 twocellsingle/diymatrix/sam/2cell_1B.sam
Worker 2 begins to scan:
 twocellsingle/diymatrix/sam/2cell_13B.sam
 twocellsingle/diymatrix/sam/2cell_7B.sam
 twocellsingle/diymatrix/sam/2cell_3A.sam
 twocellsingle/diymatrix/sam/2cell_2A.sam
 twocellsingle/diymatrix/sam/2cell_7A.sam
 twocellsingle/diymatrix/sam/2cell_3B.sam
Out of memory!
Worker 3 begins to scan:
 twocellsingle/diymatrix/sam/2cell_5B.sam
 twocellsingle/diymatrix/sam/2cell_11A.sam
 twocellsingle/diymatrix/sam/2cell_14A.sam
 twocellsingle/diymatrix/sam/2cell_8B.sam
 twocellsingle/diymatrix/sam/2cell_4A.sam
 twocellsingle/diymatrix/sam/2cell_14B.sam
Out of memory!

Yumy526 commented 5 months ago

I sincerely ask for your help and hope for your reply. Thanks very much.

EricKutschera commented 5 months ago

It looks like the command made it further after you updated the Storable version. It could be that the command actually is using up all available memory. Can you watch the memory usage in htop (or another way) while the command is running? What versions of ESPRESSO, perl, and Storable are you using?
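
One way to gather the information requested above is sketched here. It assumes perl is on the PATH and, for the memory part, a Linux system with the `free` utility. The function names `report_versions` and `watch_memory` are made up for illustration, and the thread does not show a command for printing the ESPRESSO version, so that is left out.

```shell
# Hypothetical helper: print the versions of perl and the Storable module.
report_versions() {
  # $^V holds the version of the running perl interpreter.
  perl -e 'print "perl ", $^V, "\n"'
  # $Storable::VERSION is available once the module has been loaded with -M.
  perl -MStorable -e 'print "Storable $Storable::VERSION\n"'
}

# Hypothetical helper: print a memory snapshot every few seconds while the
# ESPRESSO command runs in another terminal (stop with Ctrl-C).
watch_memory() {
  interval="${1:-5}"   # seconds between snapshots, 5 by default
  while true; do
    free -h
    sleep "$interval"
  done
}
```

Running `report_versions` once and `watch_memory` in a second terminal during the S step would show whether memory really fills up before the "Out of memory!" message appears.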

Yumy526 commented 5 months ago

Thank you for your attention. I solved that problem by using only one thread. But another problem is bothering me now: I've tried many times and always get the error below. Is the root cause my computer or the server?

The command and error:

$ perl /public/home/ymwang/tools/espresso_v_1_4_0/src/ESPRESSO_C.pl -I /public/home/ymwang/PacBio/7_isoform/result/4 -F /public/home/ymwang/PacBio/6_bambu/mm10.fa -X 0 -T 8
[Mon Apr 29 04:34:09 2024] Loading splice junction info
[Mon Apr 29 04:34:23 2024] Requesting system to split SAMLIST into 8 pieces
 Divided SAM(LIST) sizes:
 sam.list3aa           67780265
 sam.list3ab           67780265
 sam.list3ac           67780265
 sam.list3ad           67780265
 sam.list3ae           67780265
 sam.list3af           67780265
 sam.list3ag           67780265
 sam.list3ah           67780259
 SAM(LIST) was divided successfully.
 First group of divided SAM(LIST) files:
 sam.list3ah: 16924
 sam.list3ag: 14329
 sam.list3af: 11150
 sam.list3ae: 7948
 sam.list3ad: 7458
 sam.list3ac: 4772
 sam.list3ab: 2536
 sam.list3aa: 0
 First reads were recorded successfully for all 8 files.
[Mon Apr 29 04:34:23 2024] Loading references
[Mon Apr 29 04:34:49 2024] Scanning SAMLIST by 4 workers
 Worker 1 begins to scan sam.list3aa.
0       0
5       0
6       0
7       0
9       10
10      0
11      0
12      11
14      16
16      0
17      0
19      12

Building a new DB, current time: 04/29/2024 04:34:54
New DB name:   /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_19/current_db
New DB title:  current_db
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 11 sequences in 0.00110912 seconds.

 Worker 2 begins to scan sam.list3ab.
2536    0
2537    0
2540    1
2541    57

..........

Building a new DB, current time: 04/29/2024 04:42:31
New DB name:   /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_8064/current_db
New DB title:  current_db
Sequence type: Nucleotide
Deleted existing Nucleotide BLAST database named /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_8064/current_db
Keep MBits: T
Maximum file size: 1000000000B
187     1
Adding sequences from FASTA; added 2 sequences in 0.0284631 seconds.

Thread 3 terminated abnormally: Failed to run sort --buffer-size=2G --numeric-sort --output=/public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa.tmp. Exit code is -1 at /public/home/ymwang/tools/espresso_v_1_4_0/src/ESPRESSO_C.pl line 1981.
188     6

..........

Building a new DB, current time: 04/29/2024 04:42:54
New DB name:   /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_14352/current_db
New DB title:  current_db
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 1 sequences in 0.000335932 seconds.

 Worker 8 begins to scan sam.list3ah.
Terminating worker 8.
Terminating worker 7.
Terminating worker 5.
Terminating worker 4.
Terminating worker 2.
Terminating worker 1.
Worker 4 not responding.
Exiting due to error in worker thread
Perl exited with active threads:
        1 running and unjoined
        0 finished and unjoined
        0 running and detached

There are two things I would like to ask about: how should they be specified? I tried, but it didn't work.

EricKutschera commented 5 months ago

The main error looks like:

Thread 3 terminated abnormally: Failed to run sort --buffer-size=2G --numeric-sort --output=/public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa.tmp. Exit code is -1 at /public/home/ymwang/tools/espresso_v_1_4_0/src/ESPRESSO_C.pl line 1981.

https://github.com/Xinglab/espresso/blob/v1.4.0/src/ESPRESSO_C.pl#L1981

I'm not sure why sort would fail. You could try running the command directly from a terminal to see if you get a better error message: sort --buffer-size=2G --numeric-sort --output=/public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa.tmp

Based on your previous posts the issue might be running out of memory. From the README the estimate for memory usage in the C step with 8 threads is at least 32GB: https://github.com/Xinglab/espresso/tree/v1.4.0?tab=readme-ov-file#usage
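
To follow the suggestion above of running sort by hand, something like this sketch could be used. It assumes GNU sort, since the long options in the failing command are GNU coreutils options. The function name `run_sort_check` and its optional third argument for the buffer size are made up for illustration; trying a smaller buffer is only a diagnostic idea, not a fix confirmed anywhere in this thread.

```shell
# Hypothetical helper: rerun the sort step by hand and report its exit status.
# Usage: run_sort_check INPUT OUTPUT [BUFFER_SIZE]
run_sort_check() {
  in_file="$1"
  out_file="$2"
  buffer="${3:-2G}"   # 2G matches the failing command quoted in the thread
  # Any error message from sort goes to stderr and stays visible in the terminal.
  sort --buffer-size="$buffer" --numeric-sort --output="$out_file" "$in_file"
  status=$?
  echo "sort exit status: $status"
  return "$status"
}
```

For the failing file this would be called as `run_sort_check /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa.tmp /public/home/ymwang/PacBio/7_isoform/result/4/0/blast_4983/SJ_group.fa`. If that succeeds on its own while the threaded run fails, that would be consistent with memory pressure from the other workers.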

Yumy526 commented 5 months ago

Thank you so much. I've already run one sample, but I still have target IDs 1 to 30 to run. How should this be expressed in the command? Maybe something like this?

perl /public/home/ymwang/tools/espresso_v_1_4_0/src/ESPRESSO_C.pl -I /public/home/ymwang/PacBio/7_isoform/testmy/ -F /public/home/ymwang/PacBio/6_bambu/mm10.fa -X 1,2,3,4,..,30 -T 5

I would like to know how to pass all 30 target IDs to the -X parameter at once. Or do I have to run the command 30 times, each with a single ID? I tried that and it was a bit of a hassle. Looking forward to hearing from you.

EricKutschera commented 5 months ago

From https://github.com/Xinglab/espresso/tree/v1.4.0?tab=readme-ov-file#basic-usage

If there are multiple inputs then ESPRESSO_C needs to be run once for each input. The -X parameter identifies which input is being processed (-X 0, -X 1, ...)

You could make a script to run each -X value sequentially. Also, there's a snakemake workflow that can run all steps of ESPRESSO: https://github.com/Xinglab/espresso/tree/v1.4.0/snakemake#espresso-snakemake

You can edit the snakemake profile config to run the jobs locally instead of submitting to a scheduler: https://github.com/Xinglab/espresso/blob/v1.4.0/snakemake/snakemake_profile/config.yaml#L1
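
A script that runs each -X value sequentially, as suggested above, might look like the following sketch. The paths and thread count are copied from the command posted earlier in this thread. `NUM_INPUTS=30` and the numbering 0 to 29 are assumptions based on the 30 input files and the "-X 0, -X 1, ..." wording in the README, so check them against the numbering the S step actually produced. The function name `run_all_inputs` is made up for illustration.

```shell
# Paths and thread count copied from the thread; adjust for your system.
ESPRESSO_C=/public/home/ymwang/tools/espresso_v_1_4_0/src/ESPRESSO_C.pl
WORK_DIR=/public/home/ymwang/PacBio/7_isoform/testmy/
REF_FA=/public/home/ymwang/PacBio/6_bambu/mm10.fa
THREADS=5
NUM_INPUTS=30   # assumption: inputs are numbered 0..29

# Hypothetical helper: run ESPRESSO_C once per input, one after another.
# Pass "echo" as the first argument to print the commands without running them.
run_all_inputs() {
  runner="${1:-}"
  x=0
  while [ "$x" -lt "$NUM_INPUTS" ]; do
    # Stop at the first failing input instead of continuing silently.
    $runner perl "$ESPRESSO_C" -I "$WORK_DIR" -F "$REF_FA" -X "$x" -T "$THREADS" || return 1
    x=$((x + 1))
  done
}
```

`run_all_inputs echo` prints the 30 commands for review, and `run_all_inputs` with no argument runs them. Running the inputs one at a time keeps peak memory at the level of a single C-step run, which matters given the memory errors earlier in the thread.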

Yumy526 commented 4 months ago

Thank you very much for your kind reply. I really appreciate it.