richelbilderbeek closed this issue 4 years ago
Let's see if a merge works:
p230198@peregrine:bbbq_1 sbatch run_long_r_script.sh merge_all_counts.R
Submitted batch job 13570125
p230198@peregrine:bbbq_1 cd ../../peregrine/scripts/
p230198@peregrine:scripts sbatch --dependency=afterany:13570125 email_me.sh
Submitted batch job 13570127
p230198@peregrine:bbbq_1 cat $(ls | egrep "13570125")
Rscript merge_all_counts.R
/var/spool/slurmd/job13570125/slurm_script: line 17: 19056 Killed Rscript "$@"
slurmstepd: error: Detected 1 oom-kill event(s) in step 13570125.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
###############################################################################
Peregrine Cluster
Job 13570125 for user 'p230198'
Finished at: Tue Sep 8 08:32:57 CEST 2020
Job details:
============
Name : run_long_r_script
User : p230198
Partition : regular
Nodes : pg-node210
Cores : 1
State : OUT_OF_MEMORY
Submit : 2020-09-08T08:19:42
Start : 2020-09-08T08:20:06
End : 2020-09-08T08:32:57
Reserved walltime : 10-00:00:00
Used walltime : 00:12:51
Used CPU time : 00:02:26 (efficiency: 19.05%)
% User (Computation): 80.98%
% System (I/O) : 19.02%
Mem reserved : 1G/node
Max Mem used : 883.90M (pg-node210)
Max Disk Write : 112.64K (pg-node210)
Max Disk Read : 1.82M (pg-node210)
Acknowledgements:
=================
Please see this page for information about acknowledging Peregrine in your publications:
https://wiki.hpc.rug.nl/peregrine/additional_information/scientific_output
################################################################################
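The log above shows the memory failure in two places: the `oom-kill` message from `slurmstepd` and the `OUT_OF_MEMORY` job state. As a minimal sketch, a log file could be checked for either marker like this. The function name `was_oom_killed` is hypothetical and not part of this repository:

```shell
# Return success (exit status 0) if a Slurm log file reports an
# out-of-memory kill, using the two markers seen in the log above.
# 'was_oom_killed' is a hypothetical helper name.
was_oom_killed() {
  grep -q -e 'oom-kill' -e 'OUT_OF_MEMORY' "$1"
}

# Usage (replace <logfile> with the job's output file):
#   was_oom_killed <logfile> && echo "Job ran out of memory"
```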
Use a bash script that requests 10 GB of memory (instead of 1 GB):
p230198@peregrine:bbbq_1 sbatch run_long_hard_r_script.sh merge_all_counts.R
Submitted batch job 13572852
p230198@peregrine:bbbq_1 cd ../../peregrine/scripts
p230198@peregrine:scripts sbatch --dependency=afterany:13572852 email_me.sh
Submitted batch job 13572877
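The contents of `run_long_hard_r_script.sh` are not shown in this issue. The sketch below assumes it differs from the 1 GB script only in its `--mem` request. The file name `run_r_script_10g_sketch.sh` is hypothetical, and the other values (partition, walltime, one core) are copied from the job details above; the real script may set them differently:

```shell
# Write a hypothetical job script; the real run_long_hard_r_script.sh may differ.
cat > run_r_script_10g_sketch.sh <<'EOF'
#!/bin/bash
#SBATCH --time=10-00:00:00
#SBATCH --partition=regular
#SBATCH --ntasks=1
#SBATCH --mem=10G
#SBATCH --job-name=run_long_hard_r_script

# Echo the command, then run the R script given as the first argument
echo "Rscript $*"
Rscript "$@"
EOF
```

It would be submitted as `sbatch run_r_script_10g_sketch.sh merge_all_counts.R`. Options given on the `sbatch` command line take precedence over `#SBATCH` lines, so `sbatch --mem=10G run_long_r_script.sh merge_all_counts.R` should be an alternative to maintaining a second script.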
10 GB is not enough, the job runs out of memory again after 3 hours and 43 minutes:
p230198@peregrine:~ jobinfo 13572852
Name : run_long_hard_r_script
User : p230198
Partition : regular
Nodes : pg-node207
Cores : 1
State : OUT_OF_MEMORY
Submit : 2020-09-08T09:20:56
Start : 2020-09-08T09:36:34
End : 2020-09-08T13:19:42
Reserved walltime : 10-00:00:00
Used walltime : 03:43:08
Used CPU time : 00:28:23 (efficiency: 12.72%)
% User (Computation): 65.42%
% System (I/O) : 34.58%
Mem reserved : 10G/node
Max Mem used : 10.00G (pg-node207)
Max Disk Write : 112.64K (pg-node207)
Max Disk Read : 1.82M (pg-node207)
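The two numbers that matter in the `jobinfo` output above are the memory used versus reserved and the CPU efficiency. As a rough sketch, both can be pulled out or recomputed with `awk`. The helper names `mem_summary`, `to_seconds` and `cpu_efficiency` are hypothetical, and the parsing assumes the `Label : value` layout shown above:

```shell
# Print reserved and peak memory from jobinfo-style text on stdin.
mem_summary() {
  awk -F' *: +' '
    /Mem reserved/ { reserved = $2 }
    /Max Mem used/ { split($2, parts, " "); used = parts[1] }
    END { printf "reserved=%s used=%s\n", reserved, used }
  '
}

# Convert a Slurm duration ([D-]HH:MM:SS) to seconds.
to_seconds() {
  echo "$1" | awk -F'[-:]' '{
    if (NF == 4) print $1 * 86400 + $2 * 3600 + $3 * 60 + $4
    else         print $1 * 3600 + $2 * 60 + $3
  }'
}

# CPU efficiency in percent: used CPU time divided by used walltime.
cpu_efficiency() {
  awk -v cpu="$(to_seconds "$1")" -v wall="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", 100 * cpu / wall }'
}

# Values from job 13572852 above
cpu_efficiency 00:28:23 03:43:08   # prints 12.72, the efficiency reported above
```

With `jobinfo 13572852 | mem_summary`, a peak equal to the reservation (10.00G used of 10G reserved) suggests the job hit its memory limit rather than finishing under it.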
Now with 100 GB:
p230198@peregrine:bbbq_1 sbatch run_long_hard_r_script.sh merge_all_counts.R
Submitted batch job 13583764
p230198@peregrine:bbbq_1 cd ../../peregrine/scripts
p230198@peregrine:scripts sbatch --dependency=afterany:13583764 email_me.sh
Submitted batch job 13583765
Still pending after 22 hours:
p230198@peregrine:~ jobinfo 13583764
Name : run_long_hard_r_script
User : p230198
Partition : regular
Nodes : None assigned
Cores : 1
State : PENDING (Priority) ((null))
Submit : 2020-09-08T13:27:59
Start : --
End : --
Reserved walltime : 10-00:00:00
Used walltime : --
Used CPU time : --
% User (Computation): --
% System (I/O) : --
Mem reserved : 100G/node
Max Mem used : 0.00 ()
Max Disk Write : 0.00 ()
Max Disk Read : 0.00 ()
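For a job that stays in `PENDING (Priority)`, Slurm can report the scheduler's estimated start time. A small sketch, assuming the standard Slurm client tools are available on the login node; the wrapper name `pending_info` is hypothetical:

```shell
# Show the pending reason and estimated start time of a job.
# Assumes the Slurm client tools are on the PATH; prints a note otherwise.
pending_info() {
  local job_id="$1"
  if ! command -v squeue >/dev/null 2>&1; then
    echo "squeue not found: run this on a cluster login node"
    return 1
  fi
  # --start adds the scheduler's estimated start time (may be N/A)
  squeue --start -j "$job_id"
}

# Usage:
#   pending_info 13583764
```

The estimate can shift as other jobs are submitted or finish, so it is only an indication.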
MHCnuggets SARS-CoV-2 values:
Merging all values takes a little over 3 hours (used walltime: 03:22:39):
p230198@peregrine:bbbq_1 cat $(ls | egrep 13583764)
Rscript merge_all_counts.R
Warning message:
`data_frame()` is deprecated as of tibble 1.1.0.
Please use `tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.
###############################################################################
Peregrine Cluster
Job 13583764 for user 'p230198'
Finished at: Wed Sep 9 18:01:13 CEST 2020
Job details:
============
Name : run_long_hard_r_script
User : p230198
Partition : regular
Nodes : pg-node022
Cores : 1
State : RUNNING
Submit : 2020-09-08T13:27:59
Start : 2020-09-09T14:38:34
End : --
Reserved walltime : 10-00:00:00
Used walltime : 03:22:39
Used CPU time : --
% User (Computation): --
% System (I/O) : --
Mem reserved : 100G/node
Max Mem used : 0.00 ()
Max Disk Write : 0.00 ()
Max Disk Read : 0.00 ()
Acknowledgements:
=================
Please see this page for information about acknowledging Peregrine in your publications:
https://wiki.hpc.rug.nl/peregrine/additional_information/scientific_output
################################################################################
This is done, in bbbq_1.
Because it uses MHCnuggets, which will be looked at in another paper, this issue will be closed.
The SARS-CoV-2 runs are done:
There are 14 proteins in the reference proteome and 13 + 21 = 34 haplotypes, so 14 × 34 = 476.
Time to analyse these :+1:
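The count above can be double-checked with shell arithmetic. The variable names are only illustrative; this issue does not name the two haplotype groups, so they are labelled `a` and `b` here:

```shell
n_proteins=14
n_haplotypes_a=13   # first haplotype group (not named in this issue)
n_haplotypes_b=21   # second haplotype group (not named in this issue)

n_haplotypes=$(( n_haplotypes_a + n_haplotypes_b ))
n_total=$(( n_proteins * n_haplotypes ))

echo "$n_proteins proteins x $n_haplotypes haplotypes = $n_total"
# prints: 14 proteins x 34 haplotypes = 476
```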