richelbilderbeek / bbbq_article

The Bianchi Bilderbeek Bogaart Question answered
GNU General Public License v3.0

bbbq_1: finish SARS-CoV-2 (MHCnuggets) #79

Closed by richelbilderbeek 4 years ago

richelbilderbeek commented 4 years ago

The SARS-CoV-2 runs are done:

p230198@peregrine:bbbq_1 ls | egrep "covid_h" | wc --lines
476

The reference proteome has 14 proteins and there are 13 + 21 = 34 haplotypes, so 14 x 34 = 476 results are expected, which matches the count above.
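
The expected count can be reproduced with a few lines of shell. This is only a sketch: the variable and function names are illustrative, and the `covid_h` pattern is the one used in the listing above.

```shell
#!/bin/bash
# Expected number of result files: one per (protein, haplotype) pair.
n_proteins=14
n_haplotypes=$((13 + 21))
n_expected=$((n_proteins * n_haplotypes))
echo "expected: ${n_expected}"

# Count the entries in directory $1 whose name contains "covid_h".
count_covid_files() {
  ls "$1" | grep -c "covid_h"
}

# Compare the number of files found in directory $1 with the expected count.
check_covid_files() {
  local n_found
  n_found=$(count_covid_files "$1")
  if [ "${n_found}" -eq "${n_expected}" ]; then
    echo "complete: ${n_found} files"
  else
    echo "incomplete: ${n_found} of ${n_expected} files"
  fi
}
```

Calling `check_covid_files .` from inside the results folder would then report whether all runs have produced a file.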

Time to analyse these :+1:

richelbilderbeek commented 4 years ago

Let's see if a merge works:

p230198@peregrine:bbbq_1 sbatch run_long_r_script.sh merge_all_counts.R 
Submitted batch job 13570125
p230198@peregrine:bbbq_1 cd ../../peregrine/scripts/
p230198@peregrine:scripts sbatch --dependency=afterany:13570125 email_me.sh 
Submitted batch job 13570127
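
The job ID in `--dependency=afterany:...` above was copied by hand from the `Submitted batch job` line. As a sketch, the ID can be taken from that line automatically. `job_id_from_sbatch_output` is a hypothetical helper name, and the follow-up command is only printed here, not submitted.

```shell
#!/bin/bash
# Extract the job ID (the last word) from sbatch's
# "Submitted batch job <id>" line.
job_id_from_sbatch_output() {
  echo "$1" | awk '{print $NF}'
}

# On the cluster this text would come from the sbatch call itself;
# here the line from the transcript above is used as sample input.
submit_output="Submitted batch job 13570125"
job_id=$(job_id_from_sbatch_output "${submit_output}")

# Dry run: print the follow-up submission instead of running it.
echo "sbatch --dependency=afterany:${job_id} email_me.sh"
```

Slurm's `sbatch --parsable` prints only the job ID (followed by the cluster name on multi-cluster setups), which would make the parsing step unnecessary.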

richelbilderbeek commented 4 years ago

The merge was killed for running out of memory (1 GB reserved):

p230198@peregrine:bbbq_1 cat $(ls | egrep "13570125")
Rscript merge_all_counts.R
/var/spool/slurmd/job13570125/slurm_script: line 17: 19056 Killed                  Rscript "$@"
slurmstepd: error: Detected 1 oom-kill event(s) in step 13570125.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

###############################################################################
Peregrine Cluster
Job 13570125 for user 'p230198'
Finished at: Tue Sep  8 08:32:57 CEST 2020

Job details:
============

Name                : run_long_r_script
User                : p230198
Partition           : regular
Nodes               : pg-node210
Cores               : 1
State               : OUT_OF_MEMORY
Submit              : 2020-09-08T08:19:42
Start               : 2020-09-08T08:20:06
End                 : 2020-09-08T08:32:57
Reserved walltime   : 10-00:00:00
Used walltime       :    00:12:51
Used CPU time       :    00:02:26 (efficiency: 19.05%)
% User (Computation): 80.98%
% System (I/O)      : 19.02%
Mem reserved        : 1G/node
Max Mem used        : 883.90M (pg-node210)
Max Disk Write      : 112.64K (pg-node210)
Max Disk Read       : 1.82M (pg-node210)

Acknowledgements:
=================

Please see this page for information about acknowledging Peregrine in your publications:

https://wiki.hpc.rug.nl/peregrine/additional_information/scientific_output

################################################################################

richelbilderbeek commented 4 years ago

Use a bash script that reserves 10 GB of memory (instead of 1 GB):

p230198@peregrine:bbbq_1 sbatch run_long_hard_r_script.sh merge_all_counts.R
Submitted batch job 13572852
p230198@peregrine:bbbq_1 cd ../../peregrine/scripts
p230198@peregrine:scripts sbatch --dependency=afterany:13572852 email_me.sh 
Submitted batch job 13572877
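
The contents of the two batch scripts are not shown in this thread. A sketch of an alternative, assuming they differ only in the memory they reserve: pass `--mem` on the `sbatch` command line, where it takes precedence over a `#SBATCH --mem` directive inside the script. The helper name is hypothetical and it only prints the command, so nothing is submitted.

```shell
#!/bin/bash
# Print (not run) an sbatch command that requests memory $1
# for running R script $2 through run_long_r_script.sh.
sbatch_cmd_with_mem() {
  echo "sbatch --mem=$1 run_long_r_script.sh $2"
}

sbatch_cmd_with_mem 10G merge_all_counts.R
```

This keeps a single batch script and makes the memory request visible in the shell history.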

richelbilderbeek commented 4 years ago

10 GB is not enough either. The job used the full 10.00G it had reserved and ended in OUT_OF_MEMORY:

p230198@peregrine:~ jobinfo 13572852
Name                : run_long_hard_r_script
User                : p230198
Partition           : regular
Nodes               : pg-node207
Cores               : 1
State               : OUT_OF_MEMORY
Submit              : 2020-09-08T09:20:56
Start               : 2020-09-08T09:36:34
End                 : 2020-09-08T13:19:42
Reserved walltime   : 10-00:00:00
Used walltime       :    03:43:08
Used CPU time       :    00:28:23 (efficiency: 12.72%)
% User (Computation): 65.42%
% System (I/O)      : 34.58%
Mem reserved        : 10G/node
Max Mem used        : 10.00G (pg-node207)
Max Disk Write      : 112.64K (pg-node207)
Max Disk Read       : 1.82M (pg-node207)
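
Checking for an out-of-memory kill can be scripted by reading the `State` line of the job report. A minimal sketch, assuming the report keeps the `Name : value` layout shown above; the function names are illustrative.

```shell
#!/bin/bash
# Read a job report on stdin and print the first word of its State field.
job_state() {
  awk '$1 == "State" {print $3}'
}

# Read a job report on stdin and print the reserved memory, e.g. "10G/node".
mem_reserved() {
  awk '$1 == "Mem" && $2 == "reserved" {print $4}'
}

# Two lines taken from the report above, used as sample input.
report="State               : OUT_OF_MEMORY
Mem reserved        : 10G/node"

if [ "$(echo "${report}" | job_state)" = "OUT_OF_MEMORY" ]; then
  echo "out of memory with $(echo "${report}" | mem_reserved) reserved"
fi
```

On the cluster the same functions could read the output of `jobinfo` through a pipe instead of the sample text.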

richelbilderbeek commented 4 years ago

Now with 100 GB reserved:

p230198@peregrine:bbbq_1 sbatch run_long_hard_r_script.sh merge_all_counts.R
Submitted batch job 13583764
p230198@peregrine:bbbq_1 cd ../../peregrine/scripts
p230198@peregrine:scripts sbatch --dependency=afterany:13583764 email_me.sh 
Submitted batch job 13583765

richelbilderbeek commented 4 years ago

Still pending after 22 hours:

p230198@peregrine:~ jobinfo 13583764
Name                : run_long_hard_r_script
User                : p230198
Partition           : regular
Nodes               : None assigned
Cores               : 1
State               : PENDING (Priority) ((null))
Submit              : 2020-09-08T13:27:59
Start               : --
End                 : --
Reserved walltime   : 10-00:00:00
Used walltime       : --
Used CPU time       : --
% User (Computation): --
% System (I/O)      : --
Mem reserved        : 100G/node
Max Mem used        : 0.00  ()
Max Disk Write      : 0.00  ()
Max Disk Read       : 0.00  ()

richelbilderbeek commented 4 years ago

MHCnuggets SARS-CoV-2 values:

bbbq_1_covid_only.zip

richelbilderbeek commented 4 years ago

Merging all values takes about 3 hours and 20 minutes (used walltime 03:22:39 when the report was written):

p230198@peregrine:bbbq_1 cat $(ls | egrep 13583764)
Rscript merge_all_counts.R
Warning message:
`data_frame()` is deprecated as of tibble 1.1.0.
Please use `tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 

###############################################################################
Peregrine Cluster
Job 13583764 for user 'p230198'
Finished at: Wed Sep  9 18:01:13 CEST 2020

Job details:
============

Name                : run_long_hard_r_script
User                : p230198
Partition           : regular
Nodes               : pg-node022
Cores               : 1
State               : RUNNING
Submit              : 2020-09-08T13:27:59
Start               : 2020-09-09T14:38:34
End                 : --
Reserved walltime   : 10-00:00:00
Used walltime       :    03:22:39
Used CPU time       : --
% User (Computation): --
% System (I/O)      : --
Mem reserved        : 100G/node
Max Mem used        : 0.00  ()
Max Disk Write      : 0.00  ()
Max Disk Read       : 0.00  ()

################################################################################
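
The report also shows how long the job waited in the queue: the time between `Submit` and `Start`. A sketch of that calculation, assuming GNU `date` (the `-d` option used here is not available in BSD `date`); the function name is illustrative.

```shell
#!/bin/bash
# Print the seconds between two timestamps of the form YYYY-MM-DDTHH:MM:SS.
# Both are read as UTC, which is fine for a difference as long as no
# daylight-saving change falls between them.
seconds_between() {
  local from to
  from=$(date -u -d "$(echo "$1" | tr 'T' ' ')" +%s)
  to=$(date -u -d "$(echo "$2" | tr 'T' ' ')" +%s)
  echo $((to - from))
}

# Submit and Start times taken from the report above.
wait_s=$(seconds_between "2020-09-08T13:27:59" "2020-09-09T14:38:34")
printf 'queue wait: %02d:%02d:%02d\n' \
  $((wait_s / 3600)) $((wait_s % 3600 / 60)) $((wait_s % 60))
```

For this job the queue wait comes to 25:10:35, against a used walltime of 03:22:39.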

richelbilderbeek commented 4 years ago

This is done, in bbbq_1.

Because this analysis uses MHCnuggets, which will be covered in another paper, this issue will be closed.