Closed cgpu closed 4 years ago
@lmurba @angarb fyi, we might need help executing some commands like du to collect info on the average space per process type for the sumner runs, but we will explicitly ask when needed and provide implementation details.
@lmurba has provided info on this, pasting below:
# laura's data
4.4M ./00/
1.8G ./01/
4.3M ./03/
523M ./04/
45G ./05/
23G ./07/
26G ./08/
28G ./0c/
21G ./10/
30M ./11/
528M ./13/
1.3G ./15/
526M ./16/
3.1G ./17/
4.3M ./18/
530M ./19/
4.3M ./1b/
24G ./1d/
19G ./20/
531M ./23/
4.3M ./26/
18G ./27/
4.4M ./29/
1.3G ./2d/
4.3M ./2f/
511M ./30/
1.2G ./34/
40G ./37/
1.1G ./38/
31G ./3b/
4.3M ./3c/
19G ./3f/
530M ./40/
23M ./42/
26G ./45/
1.1G ./46/
4.3M ./4a/
32G ./4c/
27G ./52/
528M ./56/
54G ./58/
21G ./59/
531M ./5d/
4.3M ./5e/
19G ./5f/
4.3M ./62/
4.3M ./63/
1.4G ./65/
54G ./66/
21G ./67/
4.4M ./6e/
4.3M ./6f/
27G ./79/
4.3M ./7b/
20G ./7e/
13M ./7f/
23G ./80/
4.3M ./83/
1.1G ./84/
30G ./85/
20G ./88/
517M ./8b/
24G ./8d/
29G ./8e/
4.4M ./91/
523M ./97/
530M ./9c/
1.1G ./9d/
28G ./9e/
4.4M ./a3/
4.3M ./a7/
23G ./a8/
21G ./aa/
4.4M ./ab/
4.3M ./ae/
537M ./b0/
518M ./b1/
526M ./b3/
4.3M ./b5/
22G ./b7/
532M ./ba/
21G ./bc/
4.3M ./bf/
1.2G ./c0/
29G ./c5/
19G ./c8/
4.3M ./cb/
4.3M ./cc/
526M ./ce/
4.3M ./d1/
33G ./d2/
30G ./d7/
33G ./e1/
18G ./e6/
28G ./e7/
4.3M ./e8/
25G ./e9/
27G ./ea/
19G ./ed/
20G ./ef/
8.6M ./f6/
1.4G ./f8/
4.3M ./f9/
22G ./fa/
26G ./fd/
1.1G ./fe/
1.4G ./0d/
8.5G ./15/
656M ./17/
4.3M ./2b/
10M ./2f/
43G ./34/
17G ./36/
4.3M ./52/
390M ./56/
4.3M ./5a/
13G ./69/
29G ./7d/
24G ./8a/
8.5G ./91/
16G ./95/
474M ./a5/
369M ./d0/
4.3M ./d4/
26G ./d8/
4.3M ./da/
5.9G ./df/
6.6M ./e1/
422M ./f5/
4.3M ./f6/
9.3G ./fa/
12G ./fb/
list_of_all_files_3_tcga_samples.txt list_of_all_files_24_laura_samples.txt
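For context, a per-directory listing like the one above comes from running `du` one level deep inside the pipeline's work directory. A minimal, self-contained sketch (the sandbox paths exist only so the commands run anywhere; in real use you would point `du` at the actual work directory instead):

```shell
# Sandbox so the du invocations below run anywhere (real use: pass the
# pipeline's work directory path instead of the temp dir)
workdir=$(mktemp -d)
mkdir -p "$workdir/00" "$workdir/01"
head -c 1024 /dev/zero > "$workdir/00/reads.fq"
head -c 4096 /dev/zero > "$workdir/01/aligned.bam"

# Per-subdirectory usage one level deep, smallest first --
# this is the shape of the listing above
du -h -d 1 "$workdir" | sort -h

# Grand total across all subdirectories
du -sh "$workdir"
```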
So allocating disk size is a bit tricky, as there are 3 places to set it:

lifeSciences.bootDiskSize

plus

process {
    withName: 'star' {
        disk = "350 GB"
    }
}

These settings will be used NOT for the master node but for the Google Life Sciences worker machines. The master node must be sized via the GUI in CloudOS; everything in the config addresses only worker nodes.
Example: say star normally takes up 5 GB, for the sake of the example. To achieve this plus some extra slack, do:
google {
    lifeSciences.bootDiskSize = 1.GB
}

plus

process {
    withName: 'star' {
        disk = "4 GB"
    }
}
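Putting the two config-side knobs side by side, a sketch of a complete worker-node config (the numeric values here are illustrative, not recommendations; the master node's disk is still set from the CloudOS GUI):

```groovy
// conf/executors/google.config (sketch; values illustrative)
google {
    lifeSciences.bootDiskSize = 50.GB  // boot disk for each worker VM
}

process {
    withName: 'star' {
        disk = '4 GB'                  // scratch disk for the star task
    }
}
```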
@sk-sahu @lmurba fyi, here is what we are doing for this.
Problem

This job https://cloudos.lifebit.ai/app/jobs/5f248d56a79ea301123a7bc7 in the jax-anczukow-lab CloudOS workspace failed only because it ran out of device space.

Solution

Temporarily increase the disk space, for the testing, from the conf/google.config file.

Implementation

How will we implement the solution?
1. Diagnose

This happens when the results files from the working directories are being saved in the folder named results. As a conservative proxy for how much disk space you need, you can inspect the storage size accumulated in the working directory.

In CloudOS, you can find the nextflow run command in the first line of the Nextflow log file, which can be accessed by clicking on "view log".

To diagnose how much storage space the workdir occupies, grab the gs work path from the first line of the log (it will be at the end of the line, defined as -w gs:// ....) and use gsutil to summarize the storage that all files within the work folder occupy:

2. Attempt fix
Update this line in conf/executors/google.config:
https://github.com/lifebit-ai/splicing-pipelines-nf/blob/2866b361b5d8cd5f54bbf6c3846aa667607b77cb/conf/executors/google.config#L2

Depending on how many terabytes the command in step 1 reports, adjust the lifeSciences.bootDiskSize variable value.
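The step-1 diagnosis can be sketched as below. The `gs://` path is a placeholder, and the byte count is mocked so the terabyte conversion runs without cloud credentials; with real access you would use the `gsutil du` output directly:

```shell
# Real check (placeholder bucket path):
#   gsutil du -sh "gs://<bucket>/<project>/work"
# gsutil du -s prints "<bytes> <path>"; mock that output here:
mock_output="3456789012345 gs://bucket/work"
bytes=$(echo "$mock_output" | awk '{print $1}')

# Convert bytes to binary terabytes to decide how large
# lifeSciences.bootDiskSize needs to be, plus slack
awk -v b="$bytes" 'BEGIN { printf "%.1f TiB\n", b / (1024 ^ 4) }'
```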