EcoExtreML / Emulator

Apache License 2.0

global input data download #1

Open QianqianHan96 opened 1 year ago

QianqianHan96 commented 1 year ago

Hi, Sarah

Do you know how to check the available project storage space on Snellius? I can only see the size of the whole work1 folder, not the space available for einf2480, although I can see how much we have used. The total size of ERA5-Land for 20 years will be around 18 TB. Maybe it is better to download 5 years and test on 5 cores; what do you think? @SarahAlidoost

SarahAlidoost commented 1 year ago

The project space is 20 TB, and about 52% of it is already used, so there is about 10 TB of space left. I can remove some directories to free up space, but this won't be enough. Can you run your model for a few years of input data and then copy your data to CRIB or another location?

To get the limits and current usage of the relevant disk, you can use the myquota command. For the project space, the command prjspc-quota /projects/0/einf2480/ gives you the quota.

SarahAlidoost commented 1 year ago

@QianqianHan96 I am not sure if you are familiar with Snellius usage and accounting. When you submit a job script, the project is charged in SBUs. One hour of usage of a full thin node costs 128 SBU; see accounting.

You can get the SBU information with the accinfo command. For example, the current SBU information for the ecoextreml project is:

Initial budget       : 100000:00
Used budget          : 60009:24
Remaining budget     : 39990:35

The minimum amount of SBU that can be set in a job script is 32, i.e. a quarter of the 128 SBU charged per hour for a full thin node. This means that even if your job uses fewer than 32 cores, the project will still be charged for 32 SBU per hour. So it is important to use the resources efficiently.
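As a rough illustration of the accounting described above, the sketch below estimates the charge for a job. It assumes 1 SBU per core-hour on a thin node (inferred from 128 SBU per full-node hour, assuming 128 cores per thin node) and a minimum billing of 32 cores; the function name `estimate_sbu` and its parameters are hypothetical, and the real allocation granularity on Snellius may differ.

```python
def estimate_sbu(cores: int, hours: float,
                 min_cores: int = 32,
                 sbu_per_core_hour: float = 1.0) -> float:
    """Estimate the SBU charged for a job.

    Assumptions (not confirmed by the thread): thin nodes are billed at
    1 SBU per core-hour, and any request below `min_cores` is billed as
    `min_cores`.
    """
    if cores <= 0 or hours < 0:
        raise ValueError("cores must be positive and hours non-negative")
    billed_cores = max(cores, min_cores)  # small jobs pay the minimum
    return billed_cores * hours * sbu_per_core_hour


# A 5-core test that runs for 17 hours is still billed as 32 cores.
print(estimate_sbu(5, 17))    # 544.0
# One hour on a full thin node (assumed 128 cores).
print(estimate_sbu(128, 1))   # 128.0
```

Under these assumptions, a job that requests only a few cores costs the same as one that uses all 32, so it pays to fill the minimum allocation.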

SarahAlidoost commented 1 year ago

@Crystal-szj the information in this issue might be helpful for you too.

QianqianHan96 commented 1 year ago

Thanks for the information, Sarah. "global_data_Qianqian" is my directory. I need 18 TB for the ERA5-Land data (20 years), plus possibly several TB for other variables, and the results will be around 20 TB too. What do you think about running 5 years on Snellius for now, to scale up the parallel computing? At the moment I am running 1 computation block. Then I would copy the input and output to another location and continue with the other 15 years.

QianqianHan96 commented 1 year ago

Thanks for the reminder. I saw this information on the Snellius website, and I will be careful with the usage.

SarahAlidoost commented 1 year ago

It is better to first estimate how much space is needed for all input/output of 1 year, and then check the memory usage for 1 computing block. The variables to parallelize over can be the number of years and the number of spatial units.
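One way to picture parallelizing over years and spatial units is to list every (year, spatial unit) pair as an independent task and deal the tasks out to workers. The sketch below is only an illustration: the year range, the number of spatial units, the worker count, and the helper names `make_tasks` and `split_tasks` are all hypothetical and are not taken from the Emulator code.

```python
from itertools import product


def make_tasks(years, n_spatial_units):
    """Return one (year, spatial_unit) task per combination."""
    return list(product(years, range(n_spatial_units)))


def split_tasks(tasks, n_workers):
    """Deal tasks round-robin so each worker gets a similar share."""
    return [tasks[i::n_workers] for i in range(n_workers)]


# Hypothetical setup: 5 years, 4 spatial units, 5 workers.
tasks = make_tasks(range(2000, 2005), 4)
chunks = split_tasks(tasks, 5)

for worker_id, chunk in enumerate(chunks):
    print(worker_id, chunk)
```

Measuring the memory and run time of a single task first, as suggested above, indicates how many such tasks can safely run side by side on one node.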

QianqianHan96 commented 1 year ago

All input/output for 1 year at global scale is around 2.5-3 TB. I will let you know the running time and memory usage when the 1 computing block finishes. It has now been running for 17 hours.
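Putting the figures from this thread together gives a quick back-of-the-envelope check of how many years fit in the project space at once. The sketch assumes the quota is still 20 TB with about 52% used, as reported earlier in the thread, and uses the 2.5-3 TB per year estimate; the helper names are hypothetical and the actual free space may have changed since.

```python
import math

PROJECT_QUOTA_TB = 20.0   # project space reported in the thread
USED_FRACTION = 0.52      # "about 52%" used; may be out of date


def free_space_tb(quota_tb: float, used_fraction: float) -> float:
    """Space left in the project directory, in TB."""
    return quota_tb * (1.0 - used_fraction)


def years_that_fit(free_tb: float, tb_per_year: float) -> int:
    """Whole years of input/output that fit in the free space."""
    return math.floor(free_tb / tb_per_year)


free_tb = free_space_tb(PROJECT_QUOTA_TB, USED_FRACTION)
print(f"free space: {free_tb:.1f} TB")

# Lower and upper per-year estimates from the thread.
for tb_per_year in (2.5, 3.0):
    n_years = years_that_fit(free_tb, tb_per_year)
    print(f"{tb_per_year} TB/year -> {n_years} full years fit")
```

With these numbers, about 3 years of input and output fit at a time, so a 5-year run would need space to be freed first or data to be copied elsewhere between batches.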