tomaloki / 2024_msc_tonje_metar


Data preparation #9

Open tomaloki opened 7 months ago

tomaloki commented 7 months ago

Thoughts and questions

Note

tomaloki commented 7 months ago

I am struggling with an ArrayMemoryError. It does not occur every time, but I am currently unable to extract large sets of data. Extraction works if I select all the days within a single month folder.

tomaloki commented 7 months ago

ArrayMemoryError: Unable to allocate 60.9 MiB for an array with shape (6, 1, 1488, 1788) and data type float32

tomaloki commented 7 months ago

For some reason, I also get an error when I run the monthly extraction as a script submitted to the queue from the terminal, but not when I run it in Jupyter. I will try the year-extraction code and see.

jerabaul29 commented 7 months ago

This looks like you are running out of RAM. Do you have enough RAM available? If you have too little, you can try a bigmem node and request 100 GB of RAM. And/or, if your RAM consumption is too large, you can try to reduce it by optimizing your code.

An array of shape 6 x 1 x 1488 x 1788 in float32 is relatively big (about 61 MiB); if you have many of these in memory at once, that may explain it.
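To make the size estimate concrete, here is a minimal sketch that recomputes the allocation from the shape and data type reported in the error message. The helper name `array_size_mib` is made up for illustration; only NumPy is assumed.

```python
import numpy as np


def array_size_mib(shape, dtype):
    """Return the memory needed for a dense array of this shape and dtype, in MiB."""
    n_elements = int(np.prod(shape, dtype=np.int64))
    return n_elements * np.dtype(dtype).itemsize / 2**20


# Shape taken from the ArrayMemoryError message earlier in the thread.
shape = (6, 1, 1488, 1788)

size_f32 = array_size_mib(shape, np.float32)
size_f64 = array_size_mib(shape, np.float64)
print(f"float32: {size_f32:.1f} MiB, float64: {size_f64:.1f} MiB")
```

A single such array is small compared with the RAM of a typical node, so a failure to allocate it suggests that many arrays of this size (or larger intermediate ones) are already held in memory when the error is raised.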

tomaloki commented 7 months ago

Update: I have managed to extract _air_temperature0m and _air_temperature2m from all relevant files from 2023. Ensemble member 0 was chosen, and only the first 6 timesteps from each file. All timesteps from each month are saved as a new NetCDF file.

I will extract from 2021 and 2022, and then try to combine all the data. I will also try to extract values at airport locations and add METAR messages to the datasets.
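For readers following along, the extraction described in the update could look roughly like the sketch below. It is not the code used in this issue: it assumes xarray, and the dimension names `ensemble_member` and `time`, the helper names, and the output file name are all hypothetical. Only the two variable names come from the thread. A small synthetic dataset stands in for a real forecast file so the example runs on its own.

```python
import numpy as np
import xarray as xr

# Variable names come from the thread; the dimension names are assumptions
# and must be adapted to the real files.
VARIABLES = ["_air_temperature0m", "_air_temperature2m"]
MEMBER_DIM = "ensemble_member"
TIME_DIM = "time"


def subset_forecast(ds, n_steps=6, member=0):
    """Keep the chosen variables, one ensemble member, and the first n_steps timesteps."""
    return ds[VARIABLES].isel({MEMBER_DIM: member, TIME_DIM: slice(0, n_steps)})


def combine_month(datasets):
    """Subset each forecast dataset and concatenate the pieces along time."""
    return xr.concat([subset_forecast(ds) for ds in datasets], dim=TIME_DIM)


def make_fake_forecast(start, n_times=12, n_members=3, ny=4, nx=5):
    """Build a small synthetic dataset standing in for one forecast file."""
    dims = (TIME_DIM, MEMBER_DIM, "y", "x")
    data = np.zeros((n_times, n_members, ny, nx), dtype=np.float32)
    times = np.datetime64(start) + np.arange(n_times) * np.timedelta64(1, "h")
    return xr.Dataset(
        {name: (dims, data) for name in VARIABLES + ["other_variable"]},
        coords={TIME_DIM: times},
    )


monthly = combine_month(
    [make_fake_forecast("2023-01-01T00"), make_fake_forecast("2023-01-02T00")]
)
# monthly.to_netcdf("2023_01.nc")  # hypothetical output file name
```

With real data, each synthetic dataset would be replaced by `xr.open_dataset(path)` for one file. Subsetting each file before concatenating keeps only the needed variables, member, and timesteps in the combined result, which is one way to keep memory use down.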

jerabaul29 commented 7 months ago

Excellent :) Did using the bigmem node solve the problem, or did you end up splitting up the task? :)