EUREC4A-UK / lagtraj

Python trajectory code for Lagrangian simulations
MIT License

How to download only the skin temperature variable to update existing ERA5 data files #180

Closed xychen-ocn closed 1 year ago

xychen-ocn commented 1 year ago

Hi @leifdenby and @sjboeing, 🎉 Thank you for bumping LagTraj to v0.1.1! 🎉

I am getting back to the LES world and SAM. After syncing my repo to v0.1.1, I realized that the ERA5 netCDF files I originally downloaded do not contain the skin temperature variable (skt) introduced in #168. Is there a way to download only the skt variable so that I can manually update the data files I already have? I am asking because re-downloading almost 2 months of data from CDS will take me a while, so I wonder if you have any temporary solution for users like me... 😅

sjboeing commented 1 year ago

@xychen-ocn: great to see the new work on SAM inputs. I am not sure that downloading only the skin temperature would be quicker, because of the way ECMWF data is retrieved (assuming the bottleneck is fetching the data from the tape archive rather than the download itself). In principle, I think it should be possible to modify the download script in a temporary branch to download only skin temperature and then merge the files, but it may not be worth the effort.
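As a rough illustration of the "only download skin temperature" idea, the sketch below builds a single-variable request of the kind the CDS API accepts. This is not how lagtraj constructs its own requests: the helper `build_skt_request`, the dataset name `reanalysis-era5-single-levels` and the request keys are assumptions that should be checked against the CDS documentation, and the date and bounding box are made up.

```python
from datetime import date


def build_skt_request(day: date, bbox: dict) -> dict:
    """Build a CDS-style request for hourly skin temperature on one day.

    Hypothetical helper, not part of lagtraj. `bbox` holds the keys
    lat_max, lon_min, lat_min and lon_max, all in degrees.
    """
    return {
        "product_type": ["reanalysis"],
        "variable": ["skin_temperature"],  # ERA5 short name: skt
        "year": [f"{day.year:04d}"],
        "month": [f"{day.month:02d}"],
        "day": [f"{day.day:02d}"],
        "time": [f"{hour:02d}:00" for hour in range(24)],
        # CDS expects the area as [north, west, south, east]
        "area": [bbox["lat_max"], bbox["lon_min"], bbox["lat_min"], bbox["lon_max"]],
        # Assumed key name; older CDS API versions used "format" instead
        "data_format": "netcdf",
    }


# Made-up example date and domain
request = build_skt_request(
    date(2020, 2, 2),
    {"lat_max": 20.0, "lon_min": -65.0, "lat_min": 5.0, "lon_max": -50.0},
)

# Submitting the request needs the cdsapi package and CDS credentials, e.g.:
#   import cdsapi
#   cdsapi.Client().retrieve(
#       "reanalysis-era5-single-levels", request, "skt_2020-02-02.nc"
#   )
```

Even with a request this small, the retrieval may still sit in the CDS queue, which is the caveat raised above.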

The other option would be to use a different surface temperature (temporarily, reverting/modifying some of our recent changes). We changed this recently for better agreement with how the other groups run EUREC4A cases, but you are right that downloading the updated files takes a while.

leifdenby commented 1 year ago

> Is there a way to download only the skt variable so that I can manually update all data files I already had?

Sorry @xychen-ocn, I thought this might be an issue. Unfortunately, I haven't had time to write functionality in lagtraj that would download only the variables not already present. You could do as @sjboeing suggests and modify lagtraj yourself to 1) queue and download only the additional variables and 2) read in ERA5 data where some variables are split across separate files. To achieve that you would need to modify 1) https://github.com/EUREC4A-UK/lagtraj/blob/master/lagtraj/domain/sources/era5/download.py#L351 and 2) https://github.com/EUREC4A-UK/lagtraj/blob/master/lagtraj/domain/sources/era5/load.py#L42.
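A minimal sketch of the bookkeeping behind step 1, assuming you already have an inventory of which variables each existing file contains. Nothing here is lagtraj code: `plan_extra_downloads`, the file names and the variable lists are all hypothetical.

```python
def plan_extra_downloads(inventory, required):
    """Map each existing file to the sorted list of required variables it lacks.

    inventory: dict of file name -> iterable of variable names already present
    required:  iterable of variable names the current code expects
    Files that already contain everything are left out of the plan.
    """
    plan = {}
    for path, present in inventory.items():
        missing = sorted(set(required) - set(present))
        if missing:
            plan[path] = missing
    return plan


# Hypothetical inventory; in practice it would come from opening each file
inventory = {
    "era5_2020-02-01.nc": ["t", "q", "sst"],
    "era5_2020-02-02.nc": ["t", "q", "sst", "skt"],
}

plan = plan_extra_downloads(inventory, required=["t", "q", "sst", "skt"])
print(plan)  # {'era5_2020-02-01.nc': ['skt']}
```

For step 2, one option would be to open the original file and the extra file with xarray and combine them with `xarray.merge`, which only works cleanly if both files share the same coordinates.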

To be honest, I think I would start afresh and queue and download the data again :) I appreciate that's really frustrating, but I think anything built on your existing files is bound to be quite brittle. We could consider splitting our downloads into separate files (or even having an intermediate representation with the different variables in separate files), but that doesn't fix your problem now.

sjboeing commented 1 year ago

@xychen-ocn, @leifdenby: are we OK to close this for now?

xychen-ocn commented 1 year ago

I am good and have downloaded everything afresh. Thank you!