mscross / pysplit

A package for HYSPLIT air parcel trajectory analysis.
BSD 3-Clause "New" or "Revised" License

ERA5 reanalysis ARL files #99

Open Iaminalake opened 1 year ago

Iaminalake commented 1 year ago

Does anyone have any global ERA5 reanalysis data converted to weekly or monthly ARL files to share?

Or is there somewhere I can download them ready for use with pysplit?

After spending several days trying to download one month of data, I ran out of disk space halfway through the month. I noticed the converted ARL files are much smaller, so I would need to clear a lot less space for them. I eventually need at least a full year (any year from 2016 to 2021), so I just wanted to ask whether anyone would be willing to share their converted files :)

vwgeiser commented 9 months ago

@Iaminalake I got in contact with some ARL staff about this issue this summer (I agree with you; I think ERA5 data would be perfect for HYSPLIT/pysplit). While there is a WRF to ARL script, to my knowledge there isn't an ERA5 or general NetCDF to ARL conversion script yet, although this may be a tool in development at ARL.

So far I have still been using GDAS and NAM data in my project, but I too will be on the lookout for a community script or an official ARL release.

vwgeiser commented 7 months ago

@Iaminalake I have found a solution to download and convert them yourself, but it isn't so straightforward: https://github.com/amcz/hysplit_metdata. I just became aware of this project recently and will be trying to run ERA5 files within pysplit soon. If you have found a better way, let me know.

xli3111 commented 7 months ago

> @Iaminalake I have found a solution to download and convert them yourself, but it isn't so straightforward: https://github.com/amcz/hysplit_metdata. I just became aware of this project recently and will be trying to run ERA5 files within pysplit soon. If you have found a better way, let me know.

Did you make it through hysplit_metdata? I failed at the final stage: I cannot run pysplit on my combined weekly data (converted from ERA5). If you made it work, would you mind sharing some details? Hoping for your reply, thanks.

amsmith1109 commented 6 months ago

> > @Iaminalake I have found a solution to download and convert them yourself, but it isn't so straightforward: https://github.com/amcz/hysplit_metdata. I just became aware of this project recently and will be trying to run ERA5 files within pysplit soon. If you have found a better way, let me know.
>
> Did you make it through hysplit_metdata? I failed at the final stage: I cannot run pysplit on my combined weekly data (converted from ERA5). If you made it work, would you mind sharing some details? Hoping for your reply, thanks.

Are you trying to run generate_bulktraj(), and what errors are you getting?

I've had a similar issue, and yours may be the same one: it comes down to the filename. generate_bulktraj() passes the meteo_dir input to _meteofinder(), and at this time the met data file discovery seems to be too specific. I got around the issue by changing generate_bulktraj() so that it is fed the list of met data filenames directly. Alternatively, you could try renaming your files to match the convention it's looking for, which has the month and year somewhere in the name.

From the comments:

    For successful meteofinding, separate different meteorology types into
    different folders and name weekly or semi-monthly files according to the
    following convention:
        *mon*YY*#
    where the * represents a Bash wildcard.
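As a quick way to see whether a file name fits that glob before renaming anything, here is a minimal sketch using Python's standard fnmatch module. It is not pysplit's own discovery code: the helper name matches_convention, the lowercase three-letter month, the two-digit year, and the example file names are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def matches_convention(filename, mon, yy, number=None):
    """Return True if filename fits the *mon*YY*# glob quoted above.

    mon    -- three-letter month as it appears in the name, e.g. "oct" (assumed)
    yy     -- two-digit year as a string, e.g. "17"
    number -- optional trailing file number; None accepts any ending
    """
    tail = "*" if number is None else f"*{number}"
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return fnmatchcase(filename, f"*{mon}*{yy}{tail}")


# Hypothetical file names, used only to illustrate the check.
print(matches_convention("era5_oct17_w1", "oct", "17"))           # True
print(matches_convention("ERA5_2017-10_week1.arl", "oct", "17"))  # False
```

The second name fails because the month appears only as a number, which is the kind of mismatch that renaming would address.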

vwgeiser commented 6 months ago

@Iaminalake I was recently able to generate a couple of test trajectories using ERA5 data. Due to some space constraints in my own workflow, my max file size was 5 days. Regardless, I was able to generate trajectories via generate_bulktraj() by matching the convention _meteofinder() is looking for.

For example, my files look similar to this (p could be anything, I believe):

    era5.sep.2017.27_30.p6
    era5.oct.2017.1_5.p1
    era5.oct.2017.6_10.p2
    era5.oct.2017.11-16.p3
    era5.oct.2017.17_22.p4
    era5.oct.2017.23_27.p5
    era5.oct.2017.28_31.p6

With meteo_bookends = ([5,6],[1])
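To make the role of the bookends concrete, here is a rough sketch of which of those files would be picked up for October 2017. It takes the first list as trailing file numbers wanted from the previous month and the second as those wanted from the next month. That reading, the helper name files_for_month, and the fnmatch-based matching are assumptions for illustration, not pysplit's actual implementation.

```python
from fnmatch import fnmatchcase

# File names as listed in this comment.
FILES = [
    "era5.sep.2017.27_30.p6",
    "era5.oct.2017.1_5.p1",
    "era5.oct.2017.6_10.p2",
    "era5.oct.2017.11-16.p3",
    "era5.oct.2017.17_22.p4",
    "era5.oct.2017.23_27.p5",
    "era5.oct.2017.28_31.p6",
]


def files_for_month(names, prev, cur, nxt, bookends):
    """Select met files for one month plus its bookends (illustrative only).

    prev, cur, nxt -- (mon, yy) pairs such as ("oct", "17")
    bookends       -- (previous-month numbers, next-month numbers)
    """
    prev_numbers, next_numbers = bookends
    # Every file of the current month, whatever its trailing number.
    patterns = ["*{}*{}*".format(*cur)]
    # Only the listed trailing numbers from the neighbouring months.
    patterns += ["*{}*{}*{}".format(*prev, n) for n in prev_numbers]
    patterns += ["*{}*{}*{}".format(*nxt, n) for n in next_numbers]
    return [name for name in names
            if any(fnmatchcase(name, p) for p in patterns)]


selected = files_for_month(
    FILES, ("sep", "17"), ("oct", "17"), ("nov", "17"), ([5, 6], [1])
)
print(selected)
```

Under these assumptions all seven files are selected: the September file qualifies through its trailing 6, and nothing matches the next-month bookend because the list contains no November file.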

However, it is also a good idea to run the HYSPLIT GUI "check file" procedure (menu->meteorology->display data->check file) just to ensure there are no problems with the converted meteorology data itself (I discovered problems with my data when doing so).

I imagine the solution from amsmith1109 would work as well, though. @amsmith1109 do you remember the specific "filename" error you encountered while running through your workflow? I've noticed that I still sometimes run into a combination of FileNotFoundError: [WinError 2] The system cannot find the file specified and FileNotFoundError: [Errno 2] No such file or directory when generating select trajectories. Did implementing your change to generate_bulktraj() fix this?

vwgeiser commented 5 months ago

For others who find this thread, MeteoInfoLab's [Convert GRIB data to ARL data](http://meteothink.org/examples/meteoinfolab/trajectory/grib2arl.html) is another option to explore for converting ERA5 .grib files to ARL format.

Make sure to provide the script with both an ERA5 surface (single level) .grib file and a pressure level .grib file covering the same time period. Also check in advance which variables each file should contain. The script from MeteoInfo worked with no modifications (beyond changing directories).

Additionally, as a more general note, make sure the .grib data you are trying to convert is temporally consistent over the period you are converting. I had originally tried to download a few fewer days of data by skipping days that were unnecessary for my project (I needed July 10th and 12th but not July 11th, so I left it out of my CDS API request). The MeteoInfo script still runs as normal and produces a final ARL file, but running the HYSPLIT check file procedure then produces a number "23" error (if I recall correctly) that is related to an inconsistent timestep within the converted ARL file.
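A gap like this can be caught before converting by checking that the requested times are evenly spaced. The sketch below assumes the list of valid times is already known (for example, from the days and hours put into the request). Reading them from the .grib file itself is not covered, and the helper name find_gaps, the year, and the hourly step are made up for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(times, step):
    """Return (previous, next) pairs whose spacing differs from step."""
    ordered = sorted(times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a != step]


# Hourly fields for 10 and 12 July with 11 July skipped, as in the case above.
# The year and the one-hour step are assumptions for illustration.
hour = timedelta(hours=1)
times = [datetime(2017, 7, day, h) for day in (10, 12) for h in range(24)]

for before, after in find_gaps(times, hour):
    print("gap between", before, "and", after)
```

For this input the check reports a single gap, between 23:00 on 10 July and 00:00 on 12 July. An empty result would mean the timestep is uniform over the whole period.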