meteoinfo / MeteoInfo

MeteoInfo: GIS, scientific computation and visualization environment.
http://www.meteothink.org/
GNU Lesser General Public License v3.0
311 stars 93 forks source link

Python script for converting grib files to ARL #41

Closed Msturroc closed 1 week ago

Msturroc commented 2 weeks ago

Hi Yaqiang, excellent software! Tried posting this on gitter but I don't think you or anyone checks that anymore. I (like others before) have come here due to your python script for converting grib files to ARL for use with hysplit. I seem to be able to run the script fine (after changing some variable names to be associated with my particular ERA5 grib files) but when I try to view the outputted ARL Files in meteoinfomap it hits me with the error: java.lang.NumberFormatException: For input string: "���"

If you have time to look, my code is

# Convert ERA5 GRIB data to ARL data

#---- Set data folder
datadir = r'/home/marc/Downloads/hysplit.v5.3.0_UbuntuOS20.04.6LTS/examples/'
#---- Set output data file
outfn = os.path.join(datadir, 'test_era5_grib_diff.arl')
#---- Read a GRIB data file
infn3d = addfile('/home/marc/Downloads/hysplit.v5.3.0_UbuntuOS20.04.6LTS/examples/ERA5_1965.Jul15.3dpl.grib'.format(datadir))
infn2d = addfile('/home/marc/Downloads/hysplit.v5.3.0_UbuntuOS20.04.6LTS/examples/ERA5_1965.Jul15.2dpl.all.grib'.format(datadir))

print 'GRIB data file has been opened...'
#---- Set output ARL data file
arlf = addfile(outfn, 'c', dtype='arl')

#---- Set variable and level list
#---- Variable names in ERA5 data file
gvar2d = ['Geopotential_surface','Surface_pressure_surface','Surface_latent_heat_flux_surface_1_Hour_Accumulation','2_metre_temperature_surface','10_metre_U_wind_component_surface',\
    '10_metre_V_wind_component_surface','Mean_sea_level_pressure_surface','Boundary_layer_height_surface','Convective_available_potential_energy_surface','Surface_solar_radiation_downwards_surface_1_Hour_Accumulation',\
    'Total_precipitation_surface_1_Hour_Accumulation','Instantaneous_surface_sensible_heat_flux_surface']

gvar3d = ['Geopotential_isobaric','Temperature_isobaric','Vertical_velocity_isobaric',\
    'U_component_of_wind_isobaric','V_component_of_wind_isobaric','Relative_humidity_isobaric']

#---- Corresponding variable names in ARL data file
avar2d = ['HGTS','PRSS', 'LTHF', 'T02M', 'U10M', 'V10M', 'MSLP', 'PBLH', 'CAPE', 'DSWF', 'TPP1', 'SHTF']
avar3d = ['HGTS','TEMP','WWND','UWND','VWND','RELH']
#--- Add DIFF fields - difference between the original data and the packed data
avar3d_diff = list(avar3d)
avar3d_diff.append('DIFW')

#---- Set parameters of ARL data file
gv = infn3d['Geopotential_isobaric']
nx = gv.dimlen(gv.ndim - 1)
ny = gv.dimlen(gv.ndim - 2)
levels = gv.dimvalue(gv.ndim - 3)[::-1]
nz = len(levels)
arlf.setlevels(levels)
arlf.set2dvar(avar2d)
for l in levels:
    arlf.set3dvar(avar3d_diff)
arlf.setx(gv.dimvalue(gv.ndim - 1))
arlf.sety(gv.dimvalue(gv.ndim - 2))

#---- Write ARL data file
tNum = infn3d.timenum()
fhour = 0
for t in range(0, tNum):
    print('Time index: ' + str(t))
    atime = infn3d.gettime(t)
    print(atime.strftime('%Y-%m-%d %H:00'))
    dhead = arlf.getdatahead(infn3d.proj, 'RSMC', 2, fhour)
    #Pre-write index record without checksum - will be over-write latter
    arlf.writeindexrec(atime, dhead)
    #Checksum list
    ksumlist = [[] for _ in range(nz + 1)]  # Initialize with correct structure
    # Write 2d variables
    for avname,gvname in zip(avar2d, gvar2d):
        gdata = infn2d[gvname][t,:,:]
        if avname == 'PRSS':
            gdata = gdata * 0.01
        ksum = arlf.writedatarec(atime, 0, avname, fhour, 99, gdata)
        ksumlist[0].append(ksum)
    # Write 3d variables
    for lidx in range(0, nz):
        llidx = nz - lidx - 1
        print(lidx,llidx)        
        for avname,gvname in zip(avar3d, gvar3d):
            gdata = infn3d[gvname][t,llidx,:,:]            
            if avname == 'WWND':
                gdata = gdata * 0.01
                difw = arlf.diff_origin_pack(gdata)
            elif avname == 'SPHU':
                gdata = gdata * 1000.
            elif avname == 'HGTS':
                gdata = gdata / 9.80665
            ksum = arlf.writedatarec(atime, lidx + 1, avname, fhour, 99, gdata)
            ksumlist[lidx + 1].append(ksum)
        ksum = arlf.writedatarec(atime, lidx + 1, 'DIFW', fhour, 99, difw)
        ksumlist[lidx + 1].append(ksum)
    #Re-write index record with checksum
    arlf.writeindexrec(atime, dhead, ksumlist)
    fhour += 1
arlf.close()
print('Finished!')

Would be interested to develop the python script further to generalise it to work with any particular ERA5 grib file. Any tips for getting this to work?

Best wishes, Marc

Yaqiang commented 2 weeks ago

Is it possible for you to provide an example ERA5 grib data file and the conversion script, so I can reproduce the problem and have chance to fix it.

Msturroc commented 2 weeks ago

Thanks @Yaqiang! The files should be available here (let me know if you can access them): https://www.dropbox.com/scl/fo/hoeu9e758xf7c43c7qzpy/APYSc8Ef6Dt8-iPGJ3VSg9E?rlkey=7iggho1x8b76x8vd7x06uw63e&st=qkns1do8&dl=0

My code is below. It does convert but the resultant ARL file doesn't seem to be readable...


#---- Set data folder
datadir = '../hysplit.v5.3.0_UbuntuOS20.04.6LTS/examples/'
#---- Set output data file
outfn = os.path.join(datadir, 'test_era5_grib_diff.arl')
#---- Read a GRIB data file
infn3d = addfile('../hysplit.v5.3.0_UbuntuOS20.04.6LTS/examples/ERA5_1965.Jul15.3dpl.grib'.format(datadir))
infn2d = addfile('../hysplit.v5.3.0_UbuntuOS20.04.6LTS/examples/ERA5_1965.Jul15.2dpl.all.grib'.format(datadir))

print 'GRIB data file has been opened...'
#---- Set output ARL data file
arlf = addfile(outfn, 'c', dtype='arl')

#---- Set variable and level list
#---- Variable names in ERA5 data file
gvar2d = ['Geopotential_surface','Surface_pressure_surface','Surface_latent_heat_flux_surface_1_Hour_Accumulation','2_metre_temperature_surface','10_metre_U_wind_component_surface','10_metre_V_wind_component_surface','Mean_sea_level_pressure_surface','Boundary_layer_height_surface','Convective_available_potential_energy_surface','Surface_solar_radiation_downwards_surface_1_Hour_Accumulation','Total_precipitation_surface_1_Hour_Accumulation','Instantaneous_surface_sensible_heat_flux_surface']
#gvar2d = ['Surface_pressure_surface','2_metre_temperature_surface','10_metre_U_wind_component_surface',\
#    '10_metre_V_wind_component_surface','Boundary_layer_height_surface']

gvar3d = ['Geopotential_isobaric','Temperature_isobaric','U_component_of_wind_isobaric','V_component_of_wind_isobaric','Vertical_velocity_isobaric','Relative_humidity_isobaric']

#---- Corresponding variable names in ARL data file
avar2d = ['SHGT','PRSS', 'LTHF', 'T02M', 'U10M', 'V10M', 'MSLP', 'PBLH', 'CAPE', 'DSWF', 'TPP1', 'SHTF']
#avar2d = ['PRSS','T02M','U10M','V10M','PBLH']
avar3d = ['HGTS','TEMP','UWND','VWND','WWND','RELH']

#--- Add DIFF fields - difference between the original data and the packed data
avar3d_diff = list(avar3d)
avar3d_diff.append('DIFW')

#---- Set parameters of ARL data file
gv = infn3d['Geopotential_isobaric']
nx = gv.dimlen(gv.ndim - 1)
ny = gv.dimlen(gv.ndim - 2)
levels = gv.dimvalue(gv.ndim - 3)[::-1]
nz = len(levels)
arlf.setlevels(levels)
arlf.set2dvar(avar2d)
for l in levels:
    arlf.set3dvar(avar3d_diff)
arlf.setx(gv.dimvalue(gv.ndim - 1))
arlf.sety(gv.dimvalue(gv.ndim - 2))

#---- Write ARL data file
tNum = infn3d.timenum()
fhour = 0
for t in range(0, tNum):
    print('Time index: ' + str(t))
    atime = infn3d.gettime(t)
    print(atime.strftime('%Y-%m-%d %H:00'))
    dhead = arlf.getdatahead(infn3d.proj, 'RSMC', 2, fhour)
    #Pre-write index record without checksum - will be over-write latter
    arlf.writeindexrec(atime, dhead)
    #Checksum list
    ksumlist = [[] for _ in range(nz + 1)]  # Initialize with correct structure
    # Write 2d variables
    for avname,gvname in zip(avar2d, gvar2d):
        gdata = infn2d[gvname][t,:,:]
        if avname == 'PRSS':
            gdata = gdata * 0.01
        ksum = arlf.writedatarec(atime, 0, avname, fhour, 99, gdata)
        ksumlist[0].append(ksum)
    # Write 3d variables
    for lidx in range(0, nz):
        llidx = nz - lidx - 1
        print(lidx,llidx)        
        for avname,gvname in zip(avar3d, gvar3d):
            gdata = infn3d[gvname][t,llidx,:,:]            
            if avname == 'WWND':
                gdata = gdata * 0.01
                difw = arlf.diff_origin_pack(gdata)
            elif avname == 'SPHU':
                gdata = gdata * 1000.
            elif avname == 'HGTS':
                gdata = gdata / 9.80665
            ksum = arlf.writedatarec(atime, lidx + 1, avname, fhour, 99, gdata)
            ksumlist[lidx + 1].append(ksum)
        ksum = arlf.writedatarec(atime, lidx + 1, 'DIFW', fhour, 99, difw)
        ksumlist[lidx + 1].append(ksum)
    #Re-write index record with checksum
    arlf.writeindexrec(atime, dhead, ksumlist)
    fhour += 1
arlf.close()
print('Finished!')
Yaqiang commented 2 weeks ago

I have tried your scirpt dand data files. The ARL data can be converted and can be opened in MeteoInfoMap and MeteoInfoLab. I can not reproduce your problem in my side. You may try the newest version of MeteoInfo if the version you used is out of date. 1724994083269

Msturroc commented 2 weeks ago

interesting! Yes, I am using MeteoInfoLab 3.8.11. Will try updating and report back! On 3.8.11 i'm getting the following error trying to add the generated file:

Traceback (most recent call last):
  File "/home/marc/Downloads/MeteoInfo/pylib/test_convert.py", line 88, in <module>
    f = addfile(fn)
  File "/home/marc/Downloads/MeteoInfo/pylib/mipylib/dataset/midata.py", line 120, in addfile
    meteodata.openData(fname, keepopen)
    at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.base/java.lang.Integer.parseInt(Integer.java:652)
    at java.base/java.lang.Integer.parseInt(Integer.java:770)
    at org.meteoinfo.data.meteodata.arl.ARLDataInfo.readDataHead(ARLDataInfo.java:594)
    at org.meteoinfo.data.meteodata.arl.ARLDataInfo.readDataInfo(ARLDataInfo.java:437)
    at org.meteoinfo.data.meteodata.MeteoDataInfo.openData(MeteoDataInfo.java:524)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
java.lang.NumberFormatException: java.lang.NumberFormatException: For input string: "???"
Msturroc commented 2 weeks ago

hmm, even with 3.9.3, i'm getting the same error (though the outputted ARL file size is different - 432.1 mb compared to ~ 330mb before, so something has changed certainly).

Finished!
Traceback (most recent call last):
  File "/home/marc/Downloads/MeteoInfo/test_conversion.py", line 88, in <module>
    f = addfile(fn)
  File "/home/marc/Downloads/MeteoInfo/pylib/mipylib/dataset/midata.py", line 120, in addfile
    meteodata.openData(fname, keepopen)
    at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.base/java.lang.Integer.parseInt(Integer.java:652)
    at java.base/java.lang.Integer.parseInt(Integer.java:770)
    at org.meteoinfo.data.meteodata.arl.ARLDataInfo.readDataHead(ARLDataInfo.java:594)
    at org.meteoinfo.data.meteodata.arl.ARLDataInfo.readDataInfo(ARLDataInfo.java:437)
    at org.meteoinfo.data.meteodata.MeteoDataInfo.openData(MeteoDataInfo.java:524)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
java.lang.NumberFormatException: java.lang.NumberFormatException: For input string: "???"

I also tried loading the ARL file into hysplit, and got this error:

*ERROR* metset: 2nd time period INDX record missing

I want to add that other ARL files load fine using addfile (just not this one I have converted) and i'm using Linux.

Yaqiang commented 2 weeks ago

It's quite interesting. What is your computer system and language? I am thinking maybe the source is the comma decimal point in some locale language. If that's the case, you can try to change the locale language to English and run the script again.

Yaqiang commented 2 weeks ago

I am using Windows 11 computer. The data convert script and result ARL data file were loaded here: https://www.dropbox.com/scl/fi/r0qavj98blm7d9lmfcfk9/test_era5_grib_diff_1.arl?rlkey=30vxxcvxfydnfhj1jmivvvluj&st=shj4i63o&dl=0 , https://www.dropbox.com/scl/fi/9j6kd1vampis4fwqaiuto/grib2arl_test.py?rlkey=r629760wp0f62njlrbmgpp9xn&st=rqahr1sx&dl=0

Msturroc commented 2 weeks ago

Here is some additional info about my computer system and language, I am using English.

image

Msturroc commented 2 weeks ago

I will try using windows 11 also, thanks! I will report back

Msturroc commented 1 week ago

I encountered issues on my Linux machine when running scripts that worked fine on my Windows 11 machine. Upon investigation, I found that the Linux machine was using an older Java version (OpenJDK 11.0.24) by checking the Java version with the java -version command. Realising the Java version was outdated, I updated to the latest Java version (Java 22.0.2) using SDKMAN!. After the update, the scripts now run as expected, and the discrepancies between the Linux and Windows environments have been resolved. Apologies for that and thanks again for your software/patience!