Unidata / netcdf4-python

netcdf4-python: python/numpy interface to the netCDF C library
http://unidata.github.io/netcdf4-python
MIT License
converting_HDF4 to netCDF-Classic/netCDF-4 #696

Open kunalbali opened 7 years ago

kunalbali commented 7 years ago

I have a MODIS-Terra Level-2 file, MOD11_L2.A2003090.0450.006.2015187194709.hdf, which contains land surface temperature (LST) data.

I need to convert this file to netCDF using a Python script that calls NCL, but the conversion fails with the following messages:

Classic mode NetCDf does not support unsigned integer types: LST will be written as type short
Classic mode NetCDf does not support unsigned integer types: QC will be written as type short
Classic mode NetCDf does not support unsigned integer types: Error_LST will be written as type byte
Classic mode NetCDf does not support unsigned integer types: Emis_31 will be written as type byte
Classic mode NetCDf does not support unsigned integer types: Emis_32 will be written as type byte
Classic mode NetCDf does not support unsigned integer types: View_angle will be written as type byte
Classic mode NetCDf does not support unsigned integer types: View_time will be written as type byte
Traceback (most recent call last):
  File "/home/kunal/mishra_sir/test_data.py", line 88, in <module>
    lonin = ncfile.variables['Longitude'][:]
KeyError: 'Longitude'

However, when I run the same script on another MODIS-Terra L2 file (an aerosol and cloud product), it converts the file and extracts the variables to netCDF without problems.

The data in the LST file is 2D, while the data in the aerosol and cloud file is all Geo2D. Could you please let me know how to convert 2D to Geo2D in this script?

To convert the file and extract a variable from HDF to netCDF, I run:

python test.py filename variable name ./ ./

The Python script is given below.

#!/usr/bin/python
# -*- coding: utf-8 -*-

from pylab import *
import numpy as np
import scipy.io
from netCDF4 import Dataset, num2date, date2num
from datetime import datetime, timedelta, date
import subprocess
import sys

def substring_after(s, delim):
    return s.partition(delim)[2]

def make_bounds2D(datain):
    # calculates the netcdf bounds
    # supports 2-d
    #
    # bounds are calculated like this
    #
    #       3
    #
    #   0   X   2
    #
    #       1
    #
    # Ref: http://cfconventions.org/cf-conventions/cf-conventions.html#cell-boundaries
    #
    dataout = zeros([np.shape(datain)[0],np.shape(datain)[1],4])
    dataout.fill(np.nan) # output array
    for ii in range(np.shape(datain)[0]):
        for jj in range(np.shape(datain)[1]):
            if ii == np.shape(datain)[0]-1:
                if jj == np.shape(datain)[1]-1:
                    dataout[ii,jj,0] = datain[ii,jj] - (datain[ii,jj]-datain[ii,jj-1])/2.
                    dataout[ii,jj,1] = datain[ii,jj] - (datain[ii,jj]-datain[ii-1,jj])/2.
                    dataout[ii,jj,2] = datain[ii,jj] + (datain[ii,jj]-datain[ii,jj-1])/2.
                    dataout[ii,jj,3] = datain[ii,jj] + (datain[ii,jj]-datain[ii-1,jj])/2.
                else:
                    dataout[ii,jj,0] = datain[ii,jj] - (datain[ii,jj+1]-datain[ii,jj])/2.
                    dataout[ii,jj,1] = datain[ii,jj] - (datain[ii,jj]-datain[ii-1,jj])/2.
                    dataout[ii,jj,2] = datain[ii,jj] + (datain[ii,jj+1]-datain[ii,jj])/2.
                    dataout[ii,jj,3] = datain[ii,jj] + (datain[ii,jj]-datain[ii-1,jj])/2.
            else:
                if jj == np.shape(datain)[1]-1:
                    dataout[ii,jj,0] = datain[ii,jj] - (datain[ii,jj]-datain[ii,jj-1])/2.
                    dataout[ii,jj,1] = datain[ii,jj] - (datain[ii+1,jj]-datain[ii,jj])/2.
                    dataout[ii,jj,2] = datain[ii,jj] + (datain[ii,jj]-datain[ii,jj-1])/2.
                    dataout[ii,jj,3] = datain[ii,jj] + (datain[ii+1,jj]-datain[ii,jj])/2.
                else:
                    dataout[ii,jj,0] = datain[ii,jj] - (datain[ii,jj+1]-datain[ii,jj])/2.
                    dataout[ii,jj,1] = datain[ii,jj] - (datain[ii+1,jj]-datain[ii,jj])/2.
                    dataout[ii,jj,2] = datain[ii,jj] + (datain[ii,jj+1]-datain[ii,jj])/2.
                    dataout[ii,jj,3] = datain[ii,jj] + (datain[ii+1,jj]-datain[ii,jj])/2.

    return dataout

filename = sys.argv[1]
variable = sys.argv[2]
hdfpath = sys.argv[3]
netcdfpath = sys.argv[4]

# get date information from filename
year = int(substring_after(filename,'.A')[0:4])
dayofyear = int(substring_after(filename,'.A')[4:7])
hour = int(substring_after(filename,'.A')[8:10])
minute = int(substring_after(filename,'.A')[10:12])

filedate = date.fromordinal(date(year,1,1).toordinal() + int(filename[14:17]) - 1)

# how to use e.g.:
# python test_data.py MOD04_L2.A2012313.0545.006.2015064062534 Deep_Blue_Aerosol_Optical_Depth_550_Land inputpath outputpath

# first create a netcdf file using ncl
subprocess.call('ncl_convert2nc '+filename+'.hdf -i '+\
                hdfpath+' -o '+netcdfpath,shell=True)

# rename the file for temp use
subprocess.call('mv '+netcdfpath+filename+'.nc '+\
                netcdfpath+filename+'_temp.nc',shell=True)

# Open the file for reading
ncfile = Dataset(netcdfpath+filename+'_temp.nc','r')

# read in the longitudes
lonin = ncfile.variables['Longitude'][:]

# read in the latitudes
latin = ncfile.variables['Latitude'][:]

# read in the actual variable
varin = ncfile.variables[variable][:]

# Close the file
ncfile.close()

# remove temp file
subprocess.call('rm '+netcdfpath+filename+'_temp.nc',shell=True)

# Calculate bounds
lonbmap = make_bounds2D(lonin)
latbmap = make_bounds2D(latin)

# Open the file for writing
ncout = Dataset(netcdfpath+filename+'.nc', 'w', format="NETCDF4")

# create dimensions
time = ncout.createDimension("time", 1) # only one time step per file
lon = ncout.createDimension("lon", np.shape(varin)[1])
lat = ncout.createDimension("lat", np.shape(varin)[0])
nv = ncout.createDimension("nv", 4) # for grid corners

# create coordinate variables
times = ncout.createVariable("time","f8",("time",))
latitudes = ncout.createVariable("latitude","f8",("lat","lon",))
longitudes = ncout.createVariable("longitude","f8",("lat","lon",))
lat_bnds = ncout.createVariable("lat_bnds","f8",("lat","lon","nv",))
lon_bnds = ncout.createVariable("lon_bnds","f8",("lat","lon","nv",))

# actual variable
varout = ncout.createVariable(variable,"f8",("time","lat","lon",))

# create unit and attributes for all variables
ncout.description = "Read MODIS Terra data"
ncout.source = "NPL-Kunal Bali"

times.units = "hours since 0001-01-01 00:00:00.0"
times.calendar = "gregorian"

latitudes.units = "degrees_north"
longitudes.units = "degrees_east"
latitudes.bounds = "lat_bnds"
longitudes.bounds = "lon_bnds"

varout.units = ""
varout.long_name = "LST"

varout.coordinates = "latitude longitude"
varout.fillvalue = "-9.e+33"
varout.missing_value = "-9.e+33"

# write data to variables
dates = [datetime(filedate.year,filedate.month,filedate.day,hour,minute)]
times[:] = date2num(dates,units=times.units,calendar=times.calendar)

longitudes[:,:] = lonin[:,:]
latitudes[:,:] = latin[:,:]
lon_bnds[:,:,:] = lonbmap[:,:,:]
lat_bnds[:,:,:] = latbmap[:,:,:]

# actual variable
varout[0,:,:] = varin[:,:]

# Close the file
ncout.close()
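
The script above pulls the year, day of year, hour, and minute out of the filename with four separate slices. As a minimal sketch (not part of the original script), the same fields could be parsed in one step with `datetime.strptime`, assuming the filename follows the `PRODUCT.AYYYYDDD.HHMM...` pattern seen in this thread. The helper name `parse_modis_timestamp` is hypothetical.

```python
from datetime import datetime

def parse_modis_timestamp(filename):
    """Parse the acquisition time from a name like
    'MOD11_L2.A2003090.0450.006.2015187194709'.

    Assumes the text after '.A' starts with YYYYDDD.HHMM
    (year, day of year, hour, minute).
    """
    stamp = filename.partition(".A")[2][:12]  # e.g. "2003090.0450"
    # %j is the day of the year, so no manual ordinal arithmetic is needed
    return datetime.strptime(stamp, "%Y%j.%H%M")

when = parse_modis_timestamp("MOD11_L2.A2003090.0450.006.2015187194709")
print(when.isoformat())
```

A `ValueError` from `strptime` would signal that a filename does not match the assumed pattern, which the slicing approach would not detect until later.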

jswhit commented 7 years ago

I don't have time to read through all your code to try to figure out what you are asking. Can you please distill this down to a simple question and/or problem?

Note that unsigned integer types are not allowed in the classic netCDF formats (NETCDF3_* and NETCDF4_CLASSIC). You need to use NETCDF4 when using unsigned integer types.
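
To illustrate why the type matters: a 16-bit unsigned value above 32767 cannot be represented as a signed short. The sketch below uses only the standard library and assumes that a converter which writes an unsigned field as `short` keeps the bit pattern unchanged. That is an assumption for illustration; the thread does not say how the conversion treats such values. The helper names are hypothetical.

```python
import struct

def as_signed16(value):
    """Reinterpret the bits of an unsigned 16-bit integer as signed."""
    return struct.unpack("<h", struct.pack("<H", value))[0]

def as_unsigned16(value):
    """Recover the unsigned 16-bit value from its signed reinterpretation."""
    return value & 0xFFFF

raw = 40000                    # fits in unsigned 16-bit, not in signed 16-bit
stored = as_signed16(raw)      # what a signed short with the same bits holds
restored = as_unsigned16(stored)
print(raw, stored, restored)
```

Under that assumption, values up to 32767 are unaffected, while larger ones appear negative until they are masked back to the unsigned range.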

kunalbali commented 7 years ago

I did that as well: I changed NETCDF3 to NETCDF4, but I still get the same messages:

Classic mode NetCDf does not support unsigned integer types: LST will be written as type short

I have one Python script that converts HDF4 to netCDF and then extracts the desired variable from the file. It runs well with the MODIS-Terra L2 HDF4 aerosol and cloud datasets, but gives errors for the land surface temperature data. The first problem is the message above. The second is that the script cannot read the longitude and latitude variables and shows:

File "/home/kunal/mishra_sir/test_data.py", line 88, in <module>
    lonin = ncfile.variables['longitude'][:]
KeyError: 'longitude'

The temperature data is 2D, while the aerosol and cloud data are Geo2D.

jswhit commented 7 years ago

Where does the error message "Classic mode NetCDf does not support unsigned integer types: LST will be written as type short" come from? Is it emitted when you read the HDF data or when you attempt to write the netCDF data?

lonin = ncfile.variables['longitude'][:] KeyError: 'longitude'

means that the variable 'longitude' does not exist.
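
One way to check this is to list the names the file actually contains and look for near matches. The sketch below is a minimal illustration, assuming the converted name differs from the requested one only in case or by extra text around it. `find_variable` is a hypothetical helper, and the plain dict stands in for `ncfile.variables`, which is a dict-like mapping from variable names to variables.

```python
def find_variable(variables, wanted):
    """Return the names in `variables` that match `wanted`.

    Exact case-insensitive matches are preferred; otherwise any name
    containing `wanted` (case-insensitively) is returned.
    """
    wanted_lower = wanted.lower()
    exact = [name for name in variables if name.lower() == wanted_lower]
    if exact:
        return exact
    return [name for name in variables if wanted_lower in name.lower()]

# Stand-in for ncfile.variables; these names are made up for illustration.
variables = {"Longitude_example_suffix": None, "LST": None}

print(sorted(variables))                       # every name in the file
print(find_variable(variables, "longitude"))   # candidates for 'longitude'
```

With a real file, passing `ncfile.variables` in place of the dict would show whether a longitude variable exists under a different name.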

kunalbali commented 7 years ago

I am getting these error messages while writing the netCDF data.

The file does contain the Longitude variable:

float Latitude(Coarse_swath_lines_5km=406, Coarse_swath_pixels_5km=271);
  :long_name = "Latitude of every 5 scan lines and 5 pixels";
  :units = "degrees_north";
float Longitude(Coarse_swath_lines_5km=406, Coarse_swath_pixels_5km=271);
  :long_name = "Longitude of every 5 scan lines and 5 pixels";
  :units = "degrees_east";

But the script still fails with:

lonin = ncfile.variables['Longitude'][:]
KeyError: 'Longitude'