barronh / pseudonetcdf

PseudoNetCDF like NetCDF except for many scientific format backends
GNU Lesser General Public License v3.0

Unable to load Cloud Rain & Snow CAMX files #75

Status: Closed (closed by ZhongSiming 4 years ago)

ZhongSiming commented 4 years ago

First, thanks for writing this package. I'm working on a project that would have taken much longer to get started if this package didn't exist. I'm using PseudoNetCDF v3.1.0 with Python v3.6.7.

I'm attaching the 2 file types I'm having issues with. They are both single day input files. One is for cloud and rain data and the other is snow.

Here's the code I'm using. All of the other input files use the same number of rows and columns, and I get the same errors even without specifying the number of rows and columns.

import PseudoNetCDF as pnc

crpath = 'Input/1990/meteorology/camx.cr.19900115'
cr = pnc.pncopen(crpath, format = 'camxfiles.cloud_rain.Memmap.cloud_rain', rows = 82, cols = 132)
print(cr)

The error for this is:

Traceback (most recent call last):

  File "<ipython-input-9-b9f15b943b9c>", line 1, in <module>
    cr = pnc.pncopen(crpath, format = 'camxfiles.cloud_rain.Memmap.cloud_rain', rows = 82, cols = 132)

  File "Anaconda3\envs\Python-GPU\lib\site-packages\PseudoNetCDF\_getreader.py", line 153, in pncopen
    outfile = reader(*args, **kwds)

  File "Anaconda3\envs\Python-GPU\lib\site-packages\PseudoNetCDF\camxfiles\cloud_rain\Memmap.py", line 106, in __init__
    self.SDATE, self.STIME = self.variables['TFLAG'][0, 0, :]

  File "Anaconda3\envs\Python-GPU\lib\site-packages\PseudoNetCDF\core\_files.py", line 2629, in __missing__
    return self.__func(k)

  File "Anaconda3\envs\Python-GPU\lib\site-packages\PseudoNetCDF\camxfiles\cloud_rain\Memmap.py", line 138, in __var_get
    (rows * cols + 2) + 4)[:, 1] = hour

snpath = 'Input/1990/meteorology/camx.sn.19900115'
sn = pnc.pncopen(snpath, format = 'cloud_rain', rows = 82, cols = 132)

The error for this one is:

Traceback (most recent call last):

  File "<ipython-input-11-af24310e8869>", line 1, in <module>
    sn = pnc.pncopen(snpath, format = 'cloud_rain', rows = 82, cols = 132)

  File "Anaconda3\envs\Python-GPU\lib\site-packages\PseudoNetCDF\_getreader.py", line 153, in pncopen
    outfile = reader(*args, **kwds)

  File "Anaconda3\envs\Python-GPU\lib\site-packages\PseudoNetCDF\camxfiles\cloud_rain\Memmap.py", line 69, in __init__
    self.__memmap = memmap(rf, dtype='>f', mode='r', offset=offset)

  File "Anaconda3\envs\Python-GPU\lib\site-packages\numpy\core\memmap.py", line 236, in __new__
    raise ValueError("Size of available data is not a "

ValueError: Size of available data is not a multiple of the data-type size.

Any help would be appreciated! If you need any information, please let me know! Cheers, Sam

cloudrainsnow.zip

barronh commented 4 years ago

Thank you for posting a test case. The snow file is a text file and is not meant to be readable by the cloud_rain reader. The cloud file is more interesting, but I have figured it out. Your cloud file seems to be designed for pre-version 4.3 but has a header consistent with v4.3.
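On the snow file: NumPy's memmap raises "Size of available data is not a multiple of the data-type size" when the bytes available after the offset do not divide evenly by the item size, which is a common symptom of pointing a binary reader at a text file. A minimal pre-check sketch follows. The helper name `precheck_binary_candidate` is made up for illustration and is not part of PseudoNetCDF, and it ignores any header offset the real reader applies.

```python
import os


def precheck_binary_candidate(path, itemsize=4, sample_bytes=512):
    """Return a list of reasons why `path` may not be a 4-byte-float binary file.

    Hypothetical helper, not part of PseudoNetCDF. It checks the whole file
    size rather than size minus a header offset, so treat it as a rough hint.
    """
    problems = []
    size = os.path.getsize(path)
    if size % itemsize != 0:
        problems.append(
            "size %d is not a multiple of %d bytes" % (size, itemsize)
        )
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    # Printable ASCII plus tab, newline, and carriage return.
    looks_like_text = all(32 <= b < 127 or b in (9, 10, 13) for b in sample)
    if sample and looks_like_text:
        problems.append("first bytes are all printable ASCII (looks like text)")
    return problems
```

An empty list does not prove the file is a valid cloud/rain file; it only means neither of these two warning signs was found.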

The reader uses a heuristic that pre-version 4.3 has a shorter cldhdr (15 bytes instead of 20 bytes). Perhaps this was an oversimplification. Using files available to me, I was able to determine that older versions of cldhdr did not include the version (i.e., "_V4.3" or "_V5.3"). According to the manuals, the older versions had only three variables (CLOUD, PRECIP, COD) and the new files had five (CLOUD, RAIN, SNOW, GRAUPEL, COD).[1]

Your cloud rain cldhdr contains the version (i.e., "_V4.3"), but has only three variables. It is not yet clear to me if this is a fluke or if all newer inputs that use the old algorithm will have only 3 variables.
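To check which kind of header a given file carries, one option is to look at the first record directly. The sketch below assumes a Fortran-style unformatted record framed by 4-byte big-endian length markers; the function names and the example labels are invented for illustration, and a real header record may hold more fields than the label. It is not the logic used in Memmap.py.

```python
import struct


def read_first_record(raw):
    """Return the payload of the first length-framed record in `raw` bytes.

    Assumes 4-byte big-endian length markers before and after the payload.
    """
    (reclen,) = struct.unpack(">i", raw[:4])
    payload = raw[4:4 + reclen]
    (trailer,) = struct.unpack(">i", raw[4 + reclen:8 + reclen])
    if trailer != reclen:
        raise ValueError("leading and trailing record markers disagree")
    return payload


def describe_header(payload):
    """Report the payload length and whether it contains a '_V' version tag."""
    return {"length": len(payload), "has_version_tag": b"_V" in payload}


def fake_record(label):
    """Build a synthetic record around `label` (illustration only)."""
    marker = struct.pack(">i", len(label))
    return marker + label + marker


# Made-up labels; real files will contain their own header strings.
with_tag = describe_header(read_first_record(fake_record(b"EXAMPLE_V4.3 HEADER")))
without_tag = describe_header(read_first_record(fake_record(b"EXAMPLE HEADER")))
```

For a real file, `raw` could be the first few dozen bytes read with `open(path, "rb").read(64)`. A header that has the version tag but is followed by only three variables would match the mismatch described above.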

For now, you have two options:

  1. Create the files without _V4.3 in the file header.
  2. Replace your version of PseudoNetCDF/camxfiles/cloud_rain/Memmap.py with the attached file. The attachment is Memmap.txt, so be careful to rename it if you copy it into your installation. Note that you will have to restart Python for the updated code to take effect.
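For option 2, it helps to confirm which copy of Memmap.py your environment actually loads before replacing it. A small sketch using the standard library follows; `installed_module_path` is a hypothetical helper name, and the dotted module name is inferred from the file path in the traceback.

```python
import importlib.util


def installed_module_path(dotted_name):
    """Return the file path of an importable module, or None if not found.

    Note: resolving a dotted name imports its parent packages.
    """
    try:
        spec = importlib.util.find_spec(dotted_name)
    except ModuleNotFoundError:
        # A parent package in the dotted name is not installed.
        return None
    return None if spec is None else spec.origin


# Module name inferred from the traceback; prints None if it is not installed.
print(installed_module_path("PseudoNetCDF.camxfiles.cloud_rain.Memmap"))
```

Keeping a backup of the original file before overwriting it makes it easy to revert later.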

Please let me know which one you do and let me know how you created or received these files.

[1] http://www.camx.com/files/camxusersguide_v5-40.pdf pages 5-35 to 5-36 or pdf pages 147-148

Attachment: Memmap.txt

ZhongSiming commented 4 years ago

Thanks for the quick turnaround on this! I copied the new Memmap into the appropriate directory and the files read without any problems. I obtained the files from the CMU Center for Atmospheric Particle Studies. I've asked my contact for information on how the files were created and will post his answer when I receive it.

ZhongSiming commented 4 years ago

So, it turns out the person I received the files from had "hacked" Environ's wrf-camx program in order to make the files work better with PMCAMx.

See the commits here: https://bitbucket.org/pablogrb/wrfpmcamx_3.4/commits/032e67df1580f0164ca1bec94687674b97b31150

and here https://bitbucket.org/pablogrb/wrfpmcamx_3.4/commits/9e226ca413356fe1a1ab0ea47317ff3183d989e6