pnuu commented 2 years ago

When saving multiple datasets to a single CF/NetCDF4 file using the syntax introduced in https://github.com/pytroll/trollflow2/pull/51, I got random crashes with RuntimeError: NetCDF: Not a valid ID raised from within the xarray library. Some internet searching suggested that this happens when dimensions are used before they have been defined.

My solution is to add a config option that forces eager saving instead of delaying the saving and calling compute_writer_results() afterwards. With this PR, the crash has not occurred a single time over 50 consecutive runs.

[x] Tests added
[ ] Tests passed
[x] Passes flake8 trollflow2
[x] Fully documented

codecov[bot] commented 2 years ago

Codecov Report

Merging #138 (d4eb0ca) into main (73c9e35) will decrease coverage by 0.07%. The diff coverage is 100.00%.

:exclamation: Current head d4eb0ca differs from pull request most recent head dbd4e5d. Consider uploading reports for the commit dbd4e5d to get more accurate results

@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
- Coverage   95.56%   95.48%   -0.08%     
==========================================
  Files          11       11              
  Lines        2365     2372       +7     
==========================================
+ Hits         2260     2265       +5     
- Misses        105      107       +2

Flag	Coverage Δ
unittests	`95.48% <100.00%> (-0.08%)`	:arrow_down:

Impacted Files	Coverage Δ
trollflow2/plugins/__init__.py	`92.84% <100.00%> (-0.38%)`	:arrow_down:
trollflow2/tests/test_trollflow2.py	`99.47% <100.00%> (+<0.01%)`	:arrow_up:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 97882c0...dbd4e5d.

pnuu commented 2 years ago

Here's a short script using Satpy directly that demonstrates the random crashes:

#!/usr/bin/env python

import glob

import numpy as np
from satpy import Scene
from satpy.writers import compute_writer_results

COMPUTE = False

def main():
    fnames = glob.glob(
        "/home/lahtinep/data/satellite/polar/ears_pps/*48965.nc")
    dsets = ['cma', 'ct', 'ctth_alti', 'ctth_pres', 'ctth_tempe']
    glbl = Scene(reader='nwcsaf-pps_nc', filenames=fnames)
    glbl.load(dsets)
    dtype_int16 = {'dtype': np.int16}
    encoding = {'cma': dtype_int16, 'ct': dtype_int16}
    res = glbl.save_datasets(writer='cf', filename="/tmp/pps_test.nc",
                             encoding=encoding, include_lonlats=True, compute=COMPUTE)
    if not COMPUTE:
        compute_writer_results([res])

if __name__ == "__main__":
    main()

I'm trying to find an example that wouldn't require actual data.

pnuu commented 2 years ago

And the corresponding version as trollflow2.yaml:

product_list:

  output_dir:
    /tmp/
  fname_pattern:
    "{start_time:%Y%m%d_%H%M}_{platform_name}_{areaname}_EARS_PPS.nc"
  reader: nwcsaf-pps_nc
  subscribe_topics:
    - /test/ears/avhrr/pps/gatherer
  eager_writing: True

  areas:
    null:
      priority: 1
      areaname: swath
      products:
        ("cma", "ct", "ctth_alti", "ctth_pres", "ctth_tempe"):
          formats:
          - format: nc
            writer: cf
            encoding:
              cma:
                dtype: !!python/name:numpy.int16
              ct:
                dtype: !!python/name:numpy.int16
            include_lonlats: True

workers:
  - fun: !!python/name:trollflow2.plugins.create_scene
  - fun: !!python/name:trollflow2.plugins.load_composites
  - fun: !!python/name:trollflow2.plugins.resample
  - fun: !!python/name:trollflow2.plugins.save_datasets
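
As a rough illustration of how a flag like eager_writing could be turned into the compute argument used in the Satpy script above, here is a minimal sketch. The helper name build_save_kwargs, the default of delayed writing when the flag is missing, and the choice not to forward the format key are assumptions made for this example; this is not the code added by the PR.

```python
def build_save_kwargs(product_list, fmt):
    """Build keyword arguments for a save call from one format entry.

    Hypothetical helper for illustration only, not part of trollflow2.
    """
    # Assumption: a missing flag means delayed writing
    eager = bool(product_list.get("eager_writing", False))
    # Assumption: "format" is a config-only key and is not forwarded
    kwargs = {key: value for key, value in fmt.items() if key != "format"}
    # compute=True writes immediately, compute=False defers the write
    kwargs["compute"] = eager
    return kwargs


product_list = {"eager_writing": True}
fmt = {"format": "nc", "writer": "cf", "include_lonlats": True}
print(build_save_kwargs(product_list, fmt))
```

With compute set to False, the caller would still need to pass the returned objects to compute_writer_results(), as the script above does.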

pnuu commented 2 years ago

Another pure-Satpy example using SEVIRI HRIT data:


#!/usr/bin/env python

import glob

from satpy import Scene
from satpy.writers import compute_writer_results

COMPUTE = False

def main():
    fnames = glob.glob(
        "/home/lahtinep/data/satellite/geo/0deg/*202202031045*")
    dsets = ['VIS006', 'VIS008']
    glbl = Scene(reader='seviri_l1b_hrit', filenames=fnames)
    glbl.load(dsets)
    res = glbl.save_datasets(writer='cf', filename="/tmp/seviri_test.nc",
                             include_lonlats=True, compute=COMPUTE)
    if not COMPUTE:
        compute_writer_results([res])

if __name__ == "__main__":
    main()

pnuu commented 2 years ago

UPDATED VERSION

And a version that fails without any Pytroll code:

#!/usr/bin/env python

import datetime as dt

import numpy as np
import dask.array as da
import xarray as xr

COMPUTE = False
FNAME = "/tmp/xr_test.nc"

def main():
    y = np.arange(1000, dtype=np.uint16)
    x = np.arange(2000, dtype=np.uint16)
    now = dt.datetime.utcnow()
    times = xr.DataArray(np.array([now + dt.timedelta(seconds=i) for i in range(y.size)], dtype=np.datetime64),
                         coords={'y': y})

    # Write root
    root = xr.Dataset({}, attrs={'global': 'attribute'})
    written = [root.to_netcdf(FNAME, mode='w')]

    # Write first dataset
    data1 = xr.DataArray(da.random.random((y.size, x.size)), dims=['y', 'x'],
                         coords={'y': y, 'x': x, 'time': times})
    dset1 = xr.Dataset({'data1': data1})
    written.append(dset1.to_netcdf(FNAME, mode='a', compute=COMPUTE))

    # Write second dataset using the same time coordinates
    data2 = xr.DataArray(da.random.random((y.size, x.size)), dims=['y', 'x'],
                         coords={'y': y, 'x': x, 'time': times})
    dset2 = xr.Dataset({'data2': data2})
    written.append(dset2.to_netcdf(FNAME, mode='a', compute=COMPUTE))

    if not COMPUTE:
        da.compute(written)

if __name__ == "__main__":
    main()
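
Because the failure is intermittent, a single run says little. One way to estimate how often it happens is to run the script repeatedly and count the non-zero exit codes. The sketch below uses only the standard library; the function name count_failures is made up for this example.

```python
import subprocess
import sys


def count_failures(cmd, runs=50):
    """Run cmd `runs` times and count the runs that exit with a non-zero status."""
    failures = 0
    for _ in range(runs):
        # An uncaught Python exception makes the interpreter exit with a non-zero status
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            failures += 1
    return failures
```

For example, count_failures([sys.executable, "xr_test.py"]) would run the script above 50 times, assuming it was saved as xr_test.py (a placeholder file name).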

pnuu commented 2 years ago

Opened an issue in xarray: https://github.com/pydata/xarray/issues/6300

pnuu commented 2 years ago

I added a note to the example configuration file. This will most likely be a temporary solution until https://github.com/pydata/xarray/issues/6300 is fixed.

pytroll / trollflow2

Make it possible to do eager saving #138
