blaylockbk / goes2go

Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.
https://goes2go.readthedocs.io/
MIT License

Errors when using ABI-L1b-Rad #25

Closed csteele2 closed 2 years ago

csteele2 commented 2 years ago

Perhaps this is an issue unique to arm64 or noarch packages on an Apple M1, but I ran into trouble using L1b data (I need the higher resolution to look at dust).

When using goes_latest:

from goes2go.data import goes_latest, goes_nearesttime

sat = goes_latest(satellite='17',  product='ABI-L1b-Rad', domain='C')
    📦 Finished downloading [16] files to [/Users/csteele/data/noaa-goes17/ABI-L1b-RadC].
    Output exceeds the size limit. Open the full output data in a text editor
    ---------------------------------------------------------------------------
    PicklingError                             Traceback (most recent call last)
    /Users/csteele/goes2go/goes2go.ipynb Cell 2 in <cell line: 3>()
          1 #time = '2022-02-15 19:00'
          2 #sat = goes_nearesttime(time, satellite='17', product='ABI-L1b-Rad', domain='C', overwrite=True)
    ----> 3 sat = goes_latest(satellite='17',  product='ABI-L1b-Rad', domain='C')

    File ~/miniforge3/lib/python3.9/site-packages/goes2go/data.py:516, in goes_latest(satellite, product, domain, return_as, download, overwrite, save_dir, bands, s3_refresh, verbose)
        514     return df
        515 elif return_as == "xarray":
    --> 516     return _as_xarray(df, **params)

    File ~/miniforge3/lib/python3.9/site-packages/goes2go/data.py:293, in _as_xarray(df, **params)
        290 inputs = [(src, save_dir, i, n) for i, src in enumerate(df.file, start=1)]
        292 with multiprocessing.Pool(cpus) as p:
    --> 293     results = p.starmap(_as_xarray_MP, inputs)
        294     p.close()
        295     p.join()

    File ~/miniforge3/lib/python3.9/multiprocessing/pool.py:372, in Pool.starmap(self, func, iterable, chunksize)
        366 def starmap(self, func, iterable, chunksize=None):
        367     '''
        368     Like `map()` method but the elements of the `iterable` are expected to
        369     be iterables as well and will be unpacked as arguments. Hence
        370     `func` and (a, b) becomes func(a, b).
    ...
         50     buf = io.BytesIO()
    ---> 51     cls(buf, protocol).dump(obj)
         52     return buf.getbuffer()

    PicklingError: Can't pickle <function _as_xarray_MP at 0x15e6bbca0>: import of module 'goes2go.data' failed

A different error occurs using goes_nearesttime:

time = '2022-02-15 19:00'
sat = goes_nearesttime(time, satellite='17', product='ABI-L1b-Rad', domain='C', overwrite=True)
    ---------------------------------------------------------------------------
    InvalidIndexError                         Traceback (most recent call last)
    /Users/csteele/goes2go/goes2go.ipynb Cell 2 in <cell line: 2>()
          1 time = '2022-02-15 19:00'
    ----> 2 sat = goes_nearesttime(time, satellite='17', product='ABI-L1b-Rad', domain='C', overwrite=True)

    File ~/miniforge3/lib/python3.9/site-packages/goes2go/data.py:612, in goes_nearesttime(attime, within, satellite, product, domain, return_as, download, overwrite, save_dir, bands, s3_refresh, verbose)
        610 df = df.sort_values("start")
        611 df = df.set_index(df.start)
    --> 612 nearest_time_index = df.index.get_indexer([attime], method="nearest")
        613 df = df.iloc[nearest_time_index]
        614 df = df.reset_index(drop=True)

    File ~/miniforge3/lib/python3.9/site-packages/pandas/core/indexes/base.py:3442, in Index.get_indexer(self, target, method, limit, tolerance)
       3439 self._check_indexing_method(method, limit, tolerance)
       3441 if not self._index_as_unique:
    -> 3442     raise InvalidIndexError(self._requires_unique_msg)
       3444 if not self._should_compare(target) and not is_interval_dtype(self.dtype):
       3445     # IntervalIndex get special treatment bc numeric scalars can be
       3446     #  matched to Interval scalars
       3447     return self._get_indexer_non_comparable(target, method=method, unique=True)

    InvalidIndexError: Reindexing only valid with uniquely valued Index objects

blaylockbk commented 2 years ago

Hi @weathaman2132

The default behavior of goes_latest and goes_nearesttime is to read a file into an xarray Dataset. It's not clearly documented, but since the ABI-L1b-Rad product has multiple channels, goes2go needs to know which "band" (channel) you want to load.
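A likely reading of the InvalidIndexError in the second traceback, sketched with pandas alone: when no band is selected, several files share the same scan start time, so the index built from "start" is not unique and a nearest-time lookup is rejected. The file listing below is made up for illustration, and nearest_row is a hypothetical helper that only mirrors the lines shown in the traceback. It is not the goes2go implementation.

```python
import pandas as pd
from pandas.errors import InvalidIndexError

# Hypothetical file listing: two bands from each scan share one start time.
df = pd.DataFrame(
    {
        "start": pd.to_datetime(
            [
                "2022-02-15 18:56",
                "2022-02-15 18:56",
                "2022-02-15 19:01",
                "2022-02-15 19:01",
            ]
        ),
        "band": [1, 2, 1, 2],
    }
)
attime = pd.Timestamp("2022-02-15 19:00")


def nearest_row(frame, when):
    """Return the row whose start time is nearest to `when`."""
    indexed = frame.sort_values("start").set_index("start")
    position = indexed.index.get_indexer([when], method="nearest")
    return indexed.iloc[position]


# With both bands present the index has duplicate timestamps, so the lookup fails.
try:
    nearest_row(df, attime)
    raised = False
except InvalidIndexError:
    raised = True

# Restricting the listing to one band makes the index unique again.
one_band = nearest_row(df[df["band"] == 1], attime)
```

Filtering to a single band before the lookup is the same effect that passing one integer to "bands" should have.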

This line reads the latest ABI-L1b-Rad file for Band 3 (note the "bands" argument):

sat = goes_latest(satellite='17', product='ABI-L1b-Rad', domain='C', bands=3)

If you just want to download the file and not read it with xarray, do this:

sat = goes_latest(satellite='17', product='ABI-L1b-Rad', domain='C', bands=3, return_as='filelist', download=True)

Side note: The argument is named "bands" (plural) because you can pass a list of integers to download multiple channels. But if you pass a list when reading data with xarray, you get the errors you ran into. I don't like the argument name "bands" and want to change it to "channel" in the future.
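If you need several channels as xarray objects, one workaround is to call the reader once per band instead of passing a list. The sketch below assumes only the keyword arguments already shown in this thread. load_each_band and example are hypothetical helpers, not part of goes2go.

```python
def load_each_band(loader, bands, **kwargs):
    """Call `loader` once per band and collect the results keyed by band number."""
    return {band: loader(bands=band, **kwargs) for band in bands}


def example():
    # Needs goes2go installed and network access, so it is not run on import.
    from goes2go.data import goes_latest

    return load_each_band(
        goes_latest,
        [1, 2, 3],
        satellite="17",
        product="ABI-L1b-Rad",
        domain="C",
    )
```

Each call passes a single integer to "bands", so every read goes through the single-band path that works above.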