TUW-GEO / ascat

Read and visualize data from the Advanced Scatterometer (ASCAT) on-board the series of Metop satellites
https://ascat.readthedocs.io/
MIT License

Rewrite cell/swath xarray readers as MultiFileHandlers #64

Open claytharrison opened 6 months ago

claytharrison commented 6 months ago

This pull request reimplements the reading/merging logic for swath and cell files using the structure established by MultiFileHandler, ChronFiles, etc. in the file_handling module.

As of this commit, readers for cell files are implemented (RaggedArray and OrthoMulti). Basic usage goes something like:

from ascat.read_native.cell_collection import RaggedArrayFiles, OrthoMultiArrayFiles
contiguous_ra_source = "/path/to/contiguous/sig0_12.5/metop_a"
indexed_ra_source = "/path/to/indexed/sig0_12.5/metop_a"
multisat_ra_source = "/path/to/indexed/sig0_12.5/"
orthomulti_source = "/path/to/era5_land_2023/"
orthomulti_grid = "/path/to/era5_land_2023/grid.nc"

# amazon chunk
# you can also query by list of location_id, cell number, or lon/lat coords
bbox = (-7, -4, -69, -65)

contiguous_ra_files = RaggedArrayFiles(contiguous_ra_source, product_id="sig0_12.5") 
indexed_ra_files = RaggedArrayFiles(indexed_ra_source, product_id="sig0_12.5")

# right now we just use the "all_sats" parameter to indicate if the files are nested within metop_a/metop_b/metop_c directories underneath
# the root dir. This is of course not general or ideal.
multisat_ra_files = RaggedArrayFiles(multisat_ra_source, product_id="sig0_12.5", all_sats=True)

# for orthomulti right now you just pass the grid file path as an argument and it will generate a pygeogrids object from that.
# the product_id doesn't do anything in this case.
orthomulti_files = OrthoMultiArrayFiles(orthomulti_source, product_id="this_doesnt_matter_in_this_case", grid=orthomulti_grid)

# extract the data

contiguous_ra_ds = contiguous_ra_files.extract(bbox=bbox)
indexed_ra_ds = indexed_ra_files.extract(bbox=bbox)
# ^ these two should be the same, since contiguous RAs are converted to indexed before merging

multisat_ra_ds = multisat_ra_files.extract(bbox=bbox)

orthomulti_ds = orthomulti_files.extract(bbox=bbox)
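
For anyone unfamiliar with the two ragged-array layouts mentioned in the comment above: a contiguous ragged array stores a count of observations per location, while an indexed one stores a location index per observation. The snippet below is only a minimal NumPy sketch of that conversion, not the code in this PR; the function names and the toy `row_size` values are made up for illustration.

```python
import numpy as np


def contiguous_to_indexed(row_size):
    """Expand per-location observation counts into a per-observation location index.

    Hypothetical helper for illustration; not part of the ascat package.
    """
    row_size = np.asarray(row_size, dtype=np.int64)
    # Location i is repeated row_size[i] times, in order.
    return np.repeat(np.arange(row_size.size), row_size)


def indexed_to_row_size(location_index, n_locations):
    """Recover per-location counts from a per-observation location index."""
    return np.bincount(location_index, minlength=n_locations)


# Toy example: 3 locations with 2, 0 and 3 observations respectively.
row_size = np.array([2, 0, 3])
location_index = contiguous_to_indexed(row_size)
recovered = indexed_to_row_size(location_index, n_locations=row_size.size)
```

The indexed form is convenient for merging because observations from several files can be concatenated in any order, as long as each one carries its own location index.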

To do:

claytharrison commented 6 months ago

I added a basic Swath reader but nothing for handling specific products yet. For now you can steal the information for a given product from xarray_io.py.

It tries to implement a spatial filter for the results of the time-based file search, to relatively quickly exclude unnecessary swath files from reading and merging. The concept was graciously stolen from a script of Pavan's. It seems like it works but I haven't done proper testing yet.

Using it should go something like this:

from datetime import datetime

from ascat.read_native.swath_collection import SwathFile
from ascat.read_native.swath_collection import SwathGridFiles
from fibgrid.realization import FibGrid

swath_path = "tests/ascat_test_data/hsaf/h129/swaths"
grid = FibGrid(6.25)
sf = SwathGridFiles(
    swath_path,
    cls=SwathFile,
    fn_templ="W_IT-HSAF-ROME,SAT,SSM-ASCAT-METOP{sat}-6.25-H129_C_LIIB_{date}_{placeholder}_{placeholder1}____.nc",
    sf_templ={"year_folder": "{year}"},
    grid=grid,
    fn_read_fmt=lambda timestamp: {
        "date": timestamp.strftime("%Y%m%d*"),
        "sat": "[ABC]",
        "placeholder": "*",
        "placeholder1": "*"
    },
    sf_read_fmt=lambda timestamp: {
        "year_folder": {
            "year": f"{timestamp.year}"
        },
    },
)
files = sf.search_period(
    datetime(2021, 1, 15),
    datetime(2021, 1, 30),
    date_field_fmt="%Y%m%d%H%M%S"
)
bbox = (-90, -4, -70, 20)

merged_ds = sf.extract(
    datetime(2021, 1, 15),
    datetime(2021, 1, 30),
    bbox=bbox,
    date_field_fmt="%Y%m%d%H%M%S"
)
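
To give a rough idea of the spatial pre-filter described above, here is a minimal sketch of the concept: after the time-based search returns candidate files, drop any file whose points never fall inside the bounding box. This is not the implementation in this PR. The helper names, the `footprints` mapping, and the assumed bbox order `(latmin, latmax, lonmin, lonmax)` are all hypothetical.

```python
import numpy as np


def intersects_bbox(lats, lons, bbox):
    """Return True if any (lat, lon) point lies inside the bounding box.

    Assumes bbox = (latmin, latmax, lonmin, lonmax); this ordering is an
    assumption made for the sketch.
    """
    latmin, latmax, lonmin, lonmax = bbox
    lats = np.asarray(lats)
    lons = np.asarray(lons)
    inside = (
        (lats >= latmin) & (lats <= latmax)
        & (lons >= lonmin) & (lons <= lonmax)
    )
    return bool(inside.any())


def filter_files(footprints, bbox):
    """Keep only the filenames whose footprint touches the bounding box.

    `footprints` is a hypothetical mapping of filename -> (lats, lons).
    """
    return [
        fname
        for fname, (lats, lons) in footprints.items()
        if intersects_bbox(lats, lons, bbox)
    ]


# Toy footprints standing in for the result of a time-based file search.
footprints = {
    "swath_a.nc": (np.array([-6.0, -5.0]), np.array([-68.0, -66.0])),
    "swath_b.nc": (np.array([40.0, 45.0]), np.array([10.0, 12.0])),
}
kept = filter_files(footprints, bbox=(-7, -4, -69, -65))
```

Only the files that survive this cheap check would then need to be opened and merged in full.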