wkearn closed 7 years ago
Thanks, Will! Thanks for catching this -- I basically haven't run `yatsm batch` since I (re)started mucking with the IO API. The longer-term goal of this submodule is basically a set of "drivers" that wrap individual data providers so the rest of YATSM can use a familiar API to reach data from USGS ARD data sources, the Australian Geoscience Data Cube, the "stacked" data that we've traditionally used, and really anything else that someone wants to use. I suspect the exact definition of the API will be in flux a bit until I get into implementing one of these other sources, but for getting to v0.7.0 I'll just stick with what seems to model the GDAL access route well enough.
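To make the "drivers" idea concrete, here is a minimal sketch of what a shared interface might look like. Every name in it (`DataDriver`, `StackedDriver`, `mean_per_band`) is hypothetical and chosen for illustration; none of it is YATSM's actual API, and plain lists stand in for real raster data.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence


class DataDriver(ABC):
    """Hypothetical common interface for data providers (not YATSM's real API)."""

    @property
    @abstractmethod
    def band_names(self) -> List[str]:
        """Names of the bands this provider can serve."""

    @abstractmethod
    def read(self, band_names: Sequence[str]) -> Dict[str, List[float]]:
        """Return the values for each requested band, keyed by band name."""


class StackedDriver(DataDriver):
    """Toy stand-in for a "stacked" image source, backed by an in-memory dict."""

    def __init__(self, data: Dict[str, List[float]]):
        self._data = dict(data)

    @property
    def band_names(self) -> List[str]:
        return list(self._data)

    def read(self, band_names: Sequence[str]) -> Dict[str, List[float]]:
        missing = [b for b in band_names if b not in self._data]
        if missing:
            raise KeyError(f"Unknown band(s): {missing}")
        return {b: self._data[b] for b in band_names}


def mean_per_band(driver: DataDriver, band_names: Sequence[str]) -> Dict[str, float]:
    """Calling code depends only on the DataDriver interface, not on the source."""
    values = driver.read(band_names)
    return {b: sum(v) / len(v) for b, v in values.items()}
```

Under this assumption, a driver for another provider would only need to implement `band_names` and `read`; code such as `mean_per_band` would not change.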
I'm curious what you think about the following:
- `bands` for the `read_dataarray` call, because it's shorter and matches better with `indexes` (which `rasterio.DatasetReader.read` uses as its argument, so `indexes` should be familiar)
- `band_names`, which I kinda like over just `bands` because it implies `str` inputs

At the risk of bike shedding... should they be the same name for consistency? If so, I think you just need to change the configuration file schema (here) to check for `bands` instead of `band_names`. I'm happy to merge now but just thought I'd ask you beforehand.
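As a rough illustration of how the two ideas relate, the sketch below translates band names into the 1-based band numbers that rasterio's `indexes` argument expects. The helper `band_names_to_indexes` and the example band list are assumptions made for this sketch, not code from YATSM.

```python
from typing import List, Sequence


def band_names_to_indexes(available: Sequence[str],
                          band_names: Sequence[str]) -> List[int]:
    """Map band names to 1-based band indexes (hypothetical helper).

    `available` lists the dataset's band names in file order.
    """
    # rasterio numbers bands starting at 1, so enumerate from 1
    lookup = {name: i for i, name in enumerate(available, start=1)}
    missing = [name for name in band_names if name not in lookup]
    if missing:
        raise ValueError(f"Unknown band name(s): {missing}")
    return [lookup[name] for name in band_names]


# Example band order is made up for illustration
indexes = band_names_to_indexes(["blue", "green", "red", "nir"], ["nir", "red"])
print(indexes)  # [4, 3]
```

The resulting list could then be passed as `indexes` to `rasterio.DatasetReader.read`, which is one reason a string-based name like `band_names` and an integer-based name like `indexes` can sit side by side without confusion.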
Thanks! I'm really thrilled you took a look and would be happy to chat next week when I'm back.
Re `bands` vs. `band_names`: you'll have to tell me what the Pythonic approach to these things is, but I would tend to prefer that the names indicate the types that you're expecting, which would favor `band_names` for `read_dataarray`. To me, `band_names` suggests an array of strings, while `bands` suggests something else: an array of `Band`s, whatever those might be? The arrays of data themselves? An array of band indexes (which is really just `indexes`)?
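One way to back up the name with an explicit check is sketched below: a type-annotated helper that accepts only a sequence of strings. `validate_band_names` is a hypothetical function written for this example; it is not part of YATSM or of `read_dataarray`.

```python
from typing import List, Sequence


def validate_band_names(band_names: Sequence[str]) -> List[str]:
    """Accept only a sequence of strings (hypothetical helper)."""
    # A bare string is itself a sequence of characters, so reject it explicitly
    if isinstance(band_names, str):
        raise TypeError("band_names must be a sequence of names, not a single string")
    bad = [b for b in band_names if not isinstance(b, str)]
    if bad:
        raise TypeError(f"band_names must contain only str, got: {bad}")
    return list(band_names)
```

With the annotation `Sequence[str]` and a parameter called `band_names`, both the signature and the runtime check say the same thing: integer indexes or array data belong in a differently named argument.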
But I'm not super committed one way or the other. If I understand correctly, end users who are going to be writing the config files probably aren't going to call `read_dataarray` themselves.
Re `batch`: is there a minimal make-it-go example somewhere for now? I have a bunch of Landsat stacks and a YAML config file that roughly follows this example, and I'd like to run the pipeline just on my local machine.
I encountered this while hunting down some errors I get when trying to run `batch`. This is not the end of that struggle, but it seemed like a pretty straightforward change. Let me know if I misunderstood what was supposed to be going on here.