GeoscienceAustralia / agdc

Repository for Australian Geoscience Data Cube (AGDC) code
BSD 3-Clause "New" or "Revised" License

modify the pixel time series tool to return a dataframe #66

Open sixy6e opened 9 years ago

sixy6e commented 9 years ago

The pixel time series tool could be made more functional by incorporating a function that returns a pandas.DataFrame rather than outputting directly to disk.

Something along the lines of:

dataframe = retrieve_pixel_time_series(coord=(x, y), dsets=[ARG25, FC25, PQ25], pq_flags=None)

where:

coord: a tuple of x, y coordinates
dsets: a list of DatasetTypes
pq_flags: if None, no PQ masking is applied

A pandas.DataFrame is returned containing the same information as the current output, but it is time-series aware because the dataset.start_datetime timestamps are set as the DataFrame index.
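To make the proposal concrete, here is a rough sketch of what such a function might look like. It does not use the real AGDC API: the ACQUISITIONS list is a hypothetical in-memory stand-in for the values read from tiles, dataset types are plain strings rather than DatasetType members, the column names are invented, and pq_flags is assumed to be a single integer bit mask in which every requested bit must be set for an observation to be kept.

```python
from datetime import datetime

import pandas as pd

# Hypothetical stand-in for per-acquisition pixel values; the real tool
# would read these from ARG25/FC25/PQ25 tiles at the requested coordinate.
ACQUISITIONS = [
    {"start_datetime": datetime(2010, 1, 5), "ARG25_RED": 812, "FC25_PV": 41, "PQ25": 16383},
    {"start_datetime": datetime(2010, 1, 21), "ARG25_RED": 790, "FC25_PV": 44, "PQ25": 12287},
    {"start_datetime": datetime(2010, 2, 6), "ARG25_RED": 845, "FC25_PV": 39, "PQ25": 16383},
]


def retrieve_pixel_time_series(coord, dsets, pq_flags=None, acquisitions=ACQUISITIONS):
    """Return the pixel values at coord as a DataFrame indexed by start_datetime."""
    # coord is unused in this sketch; a real implementation would use it
    # to locate the pixel within each tile.
    frame = pd.DataFrame(acquisitions).set_index("start_datetime").sort_index()

    # Keep only the columns that belong to the requested dataset types.
    columns = [c for c in frame.columns if c.split("_")[0] in dsets]
    frame = frame[columns].copy()

    # Assumed masking rule: an observation is kept only if all bits in
    # pq_flags are set in its PQ value; otherwise its data become NaN.
    if pq_flags is not None and "PQ25" in frame.columns:
        clear = (frame["PQ25"] & pq_flags) == pq_flags
        for col in columns:
            if col != "PQ25":
                frame[col] = frame[col].where(clear)
    return frame


dataframe = retrieve_pixel_time_series(
    coord=(0, 0), dsets=["ARG25", "FC25", "PQ25"], pq_flags=None
)
```

Because the index is a DatetimeIndex, the usual pandas time-series operations (date slicing, resampling and so on) become available on the result without further conversion.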

The current class RetrievePixelTimeSeriesTool could make use of the same return structure, i.e. a pandas.DataFrame, allowing multiple output formats such as CSV, JSON and XLS to be written to disk.
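As an illustration of the output side, a writer along these lines could sit inside the tool. The function name write_time_series and the fmt argument are hypothetical, and the sample frame is made-up data; only the pandas calls (to_csv, to_json) are real. XLS output would go through DataFrame.to_excel, which needs an extra Excel writer package, so it is left out of this sketch.

```python
import io

import pandas as pd


def write_time_series(frame, stream, fmt="csv"):
    """Write a time-indexed DataFrame to an open text stream (hypothetical helper)."""
    if fmt == "csv":
        frame.to_csv(stream)
    elif fmt == "json":
        # One JSON object per timestamp, with ISO 8601 strings as keys.
        stream.write(frame.to_json(orient="index", date_format="iso"))
    else:
        raise ValueError("unsupported format: %s" % fmt)


# Made-up sample data standing in for a real pixel drill result.
sample = pd.DataFrame(
    {"ARG25_RED": [812, 790]},
    index=pd.to_datetime(["2010-01-05", "2010-01-21"]),
)
sample.index.name = "start_datetime"

csv_buffer = io.StringIO()
write_time_series(sample, csv_buffer, fmt="csv")
```

The same function accepts a file opened for writing in place of the StringIO buffer.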

What do you think? I can draft up an example soonish.

simonoldfield commented 9 years ago

The first aspect of this is to move the logic of performing the actual pixel drill into the API proper - e.g. into the datacube.api.utils module, as has been done for get_dataset_data_stack. I'm not sure whether the API call should return a pandas.DataFrame, or whether it should return some other data structure that the Retrieve Pixel Time Series command line tool code would then wrap to produce the output format.
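For comparison, the second option (a neutral structure from the API, wrapped by the caller) might look roughly like this. Everything here is hypothetical: PixelObservation, get_pixel_time_series and to_dataframe are invented names, and the returned values are hard-coded sample data rather than anything read through datacube.api.utils.

```python
from collections import namedtuple
from datetime import datetime

import pandas as pd

# Hypothetical plain record that an API-level pixel drill could return.
PixelObservation = namedtuple(
    "PixelObservation", ["start_datetime", "dataset", "band", "value"]
)


def get_pixel_time_series(coord):
    """Stand-in for the API call; returns hard-coded sample records."""
    return [
        PixelObservation(datetime(2010, 1, 5), "ARG25", "RED", 812),
        PixelObservation(datetime(2010, 1, 5), "FC25", "PV", 41),
        PixelObservation(datetime(2010, 1, 21), "ARG25", "RED", 790),
        PixelObservation(datetime(2010, 1, 21), "FC25", "PV", 44),
    ]


def to_dataframe(observations):
    """Wrap the plain records in a DataFrame indexed by start_datetime."""
    frame = pd.DataFrame(observations, columns=PixelObservation._fields)
    frame["column"] = frame["dataset"] + "_" + frame["band"]
    return frame.pivot(index="start_datetime", columns="column", values="value")


frame = to_dataframe(get_pixel_time_series((0, 0)))
```

With this split the API itself carries no pandas dependency, and the command line tool (or any other caller) decides whether it wants a DataFrame at all.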

This has always been the intention - but it hasn't been a high enough priority to move from my TODO list to the TODONE list.

I'll look at this real soon now.