NASA-IMPACT / eclipse-che-jupyterhub-deployment

MIT License

Get an understanding from an EIS user group of their use cases to determine MVP features for the VEDA AP #73

Closed abarciauskas-bgse closed 2 years ago

abarciauskas-bgse commented 2 years ago

I took these rough notes from our meeting: Notes meeting with Eli Orland 08/25

j08lue commented 2 years ago

FWIW, here is a synthesis of Eli's workflow from a technical perspective.

High-level flow: Use MTBS as labels for generating training features from EVT data.

In order to run Eli's workflow in a cloud environment, we need to support the following steps:

  1. Load annual MTBS raster data (third-party) [cloud]
  2. Load annual EVT raster and vector data (third-party) [cloud]
  3. Load fire boundaries vector data (user-provided - I assume) [cloud]
  4. Subset MTBS and EVT to fire boundaries [local or cloud]
  5. Reproject rasters to be pixel-aligned [local]
  6. Combine all pixels and metadata into a tabular feature data store (Pandas dataframe, maybe persisted somehow) for subsequent model training and inference [local]
  7. Train model on feature data. [local]
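
To make step 6 a bit more concrete, here is a rough sketch of how pixel-aligned MTBS and EVT pixels could be flattened into a tabular feature store. This is not Eli's code. It assumes steps 4 and 5 have already produced equally shaped grids plus a boolean mask for the fire boundary. The function name `build_feature_rows`, the column names, and all toy values (including the EVT class codes) are made up for illustration, and plain Python lists stand in for real raster arrays.

```python
from typing import Dict, List, Optional

# A raster here is just rows of pixel values; None marks nodata.
# (Assumption: a real workflow would use arrays read from GeoTIFFs instead.)
Raster = List[List[Optional[int]]]


def build_feature_rows(
    mtbs: Raster,
    evt: Raster,
    fire_mask: List[List[bool]],
    fire_id: str,
    year: int,
) -> List[Dict[str, object]]:
    """Return one record per valid pixel inside the fire boundary.

    Hypothetical helper: MTBS provides the label, EVT provides the feature.
    """
    if not (len(mtbs) == len(evt) == len(fire_mask)):
        raise ValueError("rasters and mask must have the same number of rows")

    rows: List[Dict[str, object]] = []
    for r, (mtbs_row, evt_row, mask_row) in enumerate(zip(mtbs, evt, fire_mask)):
        if not (len(mtbs_row) == len(evt_row) == len(mask_row)):
            raise ValueError(f"row {r}: rasters and mask are not pixel-aligned")
        for c, (label, evt_class, inside) in enumerate(zip(mtbs_row, evt_row, mask_row)):
            # Skip pixels outside the fire boundary or with nodata in either raster.
            if not inside or label is None or evt_class is None:
                continue
            rows.append(
                {
                    "fire_id": fire_id,
                    "year": year,
                    "row": r,
                    "col": c,
                    "evt_class": evt_class,
                    "mtbs_severity": label,
                }
            )
    return rows


# Toy 2x3 grids with made-up values, only to show the shape of the output.
mtbs = [[1, 2, None], [3, 4, 2]]
evt = [[7001, 7001, 7002], [7002, None, 7003]]
fire_mask = [[True, True, True], [False, True, True]]

rows = build_feature_rows(mtbs, evt, fire_mask, fire_id="demo-fire", year=2020)
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(rows)` to get the dataframe mentioned in step 6. Keeping `fire_id` and `year` on every row means tables from several fires and years can be concatenated before training.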

Not sure what exactly the model is supposed to predict / classify and whether the result should be published anywhere. But the key job of this workflow is to generate the training features by combining and selecting input data.

The input data sources are described here: https://docs.google.com/spreadsheets/d/1uCDLYUUkSBhKBAmkkaPIxmB65W2HjaDLHRStNSmQTbQ/edit?usp=sharing

abarciauskas-bgse commented 2 years ago

Nice, thanks for doing this evaluation @j08lue

j08lue commented 2 years ago

Next step: A PoC JupyterLab environment where Eli's workflow can run, perhaps with some modifications to load data from the VEDA data store instead of from local disk.