gbrener closed 7 years ago
I'm not sure what the underlying intent is here, but if you're just generating all these notebooks in order to run a similar set of steps on different data files, @jlstevens can show you how to do that using a single reference .ipynb file with a filename parameter widget, set up so that a continuous integration system can run that notebook for every file in a directory or every URL in a list.
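A rough sketch of the "run one notebook for every file in a directory" idea, for anyone following along. This is not the setup @jlstevens has in mind, just one common way to do it: the file name is passed through an environment variable and the notebook is executed with `jupyter nbconvert --execute`. The notebook name `explore.ipynb`, the `*.grb` pattern, and the `DATA_FILE` variable are all hypothetical.

```python
import os
from pathlib import Path


def notebook_jobs(notebook, data_dir, pattern="*.grb", env_var="DATA_FILE"):
    """Build one (command, environment) pair per data file found in data_dir.

    Nothing is executed here; the caller decides how to run each job.
    The env_var name is an assumption: the notebook itself would have to
    read os.environ[env_var] to find out which file to load.
    """
    jobs = []
    for data_file in sorted(Path(data_dir).glob(pattern)):
        # One executed copy of the notebook per data file.
        out_name = f"{Path(notebook).stem}-{data_file.stem}.ipynb"
        command = [
            "jupyter", "nbconvert", "--to", "notebook", "--execute",
            str(notebook), "--output", out_name,
        ]
        # Copy the current environment and add the file to process.
        env = dict(os.environ, **{env_var: str(data_file)})
        jobs.append((command, env))
    return jobs
```

A CI script could then loop over the result, e.g. `for command, env in notebook_jobs("explore.ipynb", "data"): subprocess.run(command, env=env, check=True)`, so that a failure in any one file fails the build.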
@jbednar The primary intent is data exploration for the purposes of feature engineering. I'll update the PR description to reflect this. I haven't analyzed the data enough to determine whether we should be treating every file type the same way - this was just a naive starting point. I'll get more information from @jlstevens about the parameter widget to see how it can help us here.
@gbrener I'll push up some changes in about an hour, then modify as needed. I'll just edit the one main notebook that gets copied currently.
Just pushed up an improvement to the notebooks which makes visualizing a bit easier, and solves the problem of not knowing how many attributes the data files have ahead of time. Here is what the first visualizations now look like:
Adjusted the undefined data values, so the visualizations look more accurate now:
These values should be ignored during analysis (see https://hydro1.gesdisc.eosdis.nasa.gov/data/NLDAS/NLDAS_FOR0125_H.001/doc/README.NLDAS1.pdf).
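A minimal sketch of what "ignoring" those values can look like, not the code in the notebook. It assumes the undefined marker is a single large sentinel value; the `9.999e20` used here is my reading of the README and should be checked against it. The function names are made up for illustration.

```python
import math

# Assumed sentinel for undefined data; verify against the NLDAS README.
UNDEFINED = 9.999e20


def mask_undefined(values, undefined=UNDEFINED, rel_tol=1e-6):
    """Return a copy of values with the undefined sentinel replaced by NaN."""
    return [
        math.nan if math.isclose(v, undefined, rel_tol=rel_tol) else v
        for v in values
    ]


def mean_ignoring_nan(values):
    """Mean of the non-NaN entries, or NaN if there are none."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan
```

Comparing with a relative tolerance rather than `==` guards against the sentinel having gone through a float32 round trip when the file was decoded.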
Pushing up changes in the next few minutes.
Just pushed up the one-notebook approach, using ipywidgets via the functions in example_utils.py. Updated this PR's description to reflect the new approach. Here is a screencast of what the data file selection looks like:
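For readers without the screencast, a minimal sketch of a file-selection widget of this kind. It is not the code in example_utils.py (which is not shown in this thread); `list_data_files`, `file_selector`, the `*.grb` pattern, and the `filename` keyword are all hypothetical. It relies only on `ipywidgets.Dropdown` and `ipywidgets.interactive`.

```python
from pathlib import Path


def list_data_files(data_dir, pattern="*.grb"):
    """Return the sorted paths of the data files in data_dir, as strings."""
    return sorted(str(p) for p in Path(data_dir).glob(pattern))


def file_selector(data_dir, on_select, pattern="*.grb"):
    """Build a dropdown that calls on_select(filename=...) when the choice changes.

    ipywidgets is imported lazily so that list_data_files stays usable
    outside of a Jupyter environment.
    """
    import ipywidgets as widgets

    dropdown = widgets.Dropdown(
        options=list_data_files(data_dir, pattern),
        description="Data file:",
    )
    # interactive() wires the dropdown's value to the callback's
    # `filename` keyword argument and returns a displayable container.
    return widgets.interactive(on_select, filename=dropdown)
```

In a notebook cell this would be used as `display(file_selector("data", show_file))`, where `show_file(filename)` is whatever plotting function should rerun for the chosen file.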
To aid with feature engineering in applying ML to NLDAS, I've created a notebook with visualizations showing one or more publicly available GRIB files.