tomthetrainer / June26Class

Materials for June 26 class

Basic DataVec for Sequences Lab? #5

Open tomthetrainer opened 7 years ago

tomthetrainer commented 7 years ago

@turambar @bpark738

I like to start with an early Lab in which they build about the simplest NN you could imagine. That Lab is really just to verify that they have a working environment and can use IntelliJ.

The second Lab I have is a DataVec Lab.

I demo a Spark analysis from the demos section, AbaloneDataTransform.

For this class it might be nice to demo something more appropriate to time series.

After the demo I have a basic DataVec Lab.

The code is DataVecLab: a basic ingest of Iris CSV data with a prebuilt neural network.

It might be nice to have a basic sequence-oriented DataVec Lab.

I could write it, but if either of you wanted to summarize the issues with reading time series data and the DataVec classes involved, that might make a nice intro.

turambar commented 7 years ago

@tomthetrainer perhaps skim our PhysioNet examples and see if the DataVec stuff there suffices?

Briefly, I think the main challenges are:

Any other comments @bpark738?

tomthetrainer commented 7 years ago

@turambar do you have content that describes these issues? Maybe slides from your GTC presentation? I would like to turn what you just said into some content for the class.

turambar commented 7 years ago

Alas no, but I can generate some over the next day or so.

bpark738 commented 7 years ago

Yeah, not sure how to deal with individual files for Spark. I'll message Alex Black and see if he recommends anything.

The only other challenge I can think of is the training time required for LSTMs, especially if there are a large number of time steps. This might make using Spark more critical.