Update 'Baseline Results' for 'write-up.docx'

jeff1evesque / ist-736

Syracuse IST-736 Final Project


Update 'Baseline Results' for 'write-up.docx' #100

Closed jeff1evesque closed 4 years ago

jeff1evesque commented 5 years ago

We need to update our "Baseline Results" for our write-up.docx.

jeff1evesque commented 5 years ago

We need to update our visualizations to reflect that our earlier classifiers may now be trained on a reduced/filtered dataset.

jeff1evesque commented 5 years ago

We need to update our README.md, and discuss the grid search implementation for our ARIMA model.

jeff1evesque commented 5 years ago

Currently, our grid-search ARIMA model is generating an SVD error after a few models have been successfully trained:

    raise LinAlgError("SVD did not converge")
numpy.linalg.LinAlgError: SVD did not converge

Therefore, we may consider taking a log transform of the data before training. This means we'll need a function that can back-transform the data, or the predictions, to the expected scale.

jeff1evesque commented 5 years ago

We should discuss the use of the grid search, as well as the log transform implemented to better ensure convergence. Specifically, the log transform was used to decrease variability in the data, and potentially to help with stationarity. Additionally, a small constant of 0.01 was added to each time series value (each value falls in the range 0 to 1) prior to modeling, to eliminate the possibility of log(0), which breaks the corresponding modeling.
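For the write-up, a minimal sketch of the offset log transform and its inverse may help. The function names `log_transform` and `back_transform` are hypothetical and are not taken from the repository; only the 0.01 offset comes from the comment above.

```python
import math

OFFSET = 0.01  # small constant described above; keeps log() away from log(0)


def log_transform(series, offset=OFFSET):
    """Shift each value by `offset`, then take the natural log."""
    return [math.log(value + offset) for value in series]


def back_transform(log_values, offset=OFFSET):
    """Invert log_transform: exponentiate, then remove the offset."""
    return [math.exp(value) - offset for value in log_values]


# Illustrative values only, chosen to lie in the range 0 to 1.
series = [0.0, 0.25, 0.5, 1.0]
transformed = log_transform(series)
restored = back_transform(transformed)
```

The same `back_transform` would be applied to the model's predictions so they can be compared against the original scale.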

jeff1evesque commented 5 years ago

We need to mention the implementation of our dynamic auto_scale for the ts_index arima model.

jeff1evesque commented 5 years ago

We could mention that the steps integrated into the ARIMA model to better ensure stationarity could also be applied to the time series data prior to our peak detection methodology.

jeff1evesque commented 5 years ago

729f6c1, 541fb53: even though we ran the grid search from our model/timeseries.py on our training data, then performed a rolling prediction on the remaining test dataset:

        # induce stationarity
        result = model.grid_search(auto_scale=auto_scale)

Later datasets (i.e. bats--bats_mmt) failed to converge:

[...INITIAL-TRACE-OMITTED...]
    raise LinAlgError("SVD did not converge")
numpy.linalg.LinAlgError: SVD did not converge

Therefore, we'll run our custom model.grid_search() iteratively for each rolling time-series prediction. This will be computationally expensive, and will be done within our algorithm/arima.py. A more dynamic solution would require more thought, and is likely not worth the extra effort.
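A rough sketch of what re-running the grid search at every rolling step could look like. This is not the code in algorithm/arima.py: `rolling_grid_search_forecast`, `fit_and_forecast`, and `toy_fit_and_forecast` are hypothetical names, and the toy function is a stand-in for an actual ARIMA fit.

```python
from itertools import product


def rolling_grid_search_forecast(train, test, fit_and_forecast, orders):
    """One-step-ahead rolling forecast, repeating the grid search each step.

    fit_and_forecast(history, order) -> (score, prediction) is a hypothetical
    callable; it may raise when a fit fails to converge.
    """
    history = list(train)
    predictions = []
    for actual in test:
        best = None
        for order in orders:
            try:
                score, prediction = fit_and_forecast(history, order)
            except Exception:
                continue  # this order failed; move on to the next candidate
            if best is None or score < best[0]:
                best = (score, prediction)
        # None marks a step where no candidate order could be fitted
        predictions.append(best[1] if best else None)
        history.append(actual)  # roll forward using the observed value
    return predictions


def toy_fit_and_forecast(history, order):
    """Stand-in for an ARIMA fit: averages the last p values (assumption)."""
    p, _d, _q = order
    if p == 0:
        raise ValueError("toy model cannot fit with p=0")
    window = history[-p:]
    prediction = sum(window) / len(window)
    score = abs(history[-1] - prediction)  # lower is better
    return score, prediction


orders = list(product(range(3), range(2), range(2)))  # candidate (p, d, q)
predictions = rolling_grid_search_forecast(
    [0.1, 0.2, 0.3], [0.4, 0.5], toy_fit_and_forecast, orders
)
```

The nested loop makes the cost visible: every prediction step pays for a full pass over all candidate orders.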

jeff1evesque commented 5 years ago

Furthermore, we'll need to update our write-up.docx to discuss the rolling grid-search implementation used for the arima modeling.

jeff1evesque commented 5 years ago

78e11aa: we should also state that if our arima model does not converge (even with the rolling grid-search), we will catch the error, then throw out the current modeling attempt.

jeff1evesque commented 5 years ago

ab18722: the rolling_grid_search is computationally too expensive. Instead, we will run the catch_grid_search only when an exception is raised during model.fit. If that subsequent grid search also fails, the error will be caught and the modeling attempt thrown out.
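The control flow described here could be sketched roughly as follows. The names `fit_with_fallback` and `toy_fit` are hypothetical and do not come from the repository. As a simplification, the fallback returns the first candidate order that fits, whereas an actual grid search would score the candidates and keep the best one.

```python
def fit_with_fallback(fit, default_order, candidate_orders):
    """Try the default order first; search other orders only if that fit raises.

    Returns (order, fitted_model) on success, or None when every attempt
    fails, signalling that the caller should discard this modeling attempt.
    """
    try:
        return default_order, fit(default_order)
    except Exception:
        pass  # fall through to the fallback search below
    for order in candidate_orders:
        try:
            return order, fit(order)
        except Exception:
            continue  # this candidate also failed; try the next one
    return None


def toy_fit(order):
    """Stand-in for model.fit (assumption): fails whenever p == 0."""
    if order[0] == 0:
        raise ArithmeticError("did not converge")
    return f"model{order}"


recovered = fit_with_fallback(toy_fit, (0, 1, 0), [(0, 0, 0), (1, 1, 0)])
discarded = fit_with_fallback(toy_fit, (0, 1, 0), [(0, 0, 0)])
```

Here `recovered` holds the first order that fit, while `discarded` is `None`, which corresponds to throwing out the modeling attempt.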