alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License

Lack of documentation #31

Closed: mfatihaktas closed this issue 6 years ago

mfatihaktas commented 6 years ago

I am having a hard time learning and using some of the functionality provided by the library.

For instance, how can we save the model that is returned by auto_arima, say to a file, then reuse it? I read in quick_start_example.ipynb that models are fully picklable, but it would have been great if these utilities were at least explained with a simple example, or it would be even better if a documentation is made available.

tgsmith61591 commented 6 years ago

"...it would be even better if a documentation is made available."

Documentation has been made available, and it's linked at the top of the repo:

[Screenshot: top-of-repo]

To save you the trouble, here is the documentation for the ARIMA estimator class, and here is the documentation for the auto-ARIMA selection function. And if you still want more, here is the API ref, which links to all the documentation of each method.

"...how can we save the model that is returned by auto_arima, say to a file, then reuse it?"

Pickling works the same way for the classes here as it does for any other Python object. Namely, pickle.dump:

# Note: the package was imported as `pyramid` when this was written;
# it has since been renamed to `pmdarima`.
from pmdarima.arima import auto_arima
from pmdarima.datasets import load_lynx
import numpy as np

# For serialization:
import joblib  # `sklearn.externals.joblib` has been removed from scikit-learn
import pickle

# Load data and fit a model
y = load_lynx()
arima = auto_arima(y, seasonal=True)

# Serialize with pickle, read it back and make a prediction
with open('arima.pkl', 'wb') as pkl:
    pickle.dump(arima, pkl)
with open('arima.pkl', 'rb') as pkl:
    pickle_preds = pickle.load(pkl).predict(n_periods=5)

# Or maybe joblib tickles your fancy
joblib.dump(arima, 'arima.pkl')
joblib_preds = joblib.load('arima.pkl').predict(n_periods=5)

# Show they're the same
print(np.allclose(pickle_preds, joblib_preds))

If your question is really more about serialization, you should see this Stack Overflow question. This is a pretty standard serialization pattern for Python, which is why it wasn't explicitly documented.
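For anyone who wants to see that standard pattern in isolation, here is a minimal sketch using only the standard library. The helper names `save_object` and `load_object` and the `fitted` dictionary are hypothetical stand-ins invented for illustration; they are not part of the library, and a real fitted model would simply take the dictionary's place.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Serialize any picklable object to `path`."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Read back an object written by `save_object`."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for a fitted model; a real estimator instance
# would be passed to save_object() in exactly the same way.
fitted = {"order": (1, 0, 1), "coefficients": [0.5, -0.25]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    save_object(fitted, path)
    restored = load_object(path)

# The restored object is an equal copy, not the same object in memory.
print(restored == fitted, restored is fitted)
```

One caveat that applies to any use of pickle: only load files you trust, since unpickling can execute arbitrary code.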

Final thoughts

I'll concede that the library could use more examples, and that's something I can put together. But keep in mind that I'm one guy with a full-time job, and I do this for nothing but my own amusement. I'm always happy to field issues as they come up, and I've integrated features requested by others in the past, but I've got to say... the way this issue was worded feels a little bit demanding. It's a bit silly to make demands of software you've gotten for free.

That said, if you end up adding an example or documentation that you think would be valuable before I get a chance, please do submit a PR. Otherwise I'll get to it when I can.

mfatihaktas commented 6 years ago

Thanks for taking the time to reply. Apologies if I came across as rude; I really did not mean to be demanding. I wrote without putting too much thought into it because I was trying to gauge whether I could rely on this library for building what I need, or whether it was time for me to start using R, which I have been avoiding ...

It turns out this library works well enough, and thank you for making it available as open source. Also, looking into the source code (which I should have done before creating the issue, I guess), I see it is very well commented, which is excellent and enough for my needs.

tgsmith61591 commented 6 years ago

No worries at all, and I'm glad the package meets your needs! It will continue to mature, and hopefully people have one less reason to ever need R. :-)