WillKoehrsen / Data-Analysis

Data Science Using Python
https://medium.com/@williamkoehrsen/
MIT License

Change Stocker to avoid warning errors. #22

Closed QSCTech-Sange closed 5 years ago

QSCTech-Sange commented 5 years ago

In the latest Pandas, the .ix indexer is deprecated, so following the instructions triggers warnings. I changed .ix to .loc to avoid them.

Also, I found it really hard to install fbprophet on Windows 10, but "conda install -c conda-forge fbprophet" installs it easily, so I added that to README.md.

This is my first pull request, so please forgive me if I did something wrong. Really love this project. :)

WillKoehrsen commented 5 years ago

Thanks for this. I'll have to review the code, but you are right that .ix should no longer be used for indexing in Pandas (instead it should be .loc or .iloc). Also, installing fbprophet on Windows can be quite a challenge. Thanks for addressing both of these points!

WillKoehrsen commented 5 years ago

I think the best way for accessing the last row of a dataframe using .loc might be df.loc[df.index[-1], 'adj_close']. This seems a little more stable than df.loc[len(df) - 1, 'adj_close'] because the index might not be numeric. This was originally my error!

We could also use df.iloc[-1, :] although that doesn't get us the column by name. Another option is df.tail(1) and then subsetting to the column. (df.tail(1)['adj_close'])
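To make the comparison concrete, here is a minimal sketch of the three options on a small frame with a date index. The data and the variable names are made up for illustration; only the `adj_close` column name comes from the discussion above.

```python
import pandas as pd

# Hypothetical price data with a DatetimeIndex (not a 0..n-1 integer index).
df = pd.DataFrame(
    {"adj_close": [10.0, 10.5, 11.2]},
    index=pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"]),
)

# Label-based: look up the last index label, whatever its type is.
last_by_label = df.loc[df.index[-1], "adj_close"]

# Position-based: translate the column name into a position for .iloc.
last_by_position = df.iloc[-1, df.columns.get_loc("adj_close")]

# tail(1) returns a one-row frame, so subset the column and take its value.
last_by_tail = df.tail(1)["adj_close"].iloc[0]

# len(df) - 1 is a position, not a label, so .loc rejects it on this index.
# The exception type has varied across Pandas versions, so catch both.
try:
    df.loc[len(df) - 1, "adj_close"]
    positional_label_worked = True
except (KeyError, TypeError):
    positional_label_worked = False
```

On a default integer index the `len(df) - 1` form happens to work, which is why the problem only shows up once the index is dates or strings.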

QSCTech-Sange commented 5 years ago

I totally agree with the .loc part, so I changed them to df.loc[len(df) - 1, 'adj_close'] as you mentioned. I also found that some other Pandas and Matplotlib usages are deprecated. First, this warning appears:

"FutureWarning: Comparing Series of datetimes with 'datetime.date'. Currently, the 'datetime.date' is coerced to a datetime. In the future pandas will not coerce, and a TypeError will be raised. To retain the current behavior, convert the 'datetime.date' to a datetime with 'pd.Timestamp'."

I removed the .date() calls to resolve it.
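A minimal sketch of the idea, assuming a datetime column is being compared against a cutoff (the dates and variable names here are illustrative, not taken from Stocker): keep the value as a `pd.Timestamp`, or convert a `datetime.date` with `pd.Timestamp` as the warning suggests.

```python
import datetime

import pandas as pd

# Hypothetical datetime column, similar to a stock history's date column.
dates = pd.Series(pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"]))

# A Timestamp compares cleanly against a datetime Series.
start = pd.Timestamp("2018-01-03")
mask_from_timestamp = dates >= start

# start.date() would yield a plain datetime.date, which is what triggered
# the warning. If a datetime.date is all you have, convert it back first.
cutoff = datetime.date(2018, 1, 3)
mask_from_date = dates >= pd.Timestamp(cutoff)
```

Both masks select the same rows; the difference is only that neither comparison relies on Pandas coercing a `datetime.date`.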

Also, this warning appears:

"MatplotlibDeprecationWarning: examples.directory is deprecated; in the future, examples will be found relative to the 'datapath' directory."

So I replaced matplotlib.rcParams.update(matplotlib.rcParamsDefault) with matplotlib.rcdefaults() to avoid it. Those are the changes in this pull request.
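As a rough illustration of the replacement, the sketch below changes one setting and then restores the defaults with `matplotlib.rcdefaults()`. The choice of `lines.linewidth` is arbitrary and only serves to show the reset.

```python
import matplotlib

# Change a setting so there is something to reset (arbitrary example key).
matplotlib.rcParams["lines.linewidth"] = 5.0

# Restore Matplotlib's built-in defaults without copying every key from
# rcParamsDefault, which is what touched the deprecated entry.
matplotlib.rcdefaults()

restored = matplotlib.rcParams["lines.linewidth"]
```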

QSCTech-Sange commented 5 years ago

Some of the .date() calls were not completely removed, so I removed the rest. I also ran into a problem when using evaluate_prediction:

RuntimeWarning: invalid value encountered in sign
test['correct'] = (np.sign(test['pred_diff']) == np.sign(test['real_diff'])) * 1

I found that this happens because the first row is NaN, so I changed it to test['correct'] = (np.sign(test['pred_diff'][1:]) == np.sign(test['real_diff'][1:])) * 1
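Here is a small sketch of why the first row is NaN and what the slice does. It assumes the difference columns come from something like `.diff()`, and the numbers are invented. It uses `.iloc[1:]` to make the positional slice explicit.

```python
import numpy as np
import pandas as pd

# Hypothetical predictions and real values; diff() leaves NaN in row 0.
test = pd.DataFrame({"pred": [1.0, 2.0, 1.5, 3.0], "real": [1.0, 1.5, 2.0, 2.5]})
test["pred_diff"] = test["pred"].diff()
test["real_diff"] = test["real"].diff()

# Skip the first (NaN) row before taking signs, then compare directions.
pred_sign = np.sign(test["pred_diff"].iloc[1:])
real_sign = np.sign(test["real_diff"].iloc[1:])

# Assignment aligns on the index, so row 0 of 'correct' is left as NaN.
test["correct"] = (pred_sign == real_sign) * 1
```

One consequence worth noting: because the first row of `correct` stays NaN, the column becomes floating point, so any later count or mean should either skip that row or rely on Pandas ignoring NaN.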

It seems you are busy. Good luck!