Open zouharvi opened 5 years ago
Hi @zouharvi, it looks good to me, but just to be sure: have you been able to reproduce past experiments with the newest version? Thanks, Fred.
I'm looking into it right now. It appears that something is broken. For the command
python src/learn_model.py config/svr.cfg
the original output is:
INFO:root:mae = 0.6821668200788067
INFO:root:rmse = 0.8138935136352005
INFO:root:pearson_corrcoef = 0.5635631854543511 9.652977267423245e-37
while the new output is:
INFO:root:mae = 358.017652877413
INFO:root:rmse = 0.8981408837857772
INFO:root:pearson_corrcoef = 0.49060674728509823 6.075079526413846e-27
I found a bug in the metric creation. The test output now looks like the following. RMSE and the Pearson correlation coefficient are correct, and even the predicted values are the same, but MAE is way off. I'll look into it further.
INFO:root:mae = 382.47200100879235
INFO:root:rmse = 0.8138935136352005
INFO:root:pearson_corrcoef = 0.5635631854543511 9.652977267423245e-37
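A quick way to see that the MAE above cannot be right: for any set of predictions, MAE is never larger than RMSE, so an MAE in the hundreds next to an RMSE below 1 points to a bug in the MAE computation rather than in the model. The sketch below uses plain Python with illustrative helper names (`mae`, `rmse`, `metrics_look_consistent`) that are not taken from the repository.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared difference."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def metrics_look_consistent(mae_value, rmse_value, tol=1e-9):
    """Hypothetical sanity check: MAE must satisfy 0 <= MAE <= RMSE."""
    return 0.0 <= mae_value <= rmse_value + tol


# Values copied from the logs in this thread.
print(metrics_look_consistent(0.6821668200788067, 0.8138935136352005))  # True
print(metrics_look_consistent(382.47200100879235, 0.8138935136352005))  # False
```

A check like this could be run after evaluation to flag inconsistent metric values early.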
What do you get if you replace the computation of MAE, currently in learning/src/evaluation_measures.py, with the implementation from sklearn (from sklearn.metrics import mean_absolute_error)?
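For reference, a minimal sketch of what such a swap might look like. The toy arrays and the `mae_reference` helper are made up for illustration and do not reproduce the code in evaluation_measures.py. The hand-written version is kept only to cross-check the library call.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error


def mae_reference(y_true, y_pred):
    """Hand-written MAE, used only to cross-check the library result."""
    # Flatten both inputs so that shapes (n,) and (n, 1) are treated alike.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean(np.abs(y_true - y_pred)))


# Toy data (not from the WMT2012 experiments).
y_true = np.array([3.0, 2.5, 4.0, 1.0])
y_pred = np.array([2.5, 3.0, 4.0, 2.0])

print(mean_absolute_error(y_true, y_pred))  # 0.5
print(mae_reference(y_true, y_pred))        # 0.5
```

If the two values disagree on real predictions, the input shapes passed to the custom function would be a reasonable first thing to inspect.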
@fredblain This was the culprit. I didn't know it was implemented outside of the library. The outputs (metrics and data) for svr.cfg and crf.cfg (with WMT2012 data) are now identical.
@zouharvi, I don't know why it has been implemented that way, as MAE seems to be present in sklearn v0.15. @jsouza, do you remember the motivation behind it by any chance (although I know it's been a while ;)?
scikit-learn version 0.15.2 is outdated at this time:
scikit-learn installation with other projects (this can be solved by using vendor prefixes, but it's extra work)
There are some API-breaking changes when migrating from 0.15.2 to 0.20.3 (latest available on public PIP), which this pull request aims to fix.