Enhanced machine learning library tailored for data streams, featuring a Python API integrated with MOA backend support. This unique combination empowers users to leverage a wide array of existing algorithms efficiently while fostering the development of new methodologies in both Python and Java.
plot_predictions_vs_ground_truth only works with stream_from_file() when the input is a CSV file, as in the example below. With an ARFF file, or with the datasets functionality (which downloads an ARFF version of the data), we get an error. It most likely originates in the prequential_evaluation function and only becomes evident in the plot_predictions_vs_ground_truth function.
Once this is corrected, we will need to update the tutorial 01_evaluation.ipynb.
from capymoa.evaluation import prequential_evaluation
from capymoa.evaluation.visualization import plot_predictions_vs_ground_truth
from capymoa.regressor import KNNRegressor, AdaptiveRandomForestRegressor
from capymoa.stream import stream_from_file
stream = stream_from_file(path_to_csv_or_arff="../data/fried.csv", enforce_regression=True)
kNN_learner = KNNRegressor(schema=stream.get_schema(), k=5)
ARF_learner = AdaptiveRandomForestRegressor(schema=stream.get_schema(), ensemble_size=10)
# When we specify store_predictions and store_y, the results will also include all the predictions and all the ground truth y.
# It is useful for debugging and outputting the predictions elsewhere.
kNN_results = prequential_evaluation(stream=stream, learner=kNN_learner, window_size=5000, store_predictions=True, store_y=True)
# We don't need to store the ground-truth for every experiment, since it is always the same for the same stream
ARF_results = prequential_evaluation(stream=stream, learner=ARF_learner, window_size=5000, store_predictions=True)
# Plot only 200 predictions (see plot_interval)
plot_predictions_vs_ground_truth(kNN_results, ARF_results, ground_truth=kNN_results['ground_truth_y'], plot_interval=(0, 200))
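To narrow down whether the problem really starts in prequential_evaluation, it may help to inspect the stored outputs before plotting. The helper below is only a sketch and not part of CapyMOA: find_output_problems is a hypothetical name, and it assumes the stored predictions and ground truth can be treated as plain sequences of numbers.

```python
import math


def find_output_problems(predictions, ground_truth):
    """Return a list of human-readable problems found in stored outputs.

    Hypothetical diagnostic helper (not a CapyMOA function). An empty list
    means that nothing suspicious was detected.
    """
    problems = []
    if predictions is None:
        problems.append("predictions were not stored")
    if ground_truth is None:
        problems.append("ground truth was not stored")
    if problems:
        # Nothing else can be checked when one of the sequences is missing.
        return problems

    if len(predictions) != len(ground_truth):
        problems.append(
            f"length mismatch: {len(predictions)} predictions vs "
            f"{len(ground_truth)} ground-truth values"
        )

    # Report the first entry in each sequence that cannot be used as a number.
    for name, values in (("predictions", predictions), ("ground truth", ground_truth)):
        for i, value in enumerate(values):
            try:
                usable = not math.isnan(float(value))
            except (TypeError, ValueError):
                usable = False
            if not usable:
                problems.append(f"{name}[{i}] is not a usable number: {value!r}")
                break
    return problems
```

It could be called as find_output_problems(kNN_results['predictions'], kNN_results['ground_truth_y']) once for the CSV run and once for the ARFF run. The 'ground_truth_y' key appears in the snippet above; the 'predictions' key is an assumption and may differ. If the two runs give different reports, that would support the guess that the error originates before the plotting call.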