smartcorelib / smartcore

A comprehensive library for machine learning and numerical computing. The library provides a set of tools for linear algebra, numerical computing, optimization, and enables a generic, powerful yet still efficient approach to machine learning.
https://smartcorelib.org/
Apache License 2.0

Instability in `svm::svc::tests::svc_fit_predict` test #158

titoeb closed this issue 2 years ago

titoeb commented 2 years ago

While working on the library, I have seen the svm::svc::tests::svc_fit_predict test fail randomly from time to time. Here is an example from a CI run, but it has also happened locally in a Linux container on my development machine.

Here is the log of the test suite.

test algorithm::neighbour::bbd_tree::tests::bbdtree_iris ... ok
test algorithm::neighbour::cover_tree::tests::cover_tree_test ... ok
test algorithm::neighbour::cover_tree::tests::cover_tree_test1 ... ok
test algorithm::neighbour::cover_tree::tests::serde ... ok
test algorithm::neighbour::fastpair::tests_fastpair::dataset_has_at_least_three_points ... ok
test algorithm::neighbour::fastpair::tests_fastpair::fastpair_closest_pair ... ok
test algorithm::neighbour::fastpair::tests_fastpair::fastpair_distances ... ok
test algorithm::neighbour::fastpair::tests_fastpair::fastpair_init ... ok
test algorithm::neighbour::fastpair::tests_fastpair::fastpair_new ... ok
test algorithm::neighbour::fastpair::tests_fastpair::one_dimensional_dataset_2 ... ok
test algorithm::neighbour::fastpair::tests_fastpair::one_dimensional_dataset_minimal ... ok
test algorithm::neighbour::linear_search::tests::knn_find ... ok
test algorithm::neighbour::linear_search::tests::knn_point_eq ... ok
test algorithm::sort::heap_select::tests::test_add ... ok
test algorithm::sort::heap_select::tests::test_add1 ... ok
test algorithm::sort::heap_select::tests::test_add2 ... ok
test algorithm::sort::heap_select::tests::test_add_ordered ... ok
test algorithm::sort::heap_select::tests::with_capacity ... ok
test algorithm::sort::quick_sort::tests::with_capacity ... ok
test cluster::dbscan::tests::fit_predict_dbscan ... ok
test cluster::dbscan::tests::serde ... ok
test cluster::kmeans::tests::fit_predict_iris ... ok
test cluster::kmeans::tests::invalid_k ... ok
test cluster::kmeans::tests::serde ... ok
test dataset::boston::tests::boston_dataset ... ok
test dataset::boston::tests::refresh_boston_dataset ... ignored
test dataset::breast_cancer::tests::cancer_dataset ... ok
test dataset::breast_cancer::tests::refresh_cancer_dataset ... ignored
test dataset::diabetes::tests::boston_dataset ... ok
test dataset::diabetes::tests::refresh_diabetes_dataset ... ignored
test dataset::digits::tests::digits_dataset ... ok
test dataset::digits::tests::refresh_digits_dataset ... ignored
test dataset::generator::tests::test_make_blobs ... ok
test dataset::generator::tests::test_make_circles ... ok
test dataset::generator::tests::test_make_moons ... ok
test dataset::iris::tests::iris_dataset ... ok
test dataset::iris::tests::refresh_iris_dataset ... ignored
test dataset::tests::as_matrix ... ok
test decomposition::pca::tests::decompose_correlation ... ok
test decomposition::pca::tests::decompose_covariance ... ok
test decomposition::pca::tests::pca_components ... ok
test decomposition::pca::tests::serde ... ok
test decomposition::svd::tests::serde ... ok
test decomposition::svd::tests::svd_decompose ... ok
test ensemble::random_forest_classifier::tests::fit_predict_iris ... ok
test ensemble::random_forest_classifier::tests::fit_predict_iris_oob ... ok
test ensemble::random_forest_classifier::tests::serde ... ok
test algorithm::neighbour::fastpair::tests_fastpair::fastpair_closest_pair_random_matrix ... ok
test ensemble::random_forest_regressor::tests::fit_longley ... ok
test ensemble::random_forest_regressor::tests::serde ... ok
test linalg::cholesky::tests::cholesky_decompose ... ok
test linalg::cholesky::tests::cholesky_solve_mut ... ok
test linalg::evd::tests::decompose_asymmetric ... ok
test linalg::evd::tests::decompose_complex ... ok
test linalg::evd::tests::decompose_symmetric ... ok
test linalg::lu::tests::decompose ... ok
test linalg::lu::tests::inverse ... ok
test linalg::naive::dense_matrix::tests::ab ... ok
test linalg::naive::dense_matrix::tests::approximate_eq ... ok
test linalg::naive::dense_matrix::tests::col_matrix_to_row_vector ... ok
test linalg::naive::dense_matrix::tests::col_mean ... ok
test linalg::naive::dense_matrix::tests::copy_from ... ok
test linalg::naive::dense_matrix::tests::cov ... ok
test linalg::naive::dense_matrix::tests::dot ... ok
test linalg::naive::dense_matrix::tests::eye ... ok
test linalg::naive::dense_matrix::tests::from_array ... ok
test linalg::naive::dense_matrix::tests::from_to_row_vec ... ok
test linalg::naive::dense_matrix::tests::get_row ... ok
test linalg::naive::dense_matrix::tests::h_stack ... ok
test linalg::naive::dense_matrix::tests::iter ... ok
test linalg::naive::dense_matrix::tests::matmul ... ok
test linalg::naive::dense_matrix::tests::min_max_sum ... ok
test linalg::naive::dense_matrix::tests::norm ... ok
test linalg::naive::dense_matrix::tests::rand ... ok
test linalg::naive::dense_matrix::tests::reshape ... ok
test linalg::naive::dense_matrix::tests::row_column_vec_from_array ... ok
test linalg::naive::dense_matrix::tests::slice ... ok
test linalg::naive::dense_matrix::tests::softmax_mut ... ok
test linalg::naive::dense_matrix::tests::to_from_bincode ... ok
test linalg::naive::dense_matrix::tests::to_from_json ... ok
test linalg::naive::dense_matrix::tests::to_string ... ok
test linalg::naive::dense_matrix::tests::transpose ... ok
test linalg::naive::dense_matrix::tests::v_stack ... ok
test linalg::naive::dense_matrix::tests::vec_approximate_eq ... ok
test linalg::naive::dense_matrix::tests::vec_copy_from ... ok
test linalg::naive::dense_matrix::tests::vec_dot ... ok
test linalg::nalgebra_bindings::tests::abs_mut ... ok
test linalg::nalgebra_bindings::tests::add_sub_mul_div ... ok
test linalg::nalgebra_bindings::tests::approximate_eq ... ok
test linalg::nalgebra_bindings::tests::argmax ... ok
test linalg::nalgebra_bindings::tests::col_matrix_to_row_vector ... ok
test linalg::nalgebra_bindings::tests::col_mean ... ok
test linalg::nalgebra_bindings::tests::copy_from ... ok
test linalg::nalgebra_bindings::tests::copy_row_col_as_vec ... ok
test linalg::nalgebra_bindings::tests::dot ... ok
test linalg::nalgebra_bindings::tests::element_add_sub_mul_div ... ok
test linalg::nalgebra_bindings::tests::eye ... ok
test linalg::nalgebra_bindings::tests::get_row ... ok
test linalg::nalgebra_bindings::tests::get_row_col_as_vec ... ok
test linalg::nalgebra_bindings::tests::get_set_dynamic ... ok
test linalg::nalgebra_bindings::tests::get_set_vector ... ok
test linalg::nalgebra_bindings::tests::matmul ... ok
test linalg::nalgebra_bindings::tests::max_diff ... ok
test linalg::nalgebra_bindings::tests::min_max_sum ... ok
test linalg::nalgebra_bindings::tests::negative_mut ... ok
test linalg::nalgebra_bindings::tests::norm ... ok
test linalg::nalgebra_bindings::tests::ols_fit_predict ... ok
test linalg::nalgebra_bindings::tests::ones ... ok
test linalg::nalgebra_bindings::tests::pow_mut ... ok
test linalg::nalgebra_bindings::tests::rand ... ok
test linalg::nalgebra_bindings::tests::reshape ... ok
test linalg::nalgebra_bindings::tests::scalar_add_sub_mul_div ... ok
test linalg::nalgebra_bindings::tests::shape ... ok
test linalg::nalgebra_bindings::tests::slice ... ok
test linalg::nalgebra_bindings::tests::softmax_mut ... ok
test linalg::nalgebra_bindings::tests::to_from_row_vector ... ok
test linalg::nalgebra_bindings::tests::transpose ... ok
test linalg::nalgebra_bindings::tests::unique ... ok
test linalg::nalgebra_bindings::tests::vec_approximate_eq ... ok
test linalg::nalgebra_bindings::tests::vec_copy_from ... ok
test linalg::nalgebra_bindings::tests::vec_dot ... ok
test linalg::nalgebra_bindings::tests::vec_init ... ok
test linalg::nalgebra_bindings::tests::vec_len ... ok
test linalg::nalgebra_bindings::tests::vec_to_vec ... ok
test linalg::nalgebra_bindings::tests::vstack_hstack ... ok
test linalg::nalgebra_bindings::tests::zeros ... ok
test linalg::ndarray_bindings::tests::abs_mut ... ok
test linalg::ndarray_bindings::tests::add_element_mut ... ok
test linalg::ndarray_bindings::tests::add_mut ... ok
test linalg::ndarray_bindings::tests::approximate_eq ... ok
test linalg::ndarray_bindings::tests::argmax ... ok
test linalg::ndarray_bindings::tests::col_matrix_to_row_vector ... ok
test linalg::ndarray_bindings::tests::col_mean ... ok
test linalg::ndarray_bindings::tests::copy_from ... ok
test linalg::ndarray_bindings::tests::copy_row_col_as_vec ... ok
test linalg::ndarray_bindings::tests::div_element_mut ... ok
test linalg::ndarray_bindings::tests::div_mut ... ok
test linalg::ndarray_bindings::tests::dot ... ok
test linalg::ndarray_bindings::tests::eye ... ok
test linalg::ndarray_bindings::tests::from_to_row_vec ... ok
test linalg::ndarray_bindings::tests::get_col_as_vector ... ok
test linalg::ndarray_bindings::tests::get_row ... ok
test linalg::ndarray_bindings::tests::get_row_as_vector ... ok
test linalg::ndarray_bindings::tests::get_set ... ok
test linalg::ndarray_bindings::tests::lr_fit_predict_iris ... ok
test linalg::ndarray_bindings::tests::matmul ... ok
test linalg::ndarray_bindings::tests::max_diff ... ok
test linalg::ndarray_bindings::tests::min_max_sum ... ok
test linalg::ndarray_bindings::tests::mul_element_mut ... ok
test linalg::ndarray_bindings::tests::mul_mut ... ok
test ensemble::random_forest_regressor::tests::fit_predict_longley_oob ... ok
test linalg::ndarray_bindings::tests::negative_mut ... ok
test linalg::ndarray_bindings::tests::norm ... ok
test linalg::ndarray_bindings::tests::pow_mut ... ok
test linalg::ndarray_bindings::tests::rand ... ok
test linalg::ndarray_bindings::tests::reshape ... ok
test linalg::ndarray_bindings::tests::scalar_ops ... ok
test linalg::ndarray_bindings::tests::slice ... ok
test linalg::ndarray_bindings::tests::softmax_mut ... ok
test linalg::ndarray_bindings::tests::sub_element_mut ... ok
test linalg::ndarray_bindings::tests::sub_mut ... ok
test linalg::ndarray_bindings::tests::transpose ... ok
test linalg::ndarray_bindings::tests::unique ... ok
test linalg::ndarray_bindings::tests::vec_approximate_eq ... ok
test linalg::ndarray_bindings::tests::vec_copy_from ... ok
test linalg::ndarray_bindings::tests::vec_dot ... ok
test linalg::ndarray_bindings::tests::vec_get_set ... ok
test linalg::ndarray_bindings::tests::vec_len ... ok
test linalg::ndarray_bindings::tests::vec_to_vec ... ok
test linalg::ndarray_bindings::tests::vstack_hstack ... ok
test linalg::qr::tests::decompose ... ok
test linalg::qr::tests::qr_solve_mut ... ok
test linalg::stats::tests::mean ... ok
test linalg::stats::tests::scale ... ok
test linalg::stats::tests::std ... ok
test linalg::stats::tests::var ... ok
test linalg::svd::tests::decompose_asymmetric ... ok
test linalg::svd::tests::decompose_restore ... ok
test linalg::svd::tests::decompose_symmetric ... ok
test linalg::svd::tests::solve ... ok
test linalg::tests::matrix_from_csv::non_existant_input_file ... ok
test linalg::tests::matrix_from_csv::simple_read_default_csv ... ok
test linalg::tests::mean ... ok
test linalg::tests::std ... ok
test linalg::tests::take ... ok
test linalg::tests::take_second_column_from_matrix ... ok
test linalg::tests::var ... ok
test linalg::tests::vec_take ... ok
test linear::bg_solver::tests::bg_solver ... ok
test linear::elastic_net::tests::elasticnet_fit_predict1 ... ok
test linear::elastic_net::tests::elasticnet_longley ... ok
test linear::elastic_net::tests::serde ... ok
test linear::lasso::tests::lasso_fit_predict ... ok
test linear::lasso::tests::serde ... ok
test linear::linear_regression::tests::ols_fit_predict ... ok
test linear::linear_regression::tests::serde ... ok
test linear::logistic_regression::tests::binary_objective_f ... ok
test linear::logistic_regression::tests::lr_fit_predict ... ok
test linear::logistic_regression::tests::lr_fit_predict_binary ... ok
test linear::logistic_regression::tests::lr_fit_predict_iris ... ok
test linear::logistic_regression::tests::lr_fit_predict_multiclass ... ok
test linear::logistic_regression::tests::multiclass_objective_f ... ok
test linear::logistic_regression::tests::serde ... ok
test linear::ridge_regression::tests::ridge_fit_predict ... ok
test linear::ridge_regression::tests::serde ... ok
test math::distance::euclidian::tests::squared_distance ... ok
test math::distance::hamming::tests::hamming_distance ... ok
test math::distance::mahalanobis::tests::mahalanobis_distance ... ok
test math::distance::manhattan::tests::manhattan_distance ... ok
test math::distance::minkowski::tests::minkowski_distance ... ok
test math::distance::minkowski::tests::minkowski_distance_negative_p - should panic ... ok
test math::num::tests::f32_from_string ... ok
test math::num::tests::f64_from_string ... ok
test math::num::tests::sigmoid ... ok
test math::vector::tests::unique_with_indices ... ok
test metrics::accuracy::tests::accuracy ... ok
test metrics::auc::tests::auc ... ok
test metrics::cluster_hcv::tests::homogeneity_score ... ok
test metrics::cluster_helpers::tests::contingency_matrix_test ... ok
test metrics::cluster_helpers::tests::entropy_test ... ok
test metrics::cluster_helpers::tests::mutual_info_score_test ... ok
test metrics::f1::tests::f1 ... ok
test metrics::mean_absolute_error::tests::mean_absolute_error ... ok
test metrics::mean_squared_error::tests::mean_squared_error ... ok
test metrics::precision::tests::precision ... ok
test metrics::precision::tests::precision_multiclass ... ok
test metrics::r2::tests::r2 ... ok
test metrics::recall::tests::recall ... ok
test metrics::recall::tests::recall_multiclass ... ok
test model_selection::kfold::tests::numpy_parity_test ... ok
test model_selection::kfold::tests::numpy_parity_test_shuffle ... ok
test model_selection::kfold::tests::run_kfold_return_split_simple ... ok
test model_selection::kfold::tests::run_kfold_return_split_simple_shuffle ... ok
test model_selection::kfold::tests::run_kfold_return_test_indices_odd ... ok
test model_selection::kfold::tests::run_kfold_return_test_indices_simple ... ok
test model_selection::kfold::tests::run_kfold_return_test_mask_simple ... ok
test model_selection::tests::run_train_test_split ... ok
test model_selection::tests::test_cross_val_predict_knn ... ok
test model_selection::tests::test_cross_validate_biased ... ok
test model_selection::tests::test_cross_validate_knn ... ok
test naive_bayes::bernoulli::tests::bernoulli_nb_scikit_parity ... ok
test naive_bayes::bernoulli::tests::run_bernoulli_naive_bayes ... ok
test naive_bayes::bernoulli::tests::serde ... ok
test naive_bayes::categorical::tests::run_categorical_naive_bayes ... ok
test naive_bayes::categorical::tests::run_categorical_naive_bayes2 ... ok
test naive_bayes::categorical::tests::serde ... ok
test naive_bayes::gaussian::tests::run_gaussian_naive_bayes ... ok
test naive_bayes::gaussian::tests::run_gaussian_naive_bayes_with_priors ... ok
test naive_bayes::gaussian::tests::serde ... ok
test naive_bayes::multinomial::tests::multinomial_nb_scikit_parity ... ok
test naive_bayes::multinomial::tests::run_multinomial_naive_bayes ... ok
test naive_bayes::multinomial::tests::serde ... ok
test neighbors::knn_classifier::tests::knn_fit_predict ... ok
test neighbors::knn_classifier::tests::knn_fit_predict_weighted ... ok
test neighbors::knn_classifier::tests::serde ... ok
test neighbors::knn_regressor::tests::knn_fit_predict_uniform ... ok
test neighbors::knn_regressor::tests::knn_fit_predict_weighted ... ok
test neighbors::knn_regressor::tests::serde ... ok
test linalg::ndarray_bindings::tests::my_fit_longley_ndarray ... ok
test optimization::first_order::lbfgs::tests::lbfgs ... ok
test optimization::line_search::tests::backtracking ... ok
test preprocessing::categorical::tests::adjust_idxs ... ok
test preprocessing::categorical::tests::fail_on_bad_category ... ok
test preprocessing::categorical::tests::hash_encode_f64_series ... ok
test preprocessing::categorical::tests::matrix_transform_test ... ok
test preprocessing::categorical::tests::test_fit ... ok
test preprocessing::numerical::tests::helper_functionality::combine_three_columns ... ok
test preprocessing::numerical::tests::helper_functionality::negative_value_should_be_replace_with_minimal_positive_value ... ok
test preprocessing::numerical::tests::helper_functionality::zero_should_be_replace_with_minimal_positive_value ... ok
test preprocessing::numerical::tests::standard_scaler::dont_adjust_mean_if_used ... ok
test preprocessing::numerical::tests::standard_scaler::dont_adjust_std_if_used ... ok
test preprocessing::numerical::tests::standard_scaler::fit_for_random_values ... ok
test preprocessing::numerical::tests::standard_scaler::fit_for_simple_values ... ok
test preprocessing::numerical::tests::standard_scaler::fit_transform_random_values ... ok
test preprocessing::numerical::tests::standard_scaler::fit_transform_with_zero_variance ... ok
test preprocessing::numerical::tests::standard_scaler::replace_mean_with_zero_if_not_used ... ok
test preprocessing::numerical::tests::standard_scaler::replace_std_with_one_if_not_used ... ok
test preprocessing::numerical::tests::standard_scaler::serde_fit_for_random_values ... ok
test preprocessing::numerical::tests::standard_scaler::transform_without_mean ... ok
test preprocessing::numerical::tests::standard_scaler::transform_without_std ... ok
test preprocessing::series_encoder::tests::category_map_and_vec ... ok
test preprocessing::series_encoder::tests::from_categories ... ok
test preprocessing::series_encoder::tests::invert_label_test ... ok
test preprocessing::series_encoder::tests::ordinal_encoding ... ok
test preprocessing::series_encoder::tests::positional_categories_vec ... ok
test preprocessing::series_encoder::tests::test_many_categorys ... ok
test readers::csv::tests::detect_row_format::detect_2_fields_with_header ... ok
test readers::csv::tests::detect_row_format::detect_3_fields_no_header ... ok
test readers::csv::tests::detect_row_format::detect_no_rows_provided ... ok
test readers::csv::tests::extract_fields_from_csv_row::read_four_values_from_csv_row ... ok
test readers::csv::tests::extract_row_vectors_from_csv_text::read_default_csv ... ok
test readers::csv::tests::extract_value_from_csv_field::cant_deserialize_f64_from_string ... ok
test readers::csv::tests::extract_value_from_csv_field::deserialize_f32_from_non_floating_point ... ok
test readers::csv::tests::extract_value_from_csv_field::deserialize_f64_from_floating_point ... ok
test readers::csv::tests::extract_value_from_csv_field::deserialize_f64_from_negative_floating_point ... ok
test readers::csv::tests::extract_value_from_csv_field::deserialize_f64_from_non_floating_point ... ok
test readers::csv::tests::extract_vector_from_csv_line::cannot_extract_second_value ... ok
test readers::csv::tests::extract_vector_from_csv_line::extract_five_floating_point_values ... ok
test readers::csv::tests::matrix_from_csv_source::different_number_of_columns ... ok
test readers::csv::tests::matrix_from_csv_source::error_in_colum_1_row_1 ... ok
test readers::csv::tests::matrix_from_csv_source::read_csv_semicolon_as_seperator ... ok
test readers::csv::tests::matrix_from_csv_source::read_simple_csv ... ok
test readers::csv::tests::matrix_from_csv_source::read_simple_string ... ok
test readers::csv::tests::test_validate_csv_row::invalid_number_of_fields ... ok
test readers::csv::tests::test_validate_csv_row::valid_row_with_comma ... ok
test readers::csv::tests::test_validate_csv_row::valid_row_with_semicolon ... ok
test readers::error::tests::extract_message_from_reading_error ... ok
test readers::error::tests::reading_error_from_io_error ... ok
test readers::io_testing::test::read_from_testing_data_source ... ok
test readers::io_testing::test::test_string_to_file ... ok
test readers::io_testing::test::test_temporary_text_file ... ok
test svm::svc::tests::svc_fit_decision_function ... ok
test svm::svc::tests::svc_fit_predict ... FAILED
test optimization::first_order::gradient_descent::tests::gradient_descent ... ok
test svm::svc::tests::svc_fit_predict_rbf ... ok
test svm::svr::tests::svr_fit_predict ... ok
test svm::svc::tests::svc_serde ... ok
test svm::tests::linear_kernel ... ok
test svm::tests::polynomial_kernel ... ok
test svm::tests::rbf_kernel ... ok
test svm::tests::sigmoid_kernel ... ok
test tree::decision_tree_classifier::tests::fit_predict_baloons ... ok
test tree::decision_tree_classifier::tests::fit_predict_iris ... ok
test tree::decision_tree_classifier::tests::gini_impurity ... ok
test tree::decision_tree_classifier::tests::serde ... ok
test tree::decision_tree_regressor::tests::fit_longley ... ok
test tree::decision_tree_regressor::tests::serde ... ok
test svm::svr::tests::svr_serde ... ok

failures:

---- svm::svc::tests::svc_fit_predict stdout ----
thread 'svm::svc::tests::svc_fit_predict' panicked at 'assertion failed: accuracy(&y_hat, &y) >= 0.9', src\svm\svc.rs:780:9

failures:
    svm::svc::tests::svc_fit_predict

test result: FAILED. 322 passed; 1 failed; 5 ignored; 0 measured; 0 filtered out; finished in 4.81s

error: test failed, to rerun pass '--lib'

Mec-iS commented 2 years ago

This is the same problem as in #157.

morenol commented 2 years ago

As per https://github.com/smartcorelib/smartcore/actions/runs/3091695574/jobs/5002085534

It looks like sometimes the accuracy is 0.8
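To make the numbers concrete, here is a minimal sketch of why an accuracy of 0.8 trips the `accuracy(&y_hat, &y) >= 0.9` assertion from the log. It assumes accuracy is the usual fraction of correct predictions; `accuracy_sketch` is a hypothetical helper, not smartcore's own `accuracy` function, and the labels are made up for illustration.

```rust
/// Fraction of predictions that match the true labels.
/// Hypothetical helper for illustration; not smartcore's `accuracy`.
pub fn accuracy_sketch(y_true: &[i32], y_pred: &[i32]) -> f64 {
    assert_eq!(
        y_true.len(),
        y_pred.len(),
        "label vectors must have equal length"
    );
    if y_true.is_empty() {
        return 0.0;
    }
    // Count positions where the prediction equals the true label.
    let correct = y_true
        .iter()
        .zip(y_pred)
        .filter(|(t, p)| t == p)
        .count();
    correct as f64 / y_true.len() as f64
}

#[cfg(test)]
mod accuracy_sketch_tests {
    use super::*;

    #[test]
    fn two_misses_out_of_ten_fall_below_threshold() {
        // Made-up labels: the last two predictions are wrong.
        let y = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1];
        let y_hat = [1, 1, 1, 1, 1, -1, -1, -1, 1, 1];
        let acc = accuracy_sketch(&y, &y_hat);
        assert!((acc - 0.8).abs() < 1e-12);
        assert!(acc < 0.9); // the check from the failing test would not hold
    }
}
```

On a small test set, a difference of one or two predictions between runs is enough to move the score across a fixed threshold.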

Mec-iS commented 2 years ago

Yep, @montanalow provided a test in #157 with a dataset that is supposed to yield a higher accuracy, and that test fails.

montanalow commented 2 years ago

I think I was confused by a test failure here that I saw locally while developing, but didn’t repro when I opened the PR to demonstrate (which makes sense if it’s a flaky test). I’m not sure that the accuracy should actually be higher, so much as reproducible consistently. Can we eliminate all random variables with fixed seeds for tests?
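A rough sketch of what fixed seeds could look like in a test, assuming the flakiness comes from an unseeded random source somewhere in training (this thread does not confirm where). The generator is a hand-written SplitMix64 so the example needs no dependencies; `SplitMix64` and `seeded_permutation` are hypothetical names, not smartcore's or the `rand` crate's API.

```rust
/// Minimal SplitMix64 generator: the same seed always yields the same sequence.
/// Hand-written for illustration; not part of smartcore or the `rand` crate.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Fisher-Yates shuffle of the indices 0..n, driven only by `seed`.
/// A training loop that visits samples in this order is repeatable run to run.
pub fn seeded_permutation(n: usize, seed: u64) -> Vec<usize> {
    let mut rng = SplitMix64::new(seed);
    let mut idx: Vec<usize> = (0..n).collect();
    for i in (1..n).rev() {
        // The modulo adds a slight bias, which is acceptable for a test fixture.
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        idx.swap(i, j);
    }
    idx
}

#[cfg(test)]
mod seeded_permutation_tests {
    use super::*;

    #[test]
    fn same_seed_gives_same_order() {
        // Two independent calls with the same seed must agree exactly.
        assert_eq!(seeded_permutation(20, 42), seeded_permutation(20, 42));
    }
}
```

With the seed passed in explicitly, a failing run can be replayed exactly by reusing the seed, which makes it possible to tell a genuinely low accuracy apart from an unlucky draw.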