openml / OpenML

Open Machine Learning
https://openml.org
BSD 3-Clause "New" or "Revised" License

redundant estimation procedures in DB #1216

Open degrave opened 3 months ago

degrave commented 3 months ago

In data/sql/estimation_procedure.sql:

(5, 1, '10% Holdout set', 'holdout', 1, NULL, 'false', 33, 'true', 'false', '2014-12-31 21:00:00'),
(6, 1, '33% Holdout set', 'holdout', 1, NULL, 'false', 33, 'true', 'false', '2014-12-31 21:00:00'),

Surely the first 33 (the one in the '10% Holdout set' record) ought to be 10?

These records (and possibly others) seem redundant (and the 10% case also has a wrong percentage field):

(27, 2, 'Test on Training Data', 'testontrainingdata', NULL, NULL, 'false', NULL, NULL, 'false', '2019-03-16 11:30:14'),
(29, 9, '10-fold Crossvalidation', 'crossvalidation', 1, 10, 'false', NULL, 'true', 'false', '2014-12-31 20:00:00'),
(30, 10, '10-fold Crossvalidation', 'crossvalidation', 1, 10, 'false', NULL, 'true', 'false', '2023-02-22 11:46:54'),
(31, 10, '5 times 2-fold Crossvalidation', 'crossvalidation', 5, 2, 'false', NULL, 'true', 'false', '2023-02-22 11:46:54'),
(32, 10, '10 times 10-fold Crossvalidation', 'crossvalidation', 10, 10, 'false', NULL, 'true', 'false', '2023-02-22 11:46:54'),
(33, 10, '10% Holdout set', 'holdout', 1, NULL, 'false', 33, 'true', 'false', '2023-02-22 11:46:54'),
(34, 10, '33% Holdout set', 'holdout', 1, NULL, 'false', 33, 'true', 'false', '2023-02-22 11:46:54'),
(35, 11, '33% Holdout set', 'holdout', 1, NULL, 'false', 33, 'true', 'false', '2023-06-15 16:34:54');
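
For anyone who wants to check the rest of the table, here is a rough Python sketch of the two checks I did by eye. The column names in `COLUMNS` are my assumption (the rows above show only values), the helper names are made up for illustration, and "redundant" is interpreted narrowly as "same task type and same settings, different name", which may not be the only sense that matters.

```python
import re
from collections import defaultdict

# ASSUMPTION: column order guessed from the values quoted in this issue.
COLUMNS = ("id", "ttid", "name", "type", "repeats", "folds", "samples",
           "percentage", "stratified_sampling", "custom_testset", "date")

# Rows copied from the issue text (SQL NULL written as None).
RECORDS = [
    (5, 1, "10% Holdout set", "holdout", 1, None, "false", 33, "true", "false", "2014-12-31 21:00:00"),
    (6, 1, "33% Holdout set", "holdout", 1, None, "false", 33, "true", "false", "2014-12-31 21:00:00"),
    (27, 2, "Test on Training Data", "testontrainingdata", None, None, "false", None, None, "false", "2019-03-16 11:30:14"),
    (29, 9, "10-fold Crossvalidation", "crossvalidation", 1, 10, "false", None, "true", "false", "2014-12-31 20:00:00"),
    (30, 10, "10-fold Crossvalidation", "crossvalidation", 1, 10, "false", None, "true", "false", "2023-02-22 11:46:54"),
    (31, 10, "5 times 2-fold Crossvalidation", "crossvalidation", 5, 2, "false", None, "true", "false", "2023-02-22 11:46:54"),
    (32, 10, "10 times 10-fold Crossvalidation", "crossvalidation", 10, 10, "false", None, "true", "false", "2023-02-22 11:46:54"),
    (33, 10, "10% Holdout set", "holdout", 1, None, "false", 33, "true", "false", "2023-02-22 11:46:54"),
    (34, 10, "33% Holdout set", "holdout", 1, None, "false", 33, "true", "false", "2023-02-22 11:46:54"),
    (35, 11, "33% Holdout set", "holdout", 1, None, "false", 33, "true", "false", "2023-06-15 16:34:54"),
]


def percentage_mismatches(records):
    """Ids whose name starts with 'N%' while the percentage field is not N."""
    bad = []
    for rec in records:
        row = dict(zip(COLUMNS, rec))
        match = re.match(r"(\d+)%", row["name"])
        if match and int(match.group(1)) != row["percentage"]:
            bad.append(row["id"])
    return bad


def same_settings_groups(records):
    """Groups of ids that agree on every column except id, name and date."""
    groups = defaultdict(list)
    for rec in records:
        row = dict(zip(COLUMNS, rec))
        key = tuple(row[c] for c in COLUMNS if c not in ("id", "name", "date"))
        groups[key].append(row["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    print("name/percentage mismatch:", percentage_mismatches(RECORDS))
    print("same settings, different name:", same_settings_groups(RECORDS))
```

On the rows quoted here, the first check flags ids 5 and 33, and the second pairs 5 with 6 and 33 with 34. Dropping `ttid` from the key as well would be a looser definition of redundancy; whether that is appropriate depends on how task types are meant to share procedures.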

Also, there is a dead link in https://github.com/openml/OpenML/blob/d2f1cab55d6bea95e9067c1a54af3a2b04b966ae/CONTRIBUTING.md