Cannot handle supervised tasks with estimation type as "Test on Training Data"

Neeratyoy commented 3 years ago

Description

Certain tasks on the OpenML server have an invalid or illegal estimation procedure type ("Test on Training Data"), which results in empty estimation parameters for the task.

Steps/Code to Reproduce

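import openml
from sklearn.svm import SVC

# Point the client at the OpenML test server
openml.config.start_using_configuration_for_example()

clf = SVC()
# Task 1202 on the test server uses the "Test on Training Data" estimation procedure
task = openml.tasks.get_task(1202)
run = openml.runs.run_model_on_task(model=clf, task=task)

run.publish()  # fails with an OpenMLServerException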

Expected results

run.publish() should succeed.

Misc. info

Task IDs on the production server whose estimation_procedure type is testontrainingdata:

# len(prod_tids) = 9
prod_tids = [168867, 189921, 211700, 211982, 233159, 233160, 317601, 317611, 360876]

Task IDs on the test server whose estimation_procedure type is testontrainingdata:

# len(test_tids) = 102
test_tids = [1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1202]

On both servers, no runs exist for any of these tasks.

PGijsbers commented 3 years ago

What is the produced error? The fact that no such runs exist on the server points towards a server error rather than an openml-python error. Can you confirm which end is responsible for the failed upload?

Neeratyoy commented 3 years ago

> What is the produced error?

OpenMLServerException: https://test.openml.org/api/v1/xml/run/ returned code 216: Error processing output data: illegal combination of evaluation measure attributes (repeat, fold, sample) - Measure(s): usercpu_time_millis_training(0, 0), wall_clock_time_millis_training(0, 0), usercpu_time_millis_testing(0, 0), usercpu_time_millis(0, 0), wall_clock_time_millis_testing(0, 0), wall_clock_time_millis(0, 0), predictive_accuracy(0, 0)
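To make the rejected measures easier to inspect, here is a minimal sketch that pulls each measure name and its index tuple out of the message above. `parse_rejected_measures` is a hypothetical helper written for this thread, not part of openml-python, and the regular expression assumes the message keeps the `name(i, j)` layout shown here.

```python
import re

# Server message copied from the exception above.
MESSAGE = (
    "Error processing output data: illegal combination of evaluation measure "
    "attributes (repeat, fold, sample) - Measure(s): "
    "usercpu_time_millis_training(0, 0), wall_clock_time_millis_training(0, 0), "
    "usercpu_time_millis_testing(0, 0), usercpu_time_millis(0, 0), "
    "wall_clock_time_millis_testing(0, 0), wall_clock_time_millis(0, 0), "
    "predictive_accuracy(0, 0)"
)


def parse_rejected_measures(message):
    """Hypothetical helper: map each rejected measure to its index tuple."""
    # Only look at the part of the message after the "Measure(s):" label.
    _, _, tail = message.partition("Measure(s):")
    pattern = re.compile(r"(\w+)\(([^)]*)\)")
    return {
        name: tuple(int(part) for part in indices.split(","))
        for name, indices in pattern.findall(tail)
    }


rejected = parse_rejected_measures(MESSAGE)
for name, indices in rejected.items():
    print(name, indices)
```

Every measure is reported with two indices, while the message names three attributes (repeat, fold, sample). Whether that mismatch is related to the failure is not established in this thread.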

I too think it is a server-side error and posted this issue to confirm the same.

However, I am not sure whether the predictions we generate locally match what the server expects, since this task type is uncommon. The test split contains the full dataset.
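For readers unfamiliar with this estimation type, the split can be pictured roughly as follows. This is only an illustrative sketch: `full_data_split` is a hypothetical helper, not an openml-python function, and it assumes a single (repeat, fold, sample) entry of (0, 0, 0) in which the training indices also cover every row, as the name "Test on Training Data" suggests.

```python
def full_data_split(n_rows):
    """Hypothetical helper: one split whose train and test sets are all rows."""
    all_rows = list(range(n_rows))
    # Assumed key layout: (repeat, fold, sample). Only one entry exists.
    return {(0, 0, 0): (all_rows, list(all_rows))}


splits = full_data_split(5)
train_idx, test_idx = splits[(0, 0, 0)]
print(train_idx, test_idx)
```

Under this assumption, every prediction belongs to repeat 0, fold 0, so any per-fold measure carries the same indices.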

@mfeurer

Neeratyoy commented 3 years ago

Hey @janvanrijn, any insight on this would be nice!
