ML Data Prep Zoo
A zoo of labelled datasets and ML models for data preparation tasks. Please refer to our paper for more details.
Task 1 (t1): ML Feature Type Inference (Multi-class classification)
Task 2 (t2): Category Deduplication (Binary classification)
Task 3 (t3): Embedded Number Extraction (Sequence-to-sequence learning)
Task 4 (t4): Detect Anomalous Categories (Binary classification)
Task 5 (t5): Multiple Number Units Detection (Binary classification)
Task 6 (t6): List Domain Extraction (Sequence-to-set-of-sequence learning)
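The task identifiers above (t1 to t6) can be handy as keys when scripting against the zoo. The following is only an illustrative sketch: the `Task` class, the `TASKS` mapping, and the `tasks_by_learning_type` helper are hypothetical names introduced here and are not part of the repository; only the ids, task names, and learning types are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical record for one data prep task (not a repository API)."""
    task_id: str
    name: str
    learning_type: str


# Ids, names, and learning types copied from the task list above.
TASKS = {
    t.task_id: t
    for t in (
        Task("t1", "ML Feature Type Inference", "Multi-class classification"),
        Task("t2", "Category Deduplication", "Binary classification"),
        Task("t3", "Embedded Number Extraction", "Sequence-to-sequence learning"),
        Task("t4", "Detect Anomalous Categories", "Binary classification"),
        Task("t5", "Multiple Number Units Detection", "Binary classification"),
        Task("t6", "List Domain Extraction", "Sequence-to-set-of-sequence learning"),
    )
}


def tasks_by_learning_type(learning_type: str) -> list[str]:
    """Return the ids of all tasks framed as the given learning problem."""
    return [t.task_id for t in TASKS.values() if t.learning_type == learning_type]


if __name__ == "__main__":
    # List the tasks that are framed as binary classification.
    for task_id in tasks_by_learning_type("Binary classification"):
        print(task_id, TASKS[task_id].name)
```

Grouping by learning type makes it easy to see that t2, t4, and t5 share the same problem framing, while t3 and t6 are sequence-generation tasks.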