The ontology should, perhaps, include high-level concepts of data science, such as "data cleaning/preprocessing", "inference", and "evaluation". Such concepts would clearly be useful, but they raise several difficulties. Unlike the concepts currently in the ontology, these high-level concepts are:
- informal and imprecise, i.e., they do not admit a clean mathematical description
- usually present only implicitly in code or natural-language text, i.e., they must be either inferred using NLP methods or manually annotated by the author of the data analysis
How to proceed is an open question.