Open adelmemariani opened 2 years ago
In principle, if institutions provide sufficiently detailed textual descriptions of their datasets' column names (and of the distinct values in categorical columns), candidate OEO concepts can be recommended to them automatically. The idea is a recommender engine inside the OEP that aligns dataset columns with OEO terms. I have already built a prototype; however, the performance of the final product depends on the quality of the descriptions.
As an example, this scenario dataset has metadata, and under 'resources' -> 'fields' there are some descriptions of the columns. Typically, however, these descriptions are not sufficient to infer an OEO concept for a column.
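To make the recommendation idea concrete, here is a minimal sketch of how a column description could be scored against candidate concepts. It is not the prototype mentioned above: it assumes a simple token-overlap (Jaccard) similarity, and the `CONCEPTS` dictionary, the `recommend` function, and the concept descriptions are hypothetical placeholders rather than actual OEO entries.

```python
import re

# Hypothetical stand-ins for OEO concepts: label -> short description.
# These are illustrative placeholders, not actual OEO entries.
CONCEPTS = {
    "year": "a calendar year used as a time reference",
    "region": "a geographic area or country covered by the study",
    "energy carrier": "a fuel or medium such as electricity gas or coal that carries energy",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(column_name, description, concepts=CONCEPTS, top_k=3):
    """Return up to top_k (label, score) pairs, best match first.

    Concepts that share no token with the column text are dropped.
    """
    query = tokenize(column_name + " " + description)
    scored = [
        (label, jaccard(query, tokenize(label + " " + text)))
        for label, text in concepts.items()
    ]
    scored = [(label, score) for label, score in scored if score > 0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    print(recommend("year", "calendar year of the scenario value"))
    print(recommend("x", ""))  # no description, so nothing to recommend
```

A column with no description shares no tokens with any concept and gets no recommendation, which is the failure mode described above. A real engine would presumably replace the token overlap with something stronger (for example text embeddings) and would draw the labels and definitions from the OEO itself.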
As of May 2nd, 2022, there are 238 unique column names in total across all uploaded scenario datasets. Some of them can be mapped easily to OEO concepts: