Researchers looking for data from a particular domain have difficulty finding it when there is no dedicated repository, or when the data is spread across many different repositories. Subject area classification by authors or data centers is not standardized and does not scale well.
Proposal: automatically classify datasets based on title and abstract, using a simple and generic classification such as the OECD Fields of Science and Technology and a tool such as https://github.com/inspirehep/magpie.
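As a rough illustration of the idea, the sketch below assigns a dataset to one of the six top-level OECD fields using only its title and abstract. It does not use magpie or its API; a small multinomial naive Bayes classifier written with the Python standard library stands in for the real tool. The training records, the `train` and `classify` helpers, and the example dataset are all hypothetical and far too small for real use.

```python
import math
import re
from collections import Counter, defaultdict

# Top-level OECD Fields of Science and Technology (2007 revision).
FOS_FIELDS = {
    "1": "Natural sciences",
    "2": "Engineering and technology",
    "3": "Medical and health sciences",
    "4": "Agricultural sciences",
    "5": "Social sciences",
    "6": "Humanities",
}

# Hypothetical toy training records: (title, abstract, FOS code).
# A real system would need many labeled records per field.
TRAINING = [
    ("Galaxy redshift survey catalogue",
     "Spectroscopic measurements of galaxies and quasars.", "1"),
    ("Bridge vibration sensor logs",
     "Accelerometer readings from a steel bridge under traffic load.", "2"),
    ("Clinical trial patient outcomes",
     "Blood pressure of patients treated with a new drug.", "3"),
    ("Wheat yield field trials",
     "Crop yield and soil nitrogen for wheat varieties.", "4"),
    ("Household income survey",
     "Survey responses on income, education and employment.", "5"),
    ("Medieval manuscript transcriptions",
     "Transcribed Latin texts from monastery archives.", "6"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(records):
    """Count words per field and records per field."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for title, abstract, code in records:
        word_counts[code].update(tokenize(title + " " + abstract))
        class_counts[code] += 1
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def classify(title, abstract, model):
    """Return the FOS code with the highest naive Bayes log score."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    tokens = [t for t in tokenize(title + " " + abstract) if t in vocab]
    scores = {}
    for code in class_counts:
        total_words = sum(word_counts[code].values())
        # Log prior for the field.
        score = math.log(class_counts[code] / total_docs)
        for token in tokens:
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[code][token] + 1) / (total_words + len(vocab))
            )
        scores[code] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
code = classify(
    "Soil nitrogen and wheat crop data",
    "Yield measurements from field trials.",
    model,
)
print(code, FOS_FIELDS[code])
```

Restricting the labels to the six top-level fields keeps the classification simple and generic, as the proposal suggests; the same structure would extend to the second-level fields given enough labeled examples.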