SN-HackDay / Advancing-discovery-with-research-data

Springer Nature hack day in London 29th November

Hack idea: automatically classify datasets by subject area #9

Open mfenner opened 6 years ago

mfenner commented 6 years ago

Researchers interested in data from a particular domain have a hard time finding it when there is no dedicated repository, or when the data is spread across many different repositories. Subject area classification assigned by authors or data centers is not standardized and does not scale well.

Proposal: automatically classify datasets based on their title and abstract, using a simple and generic classification scheme such as the OECD Fields of Science and Technology, and a text classification tool such as https://github.com/inspirehep/magpie.
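
To make the idea concrete, here is a minimal sketch of what classifying by title and abstract could look like. It is a plain keyword-matching baseline, not magpie and not a trained model; the six labels are the top-level OECD Fields of Science and Technology areas, while the keyword lists, the `classify` function and the `title_weight` parameter are assumptions made up for illustration.

```python
import re
from collections import Counter

# Top-level OECD Fields of Science and Technology areas.
# ASSUMPTION: the keyword sets are hand-picked for illustration only;
# they are not part of the OECD scheme and would be replaced by a
# trained classifier in a real implementation.
FOS_KEYWORDS = {
    "Natural sciences": {"genome", "protein", "climate", "particle", "galaxy", "species"},
    "Engineering and technology": {"sensor", "materials", "robot", "circuit", "turbine"},
    "Medical and health sciences": {"patient", "clinical", "disease", "cancer", "drug"},
    "Agricultural sciences": {"crop", "soil", "livestock", "yield", "forestry"},
    "Social sciences": {"survey", "household", "election", "income", "education"},
    "Humanities": {"manuscript", "archive", "linguistic", "medieval", "philosophy"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def classify(title, abstract, title_weight=2):
    """Return the best-matching field name, or None if no keyword matches.

    Title words are counted title_weight times, on the assumption that
    titles are a denser signal than abstracts.
    """
    counts = Counter(tokenize(title) * title_weight + tokenize(abstract))
    scores = {
        field: sum(counts[word] for word in keywords)
        for field, keywords in FOS_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(classify(
        "Soil moisture and crop yield measurements",
        "Field data on soil samples and wheat crop yield across 40 farms.",
    ))
```

A real version would presumably swap the keyword table for a model trained on datasets that already carry subject labels, and could return several fields with scores rather than a single label, since many datasets span more than one area.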