The starting point is the table keyword_wikidata, which maps the network's keywords to Wikidata entries together with their parents over 10 iterations. This is a 1:n mapping, since a keyword may appear multiple times in Wikidata.
Write a script that takes a predefined list of categories as input and iterates over all mappings in keyword_wikidata. For each mapping, it checks whether any category name appears in the keyword's parents list. On a match, it adds the keyword-category relation to a new table called keyword_categories. A keyword in the network can belong to several categories. If a keyword in the network does not match any of the provided categories, add it to the artificial category "Others". This is a temporary fix and will be addressed in a later issue.
The category list may change in the future, but for now it is:
[biomolecule | chemical substance | metal | process | analytical method | biochemical relation | property]
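The matching step described above could be sketched roughly as follows. This is only a minimal sketch under assumptions: the actual schema of keyword_wikidata is not specified here, so the code assumes each mapping can be read as a (keyword, parents) pair where parents is a list of parent labels, and it assumes a case-insensitive exact match between a parent label and a category name. The names `categorize`, `CATEGORIES`, and `FALLBACK_CATEGORY` are hypothetical. Reading from keyword_wikidata and writing to keyword_categories is left out.

```python
from collections import defaultdict

# Category list from this issue; it may change in the future.
CATEGORIES = [
    "biomolecule", "chemical substance", "metal", "process",
    "analytical method", "biochemical relation", "property",
]
# Temporary catch-all category for keywords that match nothing.
FALLBACK_CATEGORY = "Others"


def categorize(mappings, categories=CATEGORIES):
    """Return sorted (keyword, category) rows for keyword_categories.

    `mappings` is assumed to be an iterable of (keyword, parents) pairs,
    one pair per row of keyword_wikidata (assumed shape, not the real schema).
    """
    # Assumption: a match is a case-insensitive exact label comparison.
    wanted = {c.lower(): c for c in categories}
    matches = defaultdict(set)

    for keyword, parents in mappings:
        matches[keyword]  # register the keyword even if nothing matches
        for parent in parents:
            category = wanted.get(parent.strip().lower())
            if category is not None:
                matches[keyword].add(category)

    rows = []
    for keyword, found in matches.items():
        # A keyword goes to "Others" only if none of its mappings matched.
        for category in sorted(found) or [FALLBACK_CATEGORY]:
            rows.append((keyword, category))
    return sorted(rows)
```

Because the mapping is 1:n, matches are collected per keyword across all of its rows before the fallback is applied, so a keyword with one matching and one non-matching Wikidata entry does not end up in "Others".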
The goal of this issue is to have a first clustered dataset for visualizing the networks. The steps described above are what is needed to achieve that.