Since we have a list of authors of each publication for both datasets, we can try to link those authors to their profiles on Wikidata and explore the information there to obtain a potential list of topics. This list of topics could then be combined with the ones obtained from the topic extraction models to return the final list of candidates.
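As a rough illustration of the linking step, the sketch below looks up an author by name through the public Wikidata API (`wbsearchentities`, then `wbgetentities`) and reads the item's "field of work" statements (property `P101`). The helper names, the choice of `P101` as the source of topics, and the shortcut of taking the first search hit are assumptions made for this example, not part of the proposal. Real author names are often ambiguous, so a proper disambiguation step would be needed.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
FIELD_OF_WORK = "P101"  # Wikidata property "field of work" (assumed topic source)


def search_params(author_name, limit=5):
    """Build the query parameters for a wbsearchentities name lookup."""
    return {
        "action": "wbsearchentities",
        "search": author_name,
        "language": "en",
        "type": "item",
        "limit": limit,
        "format": "json",
    }


def entity_params(qid):
    """Build the query parameters to fetch the claims of one item."""
    return {
        "action": "wbgetentities",
        "ids": qid,
        "props": "claims",
        "format": "json",
    }


def call_api(params):
    """Send a GET request to the Wikidata API and decode the JSON reply."""
    url = WIKIDATA_API + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "author-topic-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def field_of_work_ids(entity):
    """Return the item ids (QIDs) listed under P101 for an entity document."""
    topics = []
    for claim in entity.get("claims", {}).get(FIELD_OF_WORK, []):
        datavalue = claim.get("mainsnak", {}).get("datavalue")
        if datavalue is None:  # "no value" / "unknown value" statements
            continue
        topics.append(datavalue["value"]["id"])
    return topics


def author_topics(author_name):
    """Naive linking: take the first search hit and read its fields of work."""
    hits = call_api(search_params(author_name)).get("search", [])
    if not hits:
        return []
    qid = hits[0]["id"]
    entity = call_api(entity_params(qid))["entities"][qid]
    return field_of_work_ids(entity)
```

Calling `author_topics("Some Author Name")` would return a list of QIDs, which still have to be mapped to labels or to the vocabulary used by the topic extraction models before the two lists can be merged.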
This approach could introduce noise: an author having worked in a specific field at some point in their career does not mean that the publication we are currently analysing belongs to that field. We need to keep that in mind when combining the two lists.
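One possible way to limit that noise is to treat author-derived topics as weak evidence: they can boost a topic the extraction models already found, or add a low-ranked candidate, but never outrank a strong model prediction on their own. The sketch below assumes the models return a score per topic and that both sources use the same topic labels. The function name, the scores and the `author_weight` value are hypothetical and would need tuning.

```python
def combine_candidates(model_scores, author_topics, author_weight=0.25):
    """Merge model topic scores with topics taken from author profiles.

    model_scores: dict mapping topic label -> score from the extraction models.
    author_topics: iterable of topic labels found on the authors' profiles.
    author_weight: small bonus added once per author-derived topic (assumed value).
    """
    combined = dict(model_scores)
    for topic in set(author_topics):  # count each author topic only once
        combined[topic] = combined.get(topic, 0.0) + author_weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


ranked = combine_candidates(
    {"machine learning": 1.0, "semantic web": 0.5},
    ["semantic web", "bioinformatics"],
)
print(ranked)
```

With these made-up inputs, "semantic web" is boosted because both sources agree on it, while "bioinformatics", supported only by an author profile, ends up at the bottom of the candidate list.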
Original proposal by @labra