epfl-dlab / WikiPDA

Crosslingual Topic Modeling with WikiPDA

Multi-label topic modeling #1

Open HariWu1995 opened 3 years ago

HariWu1995 commented 3 years ago

Can I use this approach for multi-label topic modeling? For example, if each text sample in my dataset can have zero, one, or many labels/topics, will this code work well?

tizianopiccardi commented 3 years ago

Hi, WikiPDA gives you a topic-distribution vector for each Wikipedia article (or snippet of text). You can use these vectors as features to train any other supervised model, as we show in the paper for the ORES use case. This setup is described in Section 5.2 of the paper: https://arxiv.org/abs/2009.11207
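To make the suggestion concrete for the multi-label case, here is a minimal sketch of one possible setup: one independent logistic-regression model per label (one-vs-rest), trained on topic vectors. Everything in it is an assumption for illustration. The vectors are synthetic stand-ins, not real WikiPDA output, the labels are made up by a simple threshold rule, and the helper names (`train_binary`, `train_multilabel`, `predict_multilabel`, `toy_topic_vector`) are hypothetical. It is not the setup from the paper, and it does not call the WikiPDA code at all. In practice a library such as scikit-learn (for example its `OneVsRestClassifier`) would likely be a better choice than hand-written gradient descent.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Linear score w.x + b for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_binary(X, y, epochs=1000, lr=1.0):
    """Fit one logistic-regression model with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, b, xi)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def train_multilabel(X, Y):
    """One-vs-rest: train a separate binary model for each label column."""
    n_labels = len(Y[0])
    return [train_binary(X, [row[k] for row in Y]) for k in range(n_labels)]


def predict_multilabel(models, x, threshold=0.5):
    """Return a 0/1 indicator per label; all zeros means 'no label'."""
    return [int(sigmoid(score(w, b, x)) >= threshold) for w, b in models]


def toy_topic_vector(k=3):
    """Synthetic stand-in for a topic distribution (non-negative, sums to 1)."""
    raw = [random.expovariate(1.0) for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]


random.seed(0)
X = [toy_topic_vector() for _ in range(200)]
# Made-up labelling rule: label k applies when topic k has weight above 0.4,
# so a sample can end up with zero, one, or two labels.
Y = [[int(x[0] > 0.4), int(x[1] > 0.4)] for x in X]

models = train_multilabel(X, Y)
for x in ([0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]):
    print(x, predict_multilabel(models, x))
```

Because each label gets its own binary model, a sample can receive any subset of labels, including none. To use real data, `X` would be replaced by the topic vectors WikiPDA produces and `Y` by your own label matrix.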