ronan-mch / lobbying

data and analysis from the Irish Lobbying Register

Topic-Level analysis #2

Open daramcq opened 6 years ago

daramcq commented 6 years ago

How do we analyse meetings by topic? It would be great to see who met whom on various topics. For example, did an industry body have more access than an NGO or campaigning group?

Is SQL currently the best way of doing this?

daramcq commented 6 years ago

Can we do something like keywords to allow tag-based navigation?

daramcq commented 6 years ago

This may be the first valid use case for Semantic Web technologies I've ever seen :p

ronan-mch commented 6 years ago

Hmmm.... Well I don't think SQL is the tool for it. I think it would work something like this:

(id, text) from db |> remove stopwords |> topic extraction |> new data structure (inverted index?)

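I'm not sure what the best way to do topic extraction would be. Named Entity Recognition? Latent Dirichlet Allocation? LDA is probably the better fit, since it groups texts by theme, whereas NER only picks out the names of people, organisations and places.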

Once we have the data structure, we could put an HTML/JS frontend on it to allow for browsing.
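
To make the pipeline above a bit more concrete, here is a rough Python sketch using only the standard library. It is not real topic extraction: it uses a simple "most frequent words" step as a stand-in for NER or LDA, just to show the shape of the data structure. The stopword list, the sample rows and the function names (`tokenize`, `extract_keywords`, `build_index`) are all made up for illustration and are not taken from the register or this repo.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (assumption); a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "on", "in", "for", "with", "about"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(text, k=3):
    """Stand-in for topic extraction: the k most frequent remaining tokens."""
    return [word for word, _ in Counter(tokenize(text)).most_common(k)]


def build_index(rows, k=3):
    """Map each keyword to the sorted ids of the records that mention it."""
    index = defaultdict(set)
    for record_id, text in rows:
        for keyword in extract_keywords(text, k):
            index[keyword].add(record_id)
    return {keyword: sorted(ids) for keyword, ids in index.items()}


# Hypothetical rows standing in for (id, text) pairs read from the database.
rows = [
    (1, "Meeting on housing policy and housing supply"),
    (2, "Discussion of alcohol labelling with the Department of Health"),
    (3, "Housing and planning regulations"),
]

index = build_index(rows)
print(index["housing"])  # [1, 3]
```

A keyword-to-ids mapping like this could be dumped to JSON for a browsing frontend, and swapping `extract_keywords` for an NER or LDA step would leave the rest of the structure unchanged.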