BioinfoNet / Data-mining

Data mining to discover trends in Open Science in Kenya

Open questions #19

Open Shuyib opened 4 years ago

Shuyib commented 4 years ago

Hi,

I've just seen that the repo has been updated to reflect the changes made so far. I have several questions on the way forward.

What do you think? @kipkurui

Shuyib commented 4 years ago

I'll do it anyway, when I get the time.

kipkurui commented 4 years ago

Hi @Shuyib, that is a wonderful idea. It would be fun to have a widget that provides real-time visualisation. Is that what you had in mind? For natural language processing, which kinds of questions can we prioritise?

Shuyib commented 4 years ago

Yes, we'll need to wrap everything in a while loop that runs the query and returns the results, with timing in between runs. I think it would be better to make one version where the user can just query what they want. I can wrap that with ipywidgets.
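Roughly, the loop could look like the sketch below. This is only an illustration under assumptions: `poll`, `fake_query`, and the `interval_s` and `max_rounds` parameters are hypothetical names, and `fake_query` stands in for whatever function actually queries the literature source. The ipywidgets wrapper is left out so the sketch runs with the standard library alone.

```python
import time


def poll(query_fn, term, interval_s=60.0, max_rounds=3):
    """Call query_fn(term) repeatedly, timing each call and pausing in between."""
    history = []
    rounds = 0
    while rounds < max_rounds:
        started = time.perf_counter()
        result = query_fn(term)
        elapsed = time.perf_counter() - started
        history.append((result, elapsed))
        rounds += 1
        # Wait before the next query, but not after the last one.
        if rounds < max_rounds:
            time.sleep(interval_s)
    return history


def fake_query(term):
    # Hypothetical stand-in for the real query; returns a made-up record count.
    return len(term)


for result, elapsed in poll(fake_query, "open science", interval_s=0.0, max_rounds=2):
    print(f"{result} records in {elapsed:.4f}s")
```

Capping the loop with `max_rounds` keeps a notebook cell from running forever; a real-time widget would probably replace the `print` with a plot refresh.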

For NLP, I'll use non-negative matrix factorization (NMF), which is an unsupervised learning method. We can adjust K, that is, the number of clusters, to find out the likely topics, unfortunately in the abstracts only. By the way, we could use this in the next session: mybinder

How does that sound?