chembl / GLaDOS

Web Interface for ChEMBL @ EMBL-EBI
https://www.ebi.ac.uk/chembl/

Document report card - word cloud #343

Closed agaulton closed 5 years ago

agaulton commented 7 years ago

A lot of these word clouds don't seem very informative: they contain only one term, and it often doesn't seem to be the most relevant one, e.g., https://chembl-glados.herokuapp.com/document_report_card/CHEMBL1177698/

Sometimes clicking on the term retrieves only the document you started with.

Also, that section should probably not be shown if the cloud is empty, e.g., https://chembl-glados.herokuapp.com/document_report_card/CHEMBL1201862/

The relationship with the Related Documents section (which currently has fake data) is not clear. Is it the same data?

mnowotka commented 7 years ago

It's not the same data. This one is implemented using the TextRank algorithm, which extracts the most relevant keywords from abstracts. A network of documents sharing common keywords is then created. Another way would be to use the LSA/LSI algorithms from the gensim library; we started discussing this but never had time to take a proper look.
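
To make the two steps above concrete, here is a rough sketch of the idea: a simplified TextRank (PageRank over a word co-occurrence graph), followed by linking documents that share keywords. This is not the GLaDOS code. The function names, the stopword list, and the parameter defaults are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not the list used in GLaDOS).
STOPWORDS = {"the", "of", "and", "a", "in", "to", "is", "for", "with",
             "that", "are", "on", "as", "by", "an", "were", "was", "this"}


def textrank_keywords(text, top_n=5, window=3, damping=0.85, iterations=50):
    """Rank words by PageRank over a co-occurrence graph (simplified TextRank)."""
    words = [w for w in re.findall(r"[a-z][a-z0-9-]+", text.lower())
             if w not in STOPWORDS]
    # Undirected graph: link words that are at most `window - 1` positions apart.
    neighbours = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1:i + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)
    if not neighbours:
        return []  # nothing to rank, so the caller can hide the word cloud
    scores = {w: 1.0 for w in neighbours}
    for _ in range(iterations):
        scores = {
            w: (1 - damping) + damping * sum(
                scores[n] / len(neighbours[n]) for n in neighbours[w])
            for w in neighbours
        }
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]


def shared_keyword_network(doc_keywords):
    """Link every pair of documents that share at least one keyword."""
    edges = {}
    for a, b in combinations(sorted(doc_keywords), 2):
        common = set(doc_keywords[a]) & set(doc_keywords[b])
        if common:
            edges[(a, b)] = common
    return edges
```

With very short abstracts a graph like this has few nodes, so only one or two terms may come out, and a term that appears in just one abstract produces no edges to other documents.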

nclopezo commented 5 years ago

Closing because of https://github.com/chembl/GLaDOS/issues/1073#issuecomment-479805852