agaulton closed 5 years ago
It's not the same data. This one is implemented using the TextRank algorithm, which extracts the most relevant keywords from abstracts. A network of documents sharing common keywords is then created. Another way would be to use the LSA/LSI algorithms from the gensim library; we started discussing this but never got time to have a proper look.
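To illustrate the second step, here is a minimal sketch of how documents could be linked by shared keywords. It is not the GLaDOS implementation: the function name `build_document_network`, the `min_shared` parameter, and the document IDs and keyword sets are all hypothetical, and the keywords are assumed to have already been extracted from the abstracts.

```python
from collections import defaultdict
from itertools import combinations


def build_document_network(doc_keywords, min_shared=1):
    """Link documents that share at least `min_shared` keywords.

    doc_keywords: dict mapping a document ID to a set of keywords
                  (assumed to come from a prior keyword-extraction step).
    Returns a dict mapping a sorted (doc_a, doc_b) pair to the set of
    keywords the two documents have in common.
    """
    # Inverted index: keyword -> documents whose keyword set contains it.
    docs_by_keyword = defaultdict(set)
    for doc_id, keywords in doc_keywords.items():
        for keyword in keywords:
            docs_by_keyword[keyword].add(doc_id)

    # Every pair of documents under the same keyword gets an edge.
    edges = defaultdict(set)
    for keyword, doc_ids in docs_by_keyword.items():
        for doc_a, doc_b in combinations(sorted(doc_ids), 2):
            edges[(doc_a, doc_b)].add(keyword)

    # Keep only edges backed by enough shared keywords.
    return {pair: shared for pair, shared in edges.items()
            if len(shared) >= min_shared}


# Hypothetical keyword sets, for illustration only.
example = {
    "DOC_A": {"kinase", "inhibitor", "selectivity"},
    "DOC_B": {"kinase", "inhibitor", "crystal structure"},
    "DOC_C": {"antimalarial"},
}
network = build_document_network(example)
```

In this sketch `DOC_C` ends up with no edges, because its only keyword appears in no other document. A keyword like that links back only to the document it came from.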
Closing because of https://github.com/chembl/GLaDOS/issues/1073#issuecomment-479805852
A lot of these don't seem very informative: there is only one term, and often it doesn't seem to be the most relevant one, e.g., https://chembl-glados.herokuapp.com/document_report_card/CHEMBL1177698/
Sometimes clicking on the term retrieves only the document you started with.
Also, that section should probably not be shown if the cloud is empty, e.g., https://chembl-glados.herokuapp.com/document_report_card/CHEMBL1201862/
The relationship with the Related Documents section (which currently has fake data) is not clear. Is it the same data?