datasciencecampus / pygrams

Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence
https://datasciencecampus.github.io/pygrams

No graph output #369

Closed davidlenz closed 4 years ago

davidlenz commented 4 years ago

Describe the bug

When running the command

python pygrams.py -dh publication_date -uc=out-mdf-0.05-200501-201822 -o graph

I would expect an HTML graph output in the graph folder, but none is produced. All other outputs, e.g. wordcloud and multiplot, work well. No error is thrown.

To Reproduce

  1. Create the tf-idf matrices: python pygrams.py -dh publication_date -ds=USPTO-random-100000.pkl.bz2
  2. Request the graph output: python pygrams.py -dh publication_date -uc=out-mdf-0.05-200501-201822 -o graph
  3. Check the outputs folder out-mdf-0.05-200501-201822
  4. See that no HTML file has been written
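
Steps 3 and 4 can be checked with a short script rather than by eye. This is only a sketch: `find_html_outputs` is a hypothetical helper, not part of pygrams, and the folder path in the example is an assumption about where the outputs land on your machine.

```python
from pathlib import Path


def find_html_outputs(folder):
    """Return every .html file found under `folder`, sorted by path.

    Hypothetical helper for this issue; it is not part of pygrams.
    """
    root = Path(folder)
    if not root.is_dir():
        # A missing folder is treated the same as "nothing was written".
        return []
    return sorted(p for p in root.rglob("*.html") if p.is_file())


if __name__ == "__main__":
    # Assumed location; adjust to wherever pygrams wrote its outputs.
    hits = find_html_outputs("outputs/out-mdf-0.05-200501-201822")
    if hits:
        for path in hits:
            print(path)
    else:
        print("no .html files found")
```

An empty result after step 2 matches the behaviour reported here.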

Expected behavior

An .html graph file is written to the graph folder.

mshodge commented 4 years ago

Investigating using fresh repo and environment. Will report back.

mshodge commented 4 years ago

Hi @davidlenz, unfortunately this is old code that we did not remove. We used to use NetworkX's Force Directed Graph (FDG), but there were issues with combining its D3/JS element with our code, so we dropped the feature. I will remove references to it from the documentation and the command-line arguments. Sorry if you thought this feature was useful.

I will make a ticket to replace this feature with something more Python friendly. Perhaps an interactive chord diagram instead, which shows the links between terms, albeit not with the force directed component.
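
For anyone who needs the term links in the meantime, the sketch below shows one way to compute the kind of data a chord diagram would draw: how often two terms appear in the same document. It is an illustration only and makes several assumptions: `term_links` is a hypothetical function, not pygrams code; the input is a plain list of term lists rather than the pickled tf-idf matrix; and the three toy documents are invented for the example.

```python
from collections import Counter
from itertools import combinations


def term_links(docs_terms):
    """Count, for each unordered pair of terms, how many documents contain both.

    Hypothetical helper; pygrams may derive term links differently.
    """
    links = Counter()
    for terms in docs_terms:
        # De-duplicate and sort so each pair is counted once per document
        # and always stored in the same (alphabetical) order.
        for a, b in combinations(sorted(set(terms)), 2):
            links[(a, b)] += 1
    return links


if __name__ == "__main__":
    # Invented toy input: each inner list holds the terms found in one document.
    docs = [
        ["fuel cell", "battery", "electrode"],
        ["battery", "electrode"],
        ["fuel cell", "battery"],
    ]
    for (a, b), weight in term_links(docs).most_common():
        print(f"{a} -- {b}: {weight}")
```

The resulting pair weights could be passed to any plotting library that draws chord or network diagrams.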