DARIAH-DE / TopicsExplorer

Explore your own text collection with a topic model – without prior knowledge.
https://dariah-de.github.io/TopicsExplorer
Apache License 2.0

Row labels off in document-topics heatmap #86

Open frederik-elwert opened 5 years ago

frederik-elwert commented 5 years ago

I noticed that in my test corpus the row (document) labels are misaligned in the document-topics heatmap visualisation: the labels start a few rows below the top, and at the bottom they “overshoot” the matrix. The tooltip confirms the mismatch: it shows that a row refers to a different document than the label next to it.

[Screenshot: topic_explorer_heatmap]

severinsimmler commented 5 years ago

Thank you for the bug report, this is interesting! In this case you can only rely on the tooltip information.

How large is your corpus? The library we use to create the heatmap, ApexCharts.js, was only integrated recently, so we haven't tested it much yet. The filenames appear to be quite long (and partly cropped?), and I suspect that could be the cause. In any case, we will try to reproduce this behavior.

frederik-elwert commented 5 years ago

The corpus contains almost 2000 documents.

The longest file name is 86 characters long:

2023_muslim-women-who-suffer-family-violence-in-singapore-can-seek-recourse-%c2%ad.txt

(I guess I could shorten the file names, but they were produced automatically, so I didn’t bother.)
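If shortening turns out to be worth trying as a workaround, a minimal sketch could look like the following. It is untested against TopicsExplorer itself, and `shorten_name` and the 40-character limit are hypothetical choices for illustration, not anything the tool provides. It decodes percent-escapes such as `%c2%ad` (a soft hyphen), drops the soft hyphen, and truncates the stem while keeping the extension.

```python
from pathlib import Path
from urllib.parse import unquote


def shorten_name(filename: str, max_len: int = 40) -> str:
    """Return a shortened file name that keeps the original extension.

    Hypothetical helper: `max_len` limits the stem (the name without
    the extension) and is an arbitrary value chosen for illustration.
    """
    path = Path(filename)
    # Decode percent-escapes, e.g. "%c2%ad" becomes U+00AD (soft hyphen).
    stem = unquote(path.stem)
    # Soft hyphens are invisible in most renderers, so drop them.
    stem = stem.replace("\u00ad", "")
    if len(stem) > max_len:
        # Truncate, then trim separators left dangling at the cut.
        stem = stem[:max_len].rstrip("-_ ")
    return stem + path.suffix


if __name__ == "__main__":
    long_name = (
        "2023_muslim-women-who-suffer-family-violence-in-singapore-"
        "can-seek-recourse-%c2%ad.txt"
    )
    print(shorten_name(long_name))
```

Note that truncation can make two different names identical, so the results would need a uniqueness check before any files are actually renamed.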

MHuberFaust commented 5 years ago

Same problem here with around 200 texts, although the labels are only off by 2 rows.