alicia-ziying-yang / conTEXT-explorer

ConTEXT Explorer is an open Web-based system for exploring and visualizing concepts (combinations of occurring words and phrases) over time in text documents.
Apache License 2.0

Analysis is case-sensitive #6

Closed baileythegreen closed 3 years ago

baileythegreen commented 3 years ago

Analyses are case-sensitive and this is not made clear in the README or in the paper. I attach a screenshot demonstrating this with the first five chapters of Pride & Prejudice and the keywords: Elizabeth, Bennet, Fitzwilliam, and Darcy—both capitalised and uncapitalised. You can see the results for the Uppercase set do not match those for the Lowercase set.

I consider this something that needs to be—at a minimum—clarified before the paper is published. Really, I think this aspect of the behaviour should be altered so that it is not case-sensitive; but perhaps there is a good reason to keep it—with clarification.

JOSS Reference: openjournals/joss-reviews#3347

[Screenshot of the ConTEXT-explorer dashboard, showing some of the results of an analysis of the first 5 chapters of Pride & Prejudice. The point of the image is to show that the results are case-sensitive: using capitalised versions of the names shows there are no sentences that match; using lowercase versions of the names shows there are matching sentences. This is perhaps not what would be expected.]

alicia-ziying-yang commented 3 years ago

Hi @baileythegreen, thank you so much for finding this issue (it is actually a bug). I have fixed it, so now, for example, "marriage" and "Marriage" are treated as equal in the search process. That is, the search is no longer case-sensitive.
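
For readers wondering what case-insensitive matching looks like in practice, here is a minimal sketch. It is not the code used in ConTEXT Explorer; the function name `find_matching_sentences` and the sample sentences are hypothetical, and the approach (normalising both keywords and tokens with `str.casefold()` before comparing) is just one common way to get this behaviour.

```python
import re


def find_matching_sentences(sentences, keywords):
    """Return the sentences that contain any keyword, ignoring case.

    Hypothetical helper for illustration only; not taken from the
    ConTEXT Explorer code base.
    """
    # Normalise the keywords once so "Marriage" and "marriage" compare equal.
    wanted = {keyword.casefold() for keyword in keywords}
    matches = []
    for sentence in sentences:
        # Split into word tokens and normalise them the same way.
        tokens = {token.casefold() for token in re.findall(r"\w+", sentence)}
        if wanted & tokens:
            matches.append(sentence)
    return matches


# Made-up sample sentences, not quoted from the novel.
sentences = [
    "Marriage was the topic of the evening.",
    "Mr. Darcy said nothing.",
    "The weather was fine.",
]

upper = find_matching_sentences(sentences, ["Marriage", "Darcy"])
lower = find_matching_sentences(sentences, ["marriage", "darcy"])
```

With this normalisation, `upper` and `lower` contain the same sentences, which is the behaviour the capitalised and uncapitalised keyword sets in the report above would be expected to show.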