computationalstylistics / stylo

R package for stylometric analyses

stylo.rolling_delta() analysis issue - unable to run on more than 12 documents and unable to customize colors/shapes with gui=True #54

Open corvusMidnight opened 1 year ago

corvusMidnight commented 1 year ago

When using the stylo package in R, we've encountered issues with the stylo.rolling_delta() analysis function. Specifically, we cannot run this analysis on more than 12 documents: when gui=TRUE is set, the interface does not let the user customize the colors or shapes in the output graph beyond those it provides, so there are not enough to cover more than 12 documents.

We have attempted to find a way to customize the colors and shapes used in the graph, even when not using the GUI, but have not been successful.

These issues are hindering the usability of the package, and we would appreciate any guidance or assistance in resolving them.

Steps to reproduce:

1. Install the stylo package in R.
2. Use the stylo.rolling_delta() analysis function on more than 12 documents.
3. Try to customize colors or shapes in the output graph, with gui=True or without the GUI.

perechen commented 1 year ago

I'm not aware of any limitation, but (if I remember correctly) you had a setup where each document formed its own class. Having more than 12 classes, indeed, might set things off, since it is rarely practical to do a sequential classification on that many. Can you maybe provide the console output when you run >12 docs? Also, a screenshot of the training set corpus would be helpful.

corvusMidnight commented 1 year ago

This is what my folder looks like:

[image: issue]

This is the output graph:

[image: Good_Omens_NOVEL001]

The console output does not present any irregularities.

perechen commented 1 year ago

Thanks! I see the legend has all the documents, so the problem is that the lines on the plot correspond to only 12 of them? It might be some color-assignment weirdness (and the data is actually there). Btw, rolling_classify() should treat documents by the default stylo procedure, which treats the string before the first underscore as the class.
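
To make the naming rule above concrete, here is a minimal base-R sketch of how class labels could be derived from file names (everything before the first underscore) and how one color per class could be generated without a fixed upper limit. This is not stylo's internal code: the file names and the helper `class_palette()` are hypothetical, and whether stylo's rolling functions accept a custom palette is not established in this thread.

```r
# Hypothetical helper (not part of stylo): derive class labels from file
# names and build one colour per class.
class_palette <- function(files) {
  # Class label = everything before the first underscore.
  classes <- sub("_.*$", "", files)
  class_levels <- unique(classes)
  # rainbow() returns as many colours as requested, so there is no
  # built-in cap at 12 classes.
  colours <- grDevices::rainbow(length(class_levels))
  names(colours) <- class_levels
  list(classes = classes, colours = colours)
}

# Hypothetical file names, modelled on the naming seen in this thread.
files <- c("Gaiman_Neverwhere.txt", "Gaiman_Coraline.txt",
           "Pratchett_Mort.txt", "Good_Omens_NOVEL001.txt")

result <- class_palette(files)
print(table(result$classes))  # number of documents per class
print(result$colours)         # one named colour per class
```

Running this on the real corpus folder (for example with `list.files()` on the training set directory) would show how many distinct classes the file names produce, which is the number the plot has to find colors for.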