zme1 / toscana

A repository to house research and web development for the Lega Toscana project, led by Professor Lina Insana (Spring 2018) and Professor Lorraine Denman (Fall 2018), with consultation from members of the DH Advanced Praxis group at the University of Pittsburgh at Greensburg.
http://toscana.newtfire.org

Visualization on Lemmatized Forms #53

Closed: zme1 closed this issue 5 years ago

zme1 commented 5 years ago

@ebeshero I used XQuery (in my last commit) to tease out all the different lemmatized forms, and I plan to draft some sort of visualization to show how they're used over time. All in all, there are 57, of which only 21 are used more than once. I was browsing examples of dot matrix charts and thought they could be a useful visualization for this type of information. I have one concern, though: my data seems too varied for the visualization to be easy to read. In the examples I've seen, dot matrix charts usually have between 4 and 6 different "categories." Since my categories would be the lemmas, that leaves me with over 20 categories to account for, assuming I ignore the words used only once (or simply group them together).

If I narrow my visualization to only the terms mentioned at least, say, four times throughout the volume, I'm still left with 13 lemmas. I could, theoretically, group the forms by topic, but at that point the most interesting component of the visualization is seriously diluted. Do you have any thoughts on this? Does a dot matrix chart seem potentially confusing and difficult to interpret with this sort of data?

Edit: Excluding the portion of the minutes believed to be a transcribed invoice, fifteen terms are used more than once in the rest of the volume.
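
For anyone weighing the same trade-off, the thresholding and grouping described above could be sketched roughly as follows. This is a minimal Python illustration, not the XQuery from the commit: the function name `bucket_lemmas`, the `"other"` label, and the sample lemmas are all hypothetical, and the cutoff of four is read here as "at least four uses," which is an assumption.

```python
from collections import Counter


def bucket_lemmas(lemmas, min_count=4, other_label="other"):
    """Count lemma occurrences and lump rare lemmas into a single bucket.

    Lemmas used at least `min_count` times keep their own category;
    everything below the cutoff is summed under `other_label`.
    Note: a real lemma spelled exactly like `other_label` would collide.
    """
    counts = Counter(lemmas)
    kept = {lemma: n for lemma, n in counts.items() if n >= min_count}
    other_total = sum(n for n in counts.values() if n < min_count)
    if other_total:
        kept[other_label] = other_total
    return kept


# Hypothetical toy data, not taken from the Lega Toscana minutes.
sample = ["pagare"] * 5 + ["socio"] * 4 + ["festa"] * 2 + ["banda"]
print(bucket_lemmas(sample))  # {'pagare': 5, 'socio': 4, 'other': 3}
```

Raising or lowering `min_count` shows how quickly the number of chart categories shrinks, and how much of the data ends up hidden in the grouped bucket.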