Visualizations - Githubissues

zme1 commented 6 years ago

I just pushed a new file to my repository with a list of the visualizations I want most to make for the project. It's the plain text file called visualization_ideas.rtf. I think a big problem I'm seeing already is that I am having trouble coping with the huge raw numbers that I have when looking at these minutes. For instance, there are 51 Lega members who received sick comp at some point in those 7 years. I could do a year-by-year analysis, but then I lose the aggregation of the data, which makes it much less interesting to me. I also am generally struggling in conceptualizing how to present these data sets, although I have my best ideas listed with their respective forms of data in the brainstorm sheet. @ebeshero Do these ideas for data sets seem interesting? Do they seem feasible? Are my ideas of visualizing them on the right track? As a person who has not generated more than 3 or 4 different types of visualizations for very simple data sets, I want to try my best to make attractive and more insightful, diverse graphs...

djbpitt commented 6 years ago

@zme1 These are good questions! Before anything else, though: RTF isn't plain text, and if you view the file online, you see the raw RTF coding, which makes it hard to read. I’d suggest replacing it with either real plain text or Markdown.

With that out of the way, I also often find it challenging to choose the right visualization. I don’t have specific suggestions at the moment (although I’d be happy to look at the data with you, if you think that would be helpful), but when I’m facing that sort of challenge, I often start by reviewing the tutorials and browsing the chart and graph galleries at the links at http://dh.obdurodon.org/#visualization, thinking about how well each option fits the data. My other starting point is to ask not just what the data look like, but also what story I want to tell about them through visualization. Let me know if you’d like to meet to look at some of the specific data sets together.

zme1 commented 6 years ago

@djbpitt I just converted the file to plain text. Sorry about that. I forgot how the .rtf files were rendered in GitHub...

If you would be available to meet at some point to review my data with me, that'd be great. In the meantime, I'll try to explore my data a bit more and tease out a more complete idea of what kind of portrait they'll paint for the user...

djbpitt commented 6 years ago

@zme1 I have to be on campus for Skype interviews all morning tomorrow (Thu, 03-15). How about if we meet sometime in the afternoon? If you need more time to think about the data and the visualizations before we meet, that’s fine too, of course, but if you’ll be ready tomorrow, I’ll be on campus anyway.

zme1 commented 6 years ago

@djbpitt That's fine with me! I am totally free between 12:50 and 2:30 tomorrow afternoon, if that time window works for you. I'll do some soul-searching tonight with my graphs.

djbpitt commented 6 years ago

@zme1 Let’s try 1:00 in my office. See you then!

ebeshero commented 6 years ago

@djbpitt Thanks for scheduling the data viz counseling session tomorrow! @zme1, I wish I could join you both! Meanwhile I'm thinking about your visualization text file, and here are some thoughts:

Stacked bars can become stacked shapes: squares for instance, disembodied from drawn axes to the extent that the complete object is supposed to represent a whole. You might just play a little here. (This is something I've been meaning to experiment with in some old projects of mine...)
The flow of members into and out of the Lega is interesting to conceptualize visually! What if you tried drawing that by hand? Is "Lega" a simple shape? (how about a square or a circle?) Would you like to show, over the seven year period of your data, a proportional representation of who's joining along with who's leaving for each year? I'd want to first determine the total number of members in each year. Then find out what proportion of that total is new, and what proportion is on its way out (= not in the organization next year). This is mathematically interesting (to me anyway!) because your total number keeps shifting. Each year you can expect a different proportion of "newbies" and "exiles". You've probably heard me or @djbpitt complain about pie graphs, but those are bad when they have more than 3 or 4 different kinds of things to compare. Here, I imagine a simple geometric shape that you might subdivide to show its internal composition of new vs. ongoing vs. outgoing. That's one way, but there may be others...
But the "flow" in and "flow" out suggests arrows in and out--the edges of a network graph, perhaps, with a graph plotted for each year: Who's joining and who's leaving in each year? You could plot this as a whole, but it's also common in network graphs to generate sub-networks. If you imagine edge-connectors as years that people share in an organization together, that's something you can plot as a network. In Cytoscape, you can then select all the edges for a particular year, and the nodes connected with them, and output that as a selection from the larger network graph. (I think I was showing you this last week--and we can go over it...I'd love to see how this looks in distinct yearly networks.)
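
A rough sketch of the proportion calculation described above, in Python. It assumes you already have, for each year, the set of names that appear in that year's minutes; the `members_by_year` structure, the year labels, and the letter names are hypothetical placeholders, not real Lega data. The "new" and "outgoing" shares are computed separately because one person can be both (someone who shows up in a single year only).

```python
def yearly_flow(members_by_year):
    """Return (year, total, new_share, outgoing_share) for each year.

    new_share      = fraction of this year's members absent from the previous year
    outgoing_share = fraction of this year's members absent from the following year
    A share is None when there is no neighbouring year to compare against.
    """
    years = sorted(members_by_year)
    rows = []
    for i, year in enumerate(years):
        current = members_by_year[year]
        total = len(current)
        new_share = None
        outgoing_share = None
        if total and i > 0:
            new_share = len(current - members_by_year[years[i - 1]]) / total
        if total and i < len(years) - 1:
            outgoing_share = len(current - members_by_year[years[i + 1]]) / total
        rows.append((year, total, new_share, outgoing_share))
    return rows


# Hypothetical sample: three years, members named by letter.
sample = {
    1: {"A", "B", "C", "D"},
    2: {"A", "B", "E", "F"},
    3: {"A", "E"},
}

for row in yearly_flow(sample):
    print(row)
```

Because the total changes every year, the shares are always relative to that year's own total, which is what makes the yearly shapes comparable.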

ebeshero commented 6 years ago

@zme1 Think about how your markup can help you identify the "newbies" and the "exiles". Once you can do that, you can calculate with it (for those geometric shapes) and output columns of "node attributes" for a network, for example. Can you tell a newbie by their absence from earlier minutes? How can you recognize someone as outgoing? Do you have anyone returning after a long absence? Is this information XPath-able?
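
One possible way to turn those questions into node attributes, sketched in Python. It assumes the markup has already been queried (with XPath or otherwise) into a flat list of `(name, year)` appearances; that list, the function names, and the letter names are all hypothetical. It also treats "appears in the minutes" as a stand-in for "is a member", which may not hold for every project.

```python
import csv
import io
from collections import defaultdict


def node_attributes(appearances, first_year, last_year):
    """Classify each person by when they first and last appear.

    newbie             = first appearance is after the first year of the data
    exile              = last appearance is before the last year of the data
    returned_after_gap = at least one year is missing between first and last
    """
    years_seen = defaultdict(set)
    for name, year in appearances:
        years_seen[name].add(year)

    rows = []
    for name in sorted(years_seen):
        years = sorted(years_seen[name])
        start, end = years[0], years[-1]
        rows.append({
            "name": name,
            "first_year": start,
            "last_year": end,
            "newbie": start > first_year,
            "exile": end < last_year,
            "returned_after_gap": len(years) < end - start + 1,
        })
    return rows


def to_csv(rows):
    """Serialize the attribute rows as CSV text, one row per node."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical appearances: (name, year) pairs over a three-year span.
sample = [
    ("A", 1), ("A", 2), ("A", 3),
    ("B", 2), ("B", 3),
    ("C", 1), ("C", 3),
    ("D", 1),
]

attributes = node_attributes(sample, first_year=1, last_year=3)
print(to_csv(attributes))
```

The CSV has one row per person, so it can serve as a table of node attributes alongside a separate edge list.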

ebeshero commented 6 years ago

@zme1 One more thought about the flow of new members into or out of the Lega, and it's not a network necessarily (or it's a really simple one). Remember @setriplette 's graph she was showing us last night, where she collapsed the male and female speakers in her plays into just two nodes, "m" vs. "f", and showed all the emotion language shared between them? What if you plotted simply "newbies", "exiles", and "ongoing" people as three different shapes, sized proportionally with one another as they are per year? You might experiment: As Lega gets bigger in certain years, the shapes might be proportionally larger than other years.

Returning to networks a moment (because I've been plotting them all day with my students here): think about what's shared among individuals in the Toscana:

time together in the Lega (simplest relationship)
proposals lobbied and supported (what we discussed last week)
other kinds of experiences (?): attendance in meetings vs. absence from meetings; giving / receiving benefits (= sick compensation, etc.)
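
For the simplest relationship in that list (time together), here is a minimal Python sketch of an edge list in which every pair of people seen in the same year gets one edge labelled with that year. The `members_by_year` input and the letter names are hypothetical. Filtering the edges by year gives the data for one yearly sub-network.

```python
from itertools import combinations


def shared_year_edges(members_by_year):
    """Build (source, target, year) edges for every pair sharing a year."""
    edges = []
    for year in sorted(members_by_year):
        # Sort names so each pair is always emitted in the same order.
        for source, target in combinations(sorted(members_by_year[year]), 2):
            edges.append((source, target, year))
    return edges


def edges_for_year(edges, year):
    """Keep only the edges for one year (one yearly sub-network)."""
    return [edge for edge in edges if edge[2] == year]


# Hypothetical sample: two years of membership.
sample = {
    1: {"A", "B", "C"},
    2: {"A", "C"},
}

all_edges = shared_year_edges(sample)
print(all_edges)
print(edges_for_year(all_edges, 2))
```

Note that a fully connected year grows quickly: n people produce n(n-1)/2 edges, so a year with many names will make a dense graph.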

zme1 commented 6 years ago

@ebeshero The one huge caveat of using a network to visualize the proportions of members in the three classes that you described is that there was no manifest or Lega register that I used as a reference... The only information I have on who was actually in the Lega during this time period was those whose names appeared in the minute logs for one reason or another... Another issue with it is that members do take leaves of absence, but those are rarely noted explicitly in the notes (and the times they are cited in those logs, they are actually just classified with those who are actually leaving indefinitely). There is no information on members who were in the Lega but who don't appear in the minutes (of which I suspect there are probably many). That said, I think the most viable data that can be mined from this given information is simply a juxtaposition of administrations to see under whom members came or went.

I do think, however, that maybe there is a chance to represent compensation to members, in a way that resembles @setriplette 's graph that we saw yesterday. The target nodes' sizes could potentially represent net money given to a member, while the edges represent the number of different times a specific member received compensation. In the spirit of my desire to have at least 2 or 3 network visualizations, this could be a cool exploration of the data.
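
A small Python sketch of how that compensation network could be tabulated, assuming the payments can be pulled out as `(member, amount)` pairs. The pair format, the single "Lega" source node, the letter names, and the amounts are all hypothetical. Each member's node value is the total received, and each edge weight is the number of separate payments.

```python
from collections import defaultdict


def compensation_network(payments, source="Lega"):
    """Summarize payments as node sizes and weighted edges.

    nodes: member -> total amount received (candidate for node size)
    edges: (source, member, number_of_payments) (candidate for edge weight)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for member, amount in payments:
        totals[member] += amount
        counts[member] += 1

    nodes = {member: totals[member] for member in sorted(totals)}
    edges = [(source, member, counts[member]) for member in sorted(counts)]
    return nodes, edges


# Hypothetical payments: (member, amount) pairs.
sample = [("A", 10.0), ("B", 5.0), ("A", 2.5)]

nodes, edges = compensation_network(sample)
print(nodes)
print(edges)
```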

ebeshero commented 6 years ago

@zme1 Ah, I see the problem, then, with representing proportions of totalities--you can't see the whole, but you can see active members particularly. This makes me think that networking individuals may be a way of showing particles drifting into and out of activity over distinct years...You can at least determine which dates in the minutes are associated with specific people you've identified.

zme1 commented 6 years ago

@ebeshero Yes, that data is very easily accessible to me. I'll fiddle around with using networks to visualize this data, but only after I can make my first successful network on the proposals to prove to myself that I can depend on Cytoscape......

zme1 / toscana

Visualizations #15