zme1 closed 6 years ago
@zme1 These are good questions! Before anything else, though, RTF isn't plain text, and if you view the file online, you see the raw RTF coding, which makes it hard to read. I’d suggest replacing it with either real plain text or Markdown.
With that out of the way, I also often find it challenging to choose the right visualization. I don’t have specific suggestions at the moment (although I’d be happy to look at the data with you, if you think that would be helpful), but when I’m facing that sort of challenge, I often start by reviewing the tutorials and browsing the chart and graph galleries at the links at http://dh.obdurodon.org/#visualization, thinking about how well each option fits the data. My other starting point is to ask not just what the data look like, but also what story I want to tell about them through visualization. Let me know if you’d like to meet to look at some of the specific data sets together.
@djbpitt I just converted the file to plain text. Sorry about that. I forgot how the .rtf files were rendered in GitHub...
If you would be available to meet at some point to review my data with me, that'd be great. In the meantime, I'll explore my data a bit more and try to tease out a more complete idea of what kind of portrait they'll paint for the user...
@zme1 I have to be on campus for Skype interviews all morning tomorrow (Thu, 03-15). How about if we meet sometime in the afternoon? If you need more time to think about the data and the visualizations before we meet, that’s fine too, of course, but if you’ll be ready tomorrow, I’ll be on campus anyway.
@djbpitt That's fine with me! I am totally free between 12:50 and 2:30 tomorrow afternoon, if that time window works for you. I'll do some soul-searching tonight with my graphs.
@zme1 Let’s try 1:00 in my office. See you then!
@djbpitt Thanks for scheduling the data viz counseling session tomorrow! @zme1, I wish I could join you both! Meanwhile I'm thinking about your visualization text file, and here are some thoughts:
Stacked bars can become stacked shapes: squares for instance, disembodied from drawn axes to the extent that the complete object is supposed to represent a whole. You might just play a little here. (This is something I've been meaning to experiment with in some old projects of mine...)
The flow of members into and out of the Lega is interesting to conceptualize visually! What if you tried drawing that by hand? Is "Lega" a simple shape? (how about a square or a circle?) Would you like to show, over the seven year period of your data, a proportional representation of who's joining along with who's leaving for each year? I'd want to first determine the total number of members in each year. Then find out what proportion of that total is new, and what proportion is on its way out (= not in the organization next year). This is mathematically interesting (to me anyway!) because your total number keeps shifting. Each year you can expect a different proportion of "newbies" and "exiles". You've probably heard me or @djbpitt complain about pie graphs, but those are bad when they have more than 3 or 4 different kinds of things to compare. Here, I imagine a simple geometric shape that you might subdivide to show its internal composition of new vs. ongoing vs. outgoing. That's one way, but there may be others...
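A minimal sketch of that yearly calculation, in Python, assuming the names have already been pulled out of the minutes into one set per year. The function name `yearly_composition`, the `members_by_year` structure, and the sample names and year numbers are all invented for illustration; nothing here comes from the actual Lega data.

```python
def yearly_composition(members_by_year):
    """Count new and outgoing members per year.

    members_by_year: dict mapping a year to the set of names seen that year
    (a hypothetical structure, not the project's real data format).
    """
    years = sorted(members_by_year)
    rows = []
    for i, year in enumerate(years):
        present = members_by_year[year]
        # Everyone seen in any earlier year of the data.
        seen_before = set().union(*(members_by_year[y] for y in years[:i]))
        new = present - seen_before
        # "Outgoing" = present this year but absent next year.
        # For the last year there is no next year, so it stays unknown.
        if i + 1 < len(years):
            outgoing = present - members_by_year[years[i + 1]]
        else:
            outgoing = None
        total = len(present)
        rows.append({
            "year": year,
            "total": total,
            "new": len(new),
            "outgoing": None if outgoing is None else len(outgoing),
            "new_share": len(new) / total if total else 0.0,
            "outgoing_share": (
                None if outgoing is None or not total else len(outgoing) / total
            ),
        })
    return rows


# Invented sample: three years, four people.
sample = {1: {"A", "B"}, 2: {"B", "C"}, 3: {"B", "C", "D"}}
for row in yearly_composition(sample):
    print(row)
```

Two caveats with this sketch: everyone in the first year counts as "new" simply because there is no earlier year to compare against, and a person can be both new and outgoing in the same year, so the two shares need not add up to a clean partition of the total.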
But the "flow" in and "flow" out suggests arrows in and out--the edges of a network graph, perhaps, with a graph plotted for each year: Who's joining and who's leaving in each year? You could plot this as a whole, but it's also common in network graphs to generate sub-networks. If you imagine edge-connectors as years that people share in an organization together, that's something you can plot as a network. In Cytoscape, you can then select all the edges for a particular year, and the nodes connected with them, and output that as a selection from the larger network graph. (I think I was showing you this last week--and we can go over it...I'd love to see how this looks in distinct yearly networks.)
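One way the "shared years as edges" idea might be sketched in Python, assuming one set of names per year is already available. The function names, the `members_by_year` structure, and the sample data are hypothetical; the output is just a list of (source, target, year) rows that could be written out as a table for a network tool.

```python
from itertools import combinations


def shared_year_edges(members_by_year):
    """Build one edge per pair of people who appear in the same year.

    Each edge is a (person_a, person_b, year) tuple, so the year works
    as an edge attribute that can be filtered on later.
    """
    edges = []
    for year in sorted(members_by_year):
        # Sort names so each pair is always listed in the same order.
        for a, b in combinations(sorted(members_by_year[year]), 2):
            edges.append((a, b, year))
    return edges


def edges_for_year(edges, year):
    """Select the sub-network for a single year."""
    return [edge for edge in edges if edge[2] == year]


# Invented sample data.
sample = {1: {"A", "B"}, 2: {"A", "B", "C"}}
all_edges = shared_year_edges(sample)
print(all_edges)
print(edges_for_year(all_edges, 1))
```

Note that the number of edges grows quickly with the number of people per year (every pair gets an edge), so yearly sub-networks may be easier to read than the whole.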
@zme1 Think about how your markup can help you identify the "newbies" and the "exiles". Once you can do that, you can calculate with it (for those geometric shapes) and output columns of "node attributes" for a network, for example. Can you tell a newbie by their absence from earlier minutes? How can you recognize someone as outgoing? Do you have anyone returning after a long absence? Is this information XPath-able?
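As a rough illustration of the "returning after a long absence" question, here is a Python sketch that assumes the names per year have already been extracted from the markup (for example with XPath) into one set per year. The function name `find_returners`, the `min_missing` parameter, and the sample data are all made up for this example.

```python
def find_returners(members_by_year, min_missing=1):
    """Find people who disappear from the record and later come back.

    Returns a dict mapping each such name to a list of
    (last_year_seen, year_of_return) pairs. A gap counts only if at
    least `min_missing` recorded years were skipped in between.
    """
    years = sorted(members_by_year)
    appearances = {}
    for position, year in enumerate(years):
        for name in members_by_year[year]:
            # Positions are appended in ascending order because the
            # years are visited in sorted order.
            appearances.setdefault(name, []).append(position)

    returners = {}
    for name, positions in appearances.items():
        gaps = [
            (years[a], years[b])
            for a, b in zip(positions, positions[1:])
            if b - a - 1 >= min_missing
        ]
        if gaps:
            returners[name] = gaps
    return returners


# Invented sample: "A" is missing in year 2 and back in year 3.
sample = {1: {"A", "B"}, 2: {"B"}, 3: {"A", "B"}}
print(find_returners(sample))
```

Absence here only means absence from the recorded years, so a gap found this way is a candidate to check against the minutes, not proof that someone actually left.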
@zme1 One more thought about the flow of new members into or out of the Lega, and it's not a network necessarily (or it's a really simple one). Remember @setriplette 's graph she was showing us last night, where she collapsed the male and female speakers in her plays into just two nodes, "m" vs. "f", and showed all the emotion language shared between them? What if you plotted simply "newbies", "exiles", and "ongoing" people as three different shapes, sized proportionally with one another as they are per year? You might experiment: As Lega gets bigger in certain years, the shapes might be proportionally larger than other years.
Returning to networks a moment (because I've been plotting them all day with my students here): think about what's shared among individuals in the Toscana:
@ebeshero The one huge caveat of using a network to visualize the proportions of members in the three classes that you described is that there was no manifest or Lega register that I used as a reference... The only information I have on who was actually in the Lega during this time period comes from the names that appear in the minute logs for one reason or another... Another issue is that members do take leaves of absence, but those are rarely noted explicitly in the notes (and when they are cited in those logs, they are just classified with those who are actually leaving indefinitely). There is no information on members who were in the Lega but who don't appear in the minutes (of which I suspect there are probably many). That said, I think the most viable data that can be mined from this information is simply a juxtaposition of administrations to see under whom members came or went.
I do think, however, that maybe there is a chance to represent compensation to members, in a way that resembles @setriplette 's graph that we saw yesterday. The target nodes' sizes could potentially represent net money given to a member, while the edges represent the number of different times a specific member received compensation. In the spirit of my desire to have at least 2 or 3 network visualizations, this could be a cool exploration of the data.
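A possible sketch of how the node and edge tables for that compensation network could be prepared in Python. It assumes each payment has been reduced to a (member, amount) pair and treats a single hypothetical "Lega" node as the source of every payment; the function name, field names, and sample figures are invented, not taken from the minutes.

```python
from collections import defaultdict


def compensation_network(payments, source="Lega"):
    """Summarize payments into node and edge tables.

    payments: iterable of (member, amount) pairs (hypothetical format).
    Node attribute `total_received` could drive node size;
    edge attribute `times_paid` could drive edge width.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for member, amount in payments:
        totals[member] += amount
        counts[member] += 1

    nodes = [
        {"id": member, "total_received": totals[member]}
        for member in sorted(totals)
    ]
    edges = [
        {"source": source, "target": member, "times_paid": counts[member]}
        for member in sorted(counts)
    ]
    return nodes, edges


# Invented sample payments.
sample = [("A", 5.0), ("B", 2.0), ("A", 3.0)]
nodes, edges = compensation_network(sample)
print(nodes)
print(edges)
```

Keeping the money on the nodes and the number of payments on the edges means the two measures can be styled independently once the tables are loaded into a network tool.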
@zme1 Ah, I see the problem, then, with representing proportions of totalities--you can't see the whole, but you can see active members particularly. This makes me think that networking individuals may be a way of showing particles drifting into and out of activity over distinct years...You can at least determine which dates in the minutes are associated with specific people you've identified.
@ebeshero Yes, that data is very easily accessible to me. I'll fiddle around with using networks to visualize this data, but only after I can make my first successful network on the proposals to prove to myself that I can depend on Cytoscape...
I just pushed a new file to my repository with a list of the visualizations I want most to make for the project. It's the plain text file called visualization_ideas.rtf. I think a big problem I'm seeing already is that I am having trouble coping with the huge raw numbers that I have when looking at these minutes. For instance, there are 51 Lega members who received sick comp at some point in those 7 years. I could do a year-by-year analysis, but then I lose the aggregation of the data, which makes it much less interesting to me. I also am generally struggling in conceptualizing how to present these data sets, although I have my best ideas listed with their respective forms of data in the brainstorm sheet. @ebeshero Do these ideas for data sets seem interesting? Do they seem feasible? Are my ideas of visualizing them on the right track? As a person who has not generated more than 3 or 4 different types of visualizations for very simple data sets, I want to try my best to make attractive and more insightful, diverse graphs...