obdurodon / dh_course

Digital Humanities course site
GNU General Public License v3.0

Test #57

Closed Idi0teque closed 5 years ago

Idi0teque commented 6 years ago

Hey, just wondering when the XSLT test will be up? Also, Prof. Birnbaum never did mention: what's with the dislike of pie charts?

djbpitt commented 6 years ago

@Idi0teque The XSLT test is up; there's a link to it in the XSLT section of our main course page. We'll also add an “assignment” on CourseWeb so that there will be a place to upload your work.

As for pie charts: what do you think about their strengths and weaknesses? A standard compound question to ask about a visualization is: What story does the visualization tell, and how effectively does it tell it?

JosephDRogers23 commented 6 years ago

I've always thought that pie charts are good for finite data - that is, when the data set adds up to 100% and nothing is missing, such as percentages of hair color in a classroom. I was thinking it through yesterday, and I guess pie charts wouldn't work as well for our markup if there's ambiguity in the tagging. For example, if we mark up elements of magic in our magic realist texts and someone else could interpret a magic event as covering more or fewer words, then a pie chart of the character length of all magic elements in a story wouldn't make much sense. However, I think a pie chart would make sense for displaying the distribution of authors of the stories we marked up, since there's a finite number of them (although that would be a lot of effort for such a simple and kind of useless chart) - does anyone disagree?

djbpitt commented 6 years ago

@JosephDRogers23 Thanks, Joe! Here are a few questions (to everyone) to help keep the discussion open:

Concerning pie charts in general:

  1. Does the number of wedges influence how effectively pie charts communicate their message?
  2. How easy is it to judge the difference in the size of wedges that are not very different from one another?
  3. How do pie charts compare to other visualizations with respect to the two preceding questions?

Concerning ambiguous data:

  1. To what extent are ambiguities more of a challenge in pie charts than in other kinds of visualizations?
  2. Could ambiguity be addressed by having a separate wedge for “maybe”, alongside those for “definitely yes” and “definitely no”?

richiebful commented 6 years ago

Concerning pie charts in general:

I've heard that humans are better at understanding differences in length than differences in area, and that's why you should use a bar chart or something else to measure a "parts of a whole" relationship. That might play into both question 1 and 2 about pie charts in general.

This is the wrong answer, but pie charts seem like they'd take a bit of work to render in SVG, so I'd rather stick with a bar chart or use a different programmatic way of generating pie charts like matplotlib (Python visualization library).
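To give a sense of the work involved, here is a minimal sketch (in Python, purely to show the arithmetic) of what drawing pie wedges as SVG paths requires: each wedge needs its start and end points computed with sine and cosine, plus the arc flags for the SVG `A` path command. The function name `pie_wedge_paths` and the default center and radius are assumptions made for illustration, not anything from the course materials.

```python
import math

def pie_wedge_paths(values, cx=100.0, cy=100.0, r=80.0):
    """Return one SVG path 'd' string per value, clockwise from 12 o'clock.

    Assumes at least two positive values; a single 100% wedge would need
    a <circle> instead, because its arc would start and end at one point.
    """
    total = sum(values)
    paths = []
    angle = -math.pi / 2  # start at the top of the circle
    for value in values:
        sweep = 2 * math.pi * value / total  # wedge angle in radians
        x1 = cx + r * math.cos(angle)
        y1 = cy + r * math.sin(angle)
        angle += sweep
        x2 = cx + r * math.cos(angle)
        y2 = cy + r * math.sin(angle)
        # The large-arc flag must be 1 for wedges wider than half the circle
        large_arc = 1 if sweep > math.pi else 0
        paths.append(
            f"M {cx:.2f} {cy:.2f} L {x1:.2f} {y1:.2f} "
            f"A {r:.2f} {r:.2f} 0 {large_arc} 1 {x2:.2f} {y2:.2f} Z"
        )
    return paths

# Each string can be used as the d attribute of an SVG <path> element
for d in pie_wedge_paths([3, 1]):
    print(d)
```

By comparison, a bar only needs a position, a width, and a height, which is part of why bar charts are less work to generate by hand.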

richiebful commented 6 years ago

I didn't put a lot of thought into other parts of the chart, but here's a stacked bar chart example of "parts of a whole" data. It could be a lot less complicated if you're just showing one category (Trump approval rating in PA alone).

[Image: fakenewspolling]

gabikeane commented 6 years ago

hello all! I'm back and I'm alive! Firstly, I'll point out that while libraries are awesome for data visualization, you can't use them for this class. If you can't make it yourself, sadly, you can't do it here. Libraries are useful, but for our purposes not productive for the kind of learning we're doing. Secondly, that's a nice stacked bar chart! How'd you make it?

richiebful commented 6 years ago

I'm confused. I'm not allowed to make charts with a library for our group project?

I made the chart in LibreOffice Calc (Excel, but Linux compatible) but you can do something similar in Google Sheets. Just select the data, insert a chart, and flip the stacking option on. For how I represented the data, I had to check the "Switch Rows and Columns" box.

[Image: screenshot from 2018-03-25 21-25-52]

gabikeane commented 6 years ago

As it says in the course description, we emphasize coding it yourself. While those programs are useful, they aren't original. From our Course Description: "Because the primary purpose of the project is for students to learn how to use the Digital Humanities tools and methods employed by professionals, the instructors will work with you during your weekly project meetings to help you learn to apply the necessary methods to your own research, and obtaining your results according to those methods is part of the evaluation of the project."

gabikeane commented 6 years ago

So the stacked bar chart you made in Excel is pretty easy to create with XSLT and SVG. Building it yourself lets you customize and transform it however you want, and if your input data changes, you can regenerate the chart quickly and easily. We'll practice doing stacked bar charts in the upcoming assignments, and you can adapt what you learn in those assignments for your project.
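The arithmetic behind a stacked bar is small: each segment's height is its share of the total, and each segment starts where the previous one ended. Here is a rough sketch of that logic in Python, only to show the calculation; in the course you would express the same idea in XSLT. The function name `stacked_bar_rects`, the default sizes, and the sample numbers are all made up for illustration and are not real polling data.

```python
def stacked_bar_rects(series, x=10, bar_width=40, top=10, height=200):
    """Return SVG <rect> strings for one stacked bar.

    series is a list of (label, value, color) tuples. Labels are inserted
    as-is, so this sketch assumes they contain no XML special characters.
    """
    total = sum(value for _, value, _ in series)
    rects = []
    y = float(top + height)  # start at the baseline and stack upward
    for label, value, color in series:
        segment = height * value / total  # this segment's share of the bar
        y -= segment  # SVG y grows downward, so moving up means subtracting
        rects.append(
            f'<rect x="{x}" y="{y:.1f}" width="{bar_width}" '
            f'height="{segment:.1f}" fill="{color}"><title>{label}</title></rect>'
        )
    return rects

# Illustrative values only
sample = [("approve", 40, "green"), ("disapprove", 50, "red"), ("unsure", 10, "gray")]
for rect in stacked_bar_rects(sample):
    print(rect)
```

Because the heights are computed from the data, changing the input values and rerunning is all it takes to update the chart.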

rmb165 commented 6 years ago

I'm having a bit of trouble with the part of the test where I list the characters in each scene. I'm using the for-each function, but it lists the characters in every scene rather than just the scene they are supposed to be in.

gabikeane commented 6 years ago

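for-each will iterate over all of the nodes that its select attribute picks out, so if that selection isn't limited to one specific scene, you'll get the characters from every scene. If you want to match the characters for one specific scene, for-each might not be the best approach.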

djbpitt commented 6 years ago

@rmb165 Separately from the question of whether you use push processing (<xsl:template> and <xsl:apply-templates>) or pull processing (<xsl:for-each>), you have to be sure that you’re processing only the nodes you want to process. If you want to process the nodes of a particular type (e.g., <character> elements) for just one scene and you’re getting all of the nodes of that type in the entire document, the most common explanation is that you’re selecting nodes to process with an XPath expression that begins with a double slash. Since an XPath expression that begins with a double slash looks on the descendant axis from the document node, you get all nodes of that type in the entire document. If you want just the nodes in a particular context, don’t begin your XPath expression with a double slash. Start instead from the context within which you want to operate.
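The same effect can be seen outside XSLT. The sketch below uses Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath, to contrast searching from the top of the document with searching from the current scene. The sample markup, the names in it, and the helper `characters_by_scene` are hypothetical, not taken from the test file.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; element names and contents are assumptions
PLAY = """
<play>
  <scene n="1"><character>Anna</character><character>Boris</character></scene>
  <scene n="2"><character>Clara</character></scene>
</play>
"""

def characters_by_scene(root, from_document=False):
    """Map each scene number to the characters found for it."""
    result = {}
    for scene in root.iter("scene"):
        # Starting from the root ignores which scene we are in (like a
        # leading // in XPath); starting from the scene keeps the context.
        start = root if from_document else scene
        result[scene.get("n")] = [c.text for c in start.findall(".//character")]
    return result

root = ET.fromstring(PLAY)
print(characters_by_scene(root, from_document=True))  # every scene lists everyone
print(characters_by_scene(root))                      # each scene lists its own
```

In XSLT terms, when the current context is a scene, a select value such as `//character` behaves like the first call, while a relative path such as `character` or `.//character` behaves like the second.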

To return to the question of push vs pull: It may feel more natural to new coders to use pull processing (<xsl:for-each>), but it’s often more XSLT-idiomatic to use push (<xsl:template> and <xsl:apply-templates>). There are a lot of situations where the choice doesn’t matter, but because learning to use push takes more practice, and because there are situations where the choice does matter, we’d encourage you to train yourself to use push unless there’s a good reason not to.

mtm80 commented 6 years ago

What kinds of uses is everyone finding for SVG charts? What do you plan to track in your research? In the case of the Russian project, I'm thinking that we could track trends like whether the candidates prefer to use verbs, nouns, adjectives, etc. and make charts by candidate to quantify speech patterns. Maybe we could also track references to particular topics like the economy, foreign relations, and the military. This would obviously be a useful tool for our analysis but also for anyone who uses our site.

brucknerp commented 6 years ago

@mtm80 We've been tracking sentence-final particles, pronouns, and honorifics in our songs, and are considering doing something similar to what you mentioned, but quantifying linguistic feature use by gender of the feature, gender of the singer, and time period.

However, I feel like our data would be more fun to interact with if it had some sort of JavaScript functionality, so I'm not sure how our visuals are going to work yet. I'm trying to think of something besides a bar graph.