ycoady / UVic-Software-Evolution

Welcome to our extravaganza in software (r)evolution!!!
Creative Commons Zero v1.0 Universal

Lab 7: Data Visualization #13

Open davidsjohnson opened 9 years ago

davidsjohnson commented 9 years ago

For this week's lab, let's explore some available visualization packages. I noticed many groups were using Google Sheets or Excel for generating plots and graphs. While these work, they tend to be a fairly manual process. There are a number of good libraries for plotting data in easy and meaningful ways.

So for this post, generate two graphs from any dataset using a visualization library of your choice. Feel free to use the data from project 1 or the data you plan to use for project 2 (if different).

Some examples you can use:

Hoverbear commented 9 years ago

If you're looking for an easier way to use D3, C3 has some reusable charts.

There is also Raw but I haven't had much success with it.

@brodyholden, @fraserd, and I added visualizations to our tool. Cool!

screen

knowlesc commented 9 years ago

For Project 1 we used a Python library called pygal, which creates SVG charts. I made the data points in the SVG files clickable so that each one links to the specific file on GitHub, which was really handy for checking out outliers.

Here are some screenshots of graphs from our project: click
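A minimal sketch of how clickable data points like these could be built with pygal, which accepts a dict per value with an `xlink` key. The repository URL, branch name, metric, and the helper names `build_points` and `render_chart` are all hypothetical; this is not the code used in the project.

```python
def build_points(file_stats, repo_url, ref="master"):
    """Turn (path, value) pairs into pygal value dicts, one link per point.

    repo_url and ref are assumptions; the link follows GitHub's
    <repo>/blob/<ref>/<path> layout.
    """
    points = []
    for path, value in file_stats:
        points.append({
            "value": value,
            "label": path,  # shown in the tooltip
            "xlink": f"{repo_url}/blob/{ref}/{path}",  # makes the point clickable
        })
    return points


def render_chart(points, out_path, title="Lines changed per file"):
    """Render the points as an SVG bar chart (requires `pip install pygal`)."""
    import pygal  # imported here so build_points works without pygal installed

    chart = pygal.Bar()
    chart.title = title
    chart.add("files", points)
    chart.render_to_file(out_path)
```

For example, `render_chart(build_points([("src/core.js", 120)], "https://github.com/example/repo"), "chart.svg")` would write an SVG whose bar opens the file's page when clicked.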

paulmoon commented 9 years ago

We used matplotlib to plot our data from Project 1. We read our data from CSV files, set the labels and plot configurations, and out came these graphs!

Bootstrap

matplotlib_bootstrap

Backbone

matplotlib_backbone

For reference, the graphs from Google Sheets can be seen here.
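The CSV-to-plot workflow described in this comment might look roughly like the sketch below. The column names, axis labels, and the function names `read_series` and `plot_series` are assumptions for illustration, not the group's actual script.

```python
import csv


def read_series(csv_path, x_col, y_col):
    """Read two columns from a CSV file with a header row."""
    xs, ys = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            xs.append(row[x_col])
            ys.append(float(row[y_col]))
    return xs, ys


def plot_series(xs, ys, title, out_path, x_label="Release", y_label="Value"):
    """Plot the series as a line chart and save it (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    positions = range(len(xs))
    ax.plot(positions, ys, marker="o")
    ax.set_xticks(positions)
    ax.set_xticklabels(xs, rotation=45, ha="right")
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_title(title)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

Calling `plot_series(*read_series("bootstrap.csv", "release", "commits"), "Bootstrap", "bootstrap.png")` would produce one graph per project; the file and column names here are placeholders.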

Brayden-Arthur commented 9 years ago

We're taking a look at the JFreeChart library for Java. It can read all of the data we need from pretty much any data source and output it as images (PNG, JPG, etc.) or vector files (PDF, etc.). Here's an example using some fake data.

Fake Data

Jeremy Kroeker, Brayden Arthur

Bleech94 commented 9 years ago

IMG

In our analysis of the Apache Ant mailing list archives, we had to parse the mbox files into CSV files using Python libraries before any meaningful work could be done with the data. Once we had the CSV files, we ran an analysis on the data using a Java algorithm. Once the data was processed, we used LibreOffice Calc to create graphs of our data. Tada!
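As an illustration of the mbox-to-CSV step, here is a minimal sketch using Python's standard `mailbox` and `csv` modules. The choice of headers to export and the function name `mbox_to_csv` are assumptions; the post does not say which libraries or fields were actually used.

```python
import csv
import mailbox


def mbox_to_csv(mbox_path, csv_path):
    """Write one CSV row per message in an mbox file; return the message count."""
    box = mailbox.mbox(mbox_path)
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "from", "subject"])
        for msg in box:
            # str() guards against non-string header objects; missing headers become ""
            writer.writerow([
                str(msg.get("Date", "")),
                str(msg.get("From", "")),
                str(msg.get("Subject", "")),
            ])
            count += 1
    box.close()
    return count
```

The resulting CSV can then be handed to whatever tool does the analysis or plotting.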

-Brandon Leech and Jorin Weatherston

EvanHildebrandt commented 9 years ago

We used matplotlib to analyze the same data we collected for project 1.

Legend

The scale and y axis of the graphs show the number of additions and deletions.

Green means that the added/deleted ratio was ~1 and thus might be a refactor.

Red means that a large number of files were deleted and few added.

Blue means that a large number of files were added but not many deleted.
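The colour rule in the legend could be expressed as a small function like the one below. The 25% tolerance around a ratio of 1, the handling of zero counts, and the name `classify_change` are assumptions; the post does not give the exact cut-offs used.

```python
def classify_change(additions, deletions, tolerance=0.25):
    """Map addition/deletion counts to a legend colour.

    green: ratio close to 1 (possible refactor)
    red:   deletions clearly dominate
    blue:  additions clearly dominate
    """
    if additions == 0 and deletions == 0:
        return "green"  # assumption: nothing changed counts as balanced
    # Avoid dividing by zero: pure additions have an unbounded ratio.
    ratio = additions / deletions if deletions else float("inf")
    if abs(ratio - 1.0) <= tolerance:
        return "green"
    return "blue" if ratio > 1.0 else "red"
```

The returned string can be passed straight to matplotlib's `color` argument when drawing each point or bar.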

The first graph shows JQM

jquery-mobile

It shows how volatile the project has been and how its refactors come in waves.

The second graph shows Rails

rails

You can see that there was a period of many refactors, but there has not been a big one since 2010 apart from the one outlier in 2013.

Team

Evan Hildebrandt, Jason Syrotuck, Keith Rollans

DigitalCoffee commented 9 years ago

Andrew and Devin-

We ran out of time! We tried to get matplotlib working on the lab computers, without success. In the last few minutes, we tried using pygooglechart instead, which hits the Google Charts API (currently deprecated and soon to be turned off, but it works right now!). We didn't have time to make it pretty or to run it on our whole dataset, but here is an example chart we made based on 9 months of data.

line-stripes

mitchellri commented 9 years ago

We're working on it! Please stand by

During the lab we had issues with the lab computers and matplotlib, so we could not finish this on time. We do, however, plan to post our graphs once we get it working.

Mitchell Rivett, Tyler Potter