Closed venkatrajam closed 3 years ago
Tool Selected - Timeline.js
Introduction -
TimelineJS is an open-source tool that enables anyone to build visually rich, interactive timelines. It enables storytelling through timelines. Beginners can create a timeline using nothing more than a Google spreadsheet. Experts can use their JSON skills to create custom installations, while keeping TimelineJS's core functionality. The timeline can be published directly or embedded in one’s website.
Some examples -
Revolutionary User Interfaces
The Republican Run Up
Whitney Houston
How Wine Colonised The World
How ISIS expanded in one year from 2 countries to 10
How to use -
4 Simple Steps
Step 1: Create a Google Spreadsheet and enter the details.
Step 2: Publish it to the web.
Step 3: Copy and paste the URL to the specified location on Timeline.js.
Step 4: Share the link directly or embed it on your website.
Sample Data -
Narrating the story of design projects done by me over 4 years. Here is the link to my class presentation.
Some Tips -
Do not leave a row blank. The visualisation will not work.
See the list of media that are supported for publication. YouTube also works, although it is not in the list.
The easiest way to publish personal pictures/photographs taken by yourself as part of the visualisation is to upload them to your Google Drive and then paste the link (opinion).
Enter BCE years as negative numbers, for example -100 to denote 100 BCE.
Read the FAQs. They cover everything.
The visualisation you publish will be publicly available to everyone, so be careful about what media you publish.
Do not keep more than 15 events, as the viewer cannot tell when the timeline will end (opinion).
Avoid narratives that require a lot of back and forth, as it is not very convenient to keep switching between events (my opinion).
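For those taking the JSON route mentioned in the introduction instead of a spreadsheet, here is a minimal Python sketch of how the data could be assembled. It assumes the TimelineJS 3 JSON layout (a `title` object plus an `events` list with `start_date`, `text` and optional `media`); the `make_event` helper, the headlines and the example.com URL are hypothetical placeholders, not part of the tool.

```python
import json


def make_event(year, headline, text, media_url=None):
    """Build one event dict in the assumed TimelineJS 3 JSON shape."""
    event = {
        # A negative year denotes BCE, e.g. -100 for 100 BCE.
        "start_date": {"year": str(year)},
        "text": {"headline": headline, "text": text},
    }
    if media_url:
        event["media"] = {"url": media_url}
    return event


# Hypothetical sample data; headlines and URL are placeholders.
timeline = {
    "title": {"text": {"headline": "My design projects", "text": "Four years of work"}},
    "events": [
        make_event(-100, "An event in 100 BCE", "<p>Basic HTML tags are allowed.</p>"),
        make_event(2019, "First project", "Poster series", "https://example.com/poster.jpg"),
    ],
}

timeline_json = json.dumps(timeline, indent=2)
print(timeline_json)
```

The printed JSON is only the data; loading it into a timeline on a web page is a separate step described in the TimelineJS documentation.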
Overall -
Timeline.js serves as an objective storytelling tool that allows one to include different types of media. The best part about the tool is that it works on a Google spreadsheet which, when updated, directly updates the visualisation as well. It is well documented and easy to learn, and it makes creating communicative interactive timelines easy. From a design perspective, it provides some flexibility: the background colour and the choice of font pairs can be edited from Google Sheets. Experts in coding can make more changes to the design attributes. Details such as creating paragraphs with basic HTML tags and grouping events into categories in the Google Sheet itself make the tool accessible. I think the tool is very powerful and useful for communicating stories online, given the different types of media that build up a story in the digital world (e.g. social media, online videos, audio), especially for news, which consists of chronological events, and for the lifetimes of people.
Limitations -
The timeline can only be laid out horizontally.
Visual design parameters are not flexible enough.
There is no separate dashboard for manipulating design parameters; they are distributed across different parts of the tool.
There is no complete control over the units of the timeline.
Additional Information -
The timeline tool has now been developed to allow one to choose a particular event on a timeline and skip to it. One can also zoom in and out of the timeline's scale. Knight Lab has also developed similar interesting storytelling tools, which you can check out here.
About the Tool
Orange is an open-source, component-based visual programming software package used for data visualization, machine learning, data mining, and data analysis. It is a very powerful tool and works like a one-stop solution: it covers pre-processing the data, visualizing the dataset using graphs, built-in machine learning algorithms, and test-and-score features for measuring the accuracy of an algorithm on different datasets.
Components in Orange are called widgets, and visual programming is implemented through an interface where widgets are connected to form workflows.
Pros:
Cons:
The Orange tool is well documented on its website, which provides examples, workflows and tutorials for working with the tool. On an overall note, the tool is really fast and powerful. It's easy to learn and very helpful for rapid visualizations. Tutorials for all the features are clearly provided on its YouTube page.
References Here are the links to the tutorials and examples that I went through
About
Get Started
Some of the features
About map tiles
Some examples
Other options for map APIs
Palladio is a product of the Humanities + Design lab at Stanford University. As it is a relatively new tool and still under active development, you may find some bugs while using it. The following are the main features of Palladio:
Maps
Graphs
List view
Gallery view
Apart from these four main features, there are three filters which are very useful for analysing data in the above-mentioned visualizations.
Timespan: Used to analyse how a specific dimension has changed over a period of time.
Facets: Visualizations can be made and analysed by selecting values of one or more dimensions.
Timeline: Used to visualise and analyse how time-dependent attributes vary over time.
Overall, I realized Palladio can be a great tool for analysing large datasets with many attributes, especially in relation to location and time. However, it may not be an efficient tool for presenting the data aesthetically.
Click here to access my presentation
Introduction
Google Data Studio is a free tool offered by Google that turns your data into informative, easy to read, easy to share, and fully customizable dashboards and reports.
How to use it
Features
Some Examples
You can find them as soon as you open Google Data Studio.
Pros
Cons
Tips and tricks
My presentation + video recording Would be added after the presentation.
A sample that I created on Google Data Studio Would be added after the presentation.
What is Chart.js?
Chart.js is a JavaScript library for building flexible charts using the HTML5 canvas element. It is a community-built open-source library, with around 99 contributors so far. Chart.js was started in 2013 and is available under the MIT license.
What does it do? Chart.js has a bunch of different chart types:
Some Samples can be found on the Chart.js website.
Prerequisites
A very basic understanding of object-oriented programming, or some idea of how JavaScript works, is enough. There are some tutorials and good documentation as well.
How it works
When there are very few data points, the data can be mapped manually. Larger datasets can be input via:
.csv files
Excel files
JSON
APIs
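As a rough illustration of the CSV route, the sketch below reshapes two CSV columns into the `type` / `data.labels` / `data.datasets` structure that a Chart.js configuration object uses. It is written in Python purely to show the data reshaping; the CSV content, column names and the `csv_to_chart_config` helper are hypothetical.

```python
import csv
import io
import json

# Hypothetical CSV content; in practice this would be read from a file.
CSV_TEXT = """month,visitors
Jan,120
Feb,95
Mar,143
"""


def csv_to_chart_config(csv_text, label_column, value_column, chart_type="bar"):
    """Turn two CSV columns into a Chart.js-style configuration dict."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        "type": chart_type,
        "data": {
            "labels": [row[label_column] for row in rows],
            "datasets": [
                {
                    "label": value_column,
                    "data": [float(row[value_column]) for row in rows],
                }
            ],
        },
    }


config = csv_to_chart_config(CSV_TEXT, "month", "visitors")
print(json.dumps(config, indent=2))
```

On the JavaScript side, an object of this shape is what gets passed as the configuration argument when constructing a chart.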
A simple chart I made in Chart.js:
Cons
Pros
About the company
Plotly is a technical computing company headquartered in Montreal, Quebec. The company offers a suite of analytics products.
About Chart Studio
Chart Studio is one of the fastest ways to create interactive charts online. It is a web-based tool with a library of visualization templates which can be used for data visualization. It is an open-source platform, and as a result any work done on a free account is kept public.
Motivation behind Plotly
Broad features of Plotly Chart Studio
Pricing
Types of charts you can create on Plotly Chart Studio
The image above shows all possible visualizations on Plotly. It is a fairly powerful tool with more chart types than competitors like Datawrapper or RAWGraphs. 3D visualizations are also provided.
Each chart type comes with 3 default options-
Plotly Pros
Plotly Cons
Caution while using Plotly
About Datawrapper
Datawrapper is a simple yet powerful, non-coding, web-based data visualization tool that can be used to create simple charts, maps and tables. Data visualizations created with Datawrapper can either be embedded as interactive data visualizations in your website or content management system, or be downloaded as static visualizations to be used in publications and documents or refined further in tools such as Illustrator or Photoshop.
It also offers community features in the form of The River, which is Datawrapper's publicly available collection of visualizations created by its users that may be used as inspiration or as a starting point for your own work.
How to Use Datawrapper
Datawrapper requires no coding skills and can be used right away by uploading Excel or CSV sheets, linking shareable Google Sheets documents, or simply copy/pasting data from those tools directly into an available text field that parses the input and detects the kind of information it contains (such as labels, strings and numbers).
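How Datawrapper actually detects column types is not documented here, but the general idea of parsing pasted, tab-separated data and guessing a type per column can be sketched as follows. Everything in this Python sketch (the `guess_type` helper, the three type names, the single accepted date format and the sample data) is an assumption for illustration only.

```python
from datetime import datetime


def guess_type(values):
    """Guess a column type from its string values: 'number', 'date' or 'text'."""

    def is_number(text):
        try:
            float(text)
            return True
        except ValueError:
            return False

    def is_date(text):
        # Assumption: only ISO-style dates (YYYY-MM-DD) are recognised.
        try:
            datetime.strptime(text, "%Y-%m-%d")
            return True
        except ValueError:
            return False

    if all(is_number(value) for value in values):
        return "number"
    if all(is_date(value) for value in values):
        return "date"
    return "text"


# Hypothetical pasted data, tab-separated as when copied from a spreadsheet.
pasted = "name\tvalue\tday\nA\t1.5\t2020-01-01\nB\t2\t2020-01-02"

rows = [line.split("\t") for line in pasted.splitlines()]
header, body = rows[0], rows[1:]
column_types = {
    name: guess_type([row[index] for row in body])
    for index, name in enumerate(header)
}
print(column_types)
```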
Charts
Datawrapper allows users to choose from a number of chart presets based on the data, though certain chart types such as spider charts are not available in the current offering of the tool, especially compared to other non-coding tools such as Plotly, RAWGraphs or Tableau. Users must choose the appropriate chart type for their data when generating the chart since the tool, for some reason, allows one to select chart types that would not actually work for the nature of the data provided (such as a pie chart for data on India's and the USA's democracy indices through the years).
Another minor issue that may crop up is that the tool may confuse the dependent and independent variables when switching between chart types (from Line to Stacked Bar, for example). This can be solved by going back to the 'Check and Describe [Data]' step and switching the rows and columns.
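Switching rows and columns is a plain table transpose. The Python sketch below shows the same operation applied to the data before upload, in case that is more convenient than doing it inside the tool; the table values are placeholders, not real index values.

```python
def transpose(table):
    """Swap rows and columns of a rectangular table (a list of equal-length rows)."""
    return [list(column) for column in zip(*table)]


# Placeholder data: years as rows, countries as columns.
by_year = [
    ["Year", "India", "USA"],
    ["2018", "10", "20"],
    ["2019", "30", "40"],
]

# After transposing, countries are rows and years are columns.
by_country = transpose(by_year)
for row in by_country:
    print(row)
```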
Chart Types
Maps
Maps in Datawrapper are pretty standard, allowing users to choose among Choropleth Maps, Symbol Maps and Locator Maps. What's great about making maps in Datawrapper, however, is that while you can use your own custom maps, there is already a great selection of pre-existing maps, down to something as granular as the electoral constituencies and revenue circles of the state of Assam.
Again, users can upload Excel or CSV sheets, share Google Sheets or simply copy/paste data keyed to the corresponding map IDs (such as states when plotting the literacy rates of states). But users must make sure that the labels in their sheets correspond to the Label IDs provided in the map. For example, when choosing a 2020 map of India post-Article 370, users should include Ladakh as a separate entry.
Tables
The table creation feature in Datawrapper is quite self-explanatory. Datawrapper allows you to generate tables based on your data, but this may not have much value for designers who can achieve a higher level of customizability in a tool such as Illustrator or InDesign which already allow creation of tables. Something as basic as font is not customizable.
But the table creation options may be useful to those looking to generate neater tables than are possible with Excel or Google Sheets and leverage the use of the 'Search in Table' feature and quick customizations such as making tables striped.
Pros of Datawrapper
Cons of Datawrapper
Final Thoughts
Example Visualizations Created with Datawrapper
Total Backlogged Cases in U.S. Immigration Courts Hits Historic Highs
Number of Universities and Colleges Statewise in India
Tracking Covid-19 Hotspots in Iowa
Link to the Presentation
Circos is a popular, highly flexible, open-source software package for the circular visualization of complex datasets, created by Martin Krzywinski. Though it is popular in the field of genomic analysis, Circos enables graphing of any analytical data. Circos is controlled by plain-text configuration files, which makes it highly customizable and easy to automate. Another important aspect of Circos is that whatever you create with it will be aesthetically pleasing. It uses a circular composition to show connections between objects or between positions, which are difficult to organize visually when the underlying layout is linear or a graph.
Since Circos is controlled by plain-text configuration files, it doesn't have an interactive user interface, which makes it difficult to use for those who are not trained in programming or the UNIX command line. I spent a lot of time trying to understand the complex system to get it installed on my machine. Luckily I found Circos Online, which allows less flexibility and fewer customization options but can still be used for visualizing simple tables. The maximum row + column total you can use in the online tool is 150; if that is exceeded, rows and columns are limited to 75.
How to use Circos Online
I used a small dataset to try the tool for the first time. It was the sheet we used to mark the tools we picked and the days we are presenting it.
For this to work in Circos I converted the data to the following format:
This table was saved in tab-separated values (.tsv) format. Once it was uploaded, I clicked on the Visualise button, and after a few seconds this is what I got:
I was able to download the generated visualization in multiple formats (large image or a compressed folder with data, images (PNG/SVG), and configuration).
Then I tried the same thing with a bigger dataset. I took the one containing the state-wise number of custodial deaths between 2001 and 2012. Make sure you don't have any blank cells in your table; otherwise the tool will give you a scary-looking error page. You can use hyphens to fill the empty cells.
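Filling blanks by hand gets tedious on a bigger table, so this step could be scripted. The Python sketch below writes a table as tab-separated text and replaces empty cells with hyphens. The `to_tsv` helper and the sample table are hypothetical, and the exact header layout that Circos Online expects is not covered here.

```python
import csv
import io


def to_tsv(rows, filler="-"):
    """Serialise a table as tab-separated text, replacing blank cells with a filler."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow(
            [filler if cell is None or str(cell).strip() == "" else cell for cell in row]
        )
    return out.getvalue()


# Hypothetical table with two blank cells (an empty string and None).
table = [
    ["tool", "Mon", "Tue"],
    ["ToolA", 1, ""],
    ["ToolB", None, 2],
]

tsv_text = to_tsv(table)
print(tsv_text)
```

The resulting text can be saved with a .tsv extension before uploading.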
Let's see what I got
Introduction
RAW Graphs is an open-source data visualization framework built on D3.js. It is a tool that aims to bridge the gap between spreadsheets and vector graphics editors. The project is led and maintained by the DensityDesign Research Lab (Politecnico di Milano) and was released publicly in 2013.
Types of charts available :
Step 1 : Enter your data from a CSV file or spreadsheet. A guide for stacking data for RAWGraphs is available here.
Step 2 : Choose a chart
Step 3 : Map dimensions, drag and drop to respective fields based on whether they are strings, numbers or dates
Step 4 : Customise your chart and download as png / svg
Advantages:
Limitations :
Not very customisable
Examples Some featured examples created using RAWGraphs.
Learning An extensive guide for how to use each of the available charts here.
A very simple yet powerful tool for cleaning up and structuring your dataset
OpenRefine (previously Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
Messy data is usually data that has been entered manually. This type of data tends to have a lot of typos, spelling inconsistencies, and different formatting for the same values. We find this especially in Indian datasets, where the same name is spelled differently in different locations.
OpenRefine always keeps your data private on your own computer until YOU want to share or collaborate. Your private data never leaves your computer unless you want it to. (It works by running a small server on your computer, and you use your web browser to interact with it.) There is extensive documentation for OpenRefine; what I mostly did, and what I suggest, is to follow the foundation course from the documentation page.
Getting started with OpenRefine
The interface is fairly simple once you load in the data. Sometimes you will have to make some adjustments in OpenRefine itself before starting the project, for example when you have to ignore the first few rows. After loading the data you can see that it resembles a spreadsheet, but here you are working mainly with columns and not cells. The main difference is that OpenRefine helps more with bulk editing.
figure 1.
In figure 1 you can see fig1.(1) that OpenRefine uses filters or facets to sort your data, and fig1.(2) that you can change the view. Once you sort the data you can also see the changes on the fig1.(1) Undo/Redo tab. You can see the history of changes made and can go all the way back to the first step. Whatever changes you make in OpenRefine won't be reflected in the original file.
figure 2.
OpenRefine recognizes different types of data. Sometimes the type is encoded in the dataset; other times the user has to set it manually. Here fig2.(1), under the heading CURRENT_USE, I have used the Text facet to show all the different values under that column, and this list appears in fig2.(2). This is an example of messy data where Apartment Building has been written in many different ways. The feature called Cluster fig2.(1) helps to fix this issue by renaming these inconsistencies, and it can also delete extra spaces in the cells.
figure 3.
An algorithm fig3.(1) is used here to group values that Refine detects as similar, and the user decides whether it makes sense to group them. "fingerprint" is the one with the strictest threshold, and as you select the others the matching becomes more lenient. The user can check the categories they want to merge fig3.(2), name the merged category as they choose, and then select Merge & Re-Cluster to save the changes. By the time the user reaches the third threshold, all the categories in fig3.(3) will have become one.
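To make the idea behind "fingerprint" clustering concrete, here is a simplified Python sketch of a key-collision method of that kind: each value is reduced to a normalised key, and values sharing a key become candidates for merging. It is an approximation for illustration rather than OpenRefine's exact implementation, and the `fingerprint` and `cluster` helpers and sample values are my own.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a value to a normalised key; equal keys suggest the same entity."""
    # Fold accented characters to plain ASCII where possible.
    ascii_text = (
        unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    )
    # Trim, lowercase, and drop punctuation.
    cleaned = ascii_text.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Split into tokens, remove duplicates, sort, and rejoin.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values by fingerprint and keep groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


values = ["Apartment Building", "apartment building ", "Building, Apartment", "Office"]
clusters = cluster(values)
print(clusters)
```

Because the key ignores case, spacing, punctuation and word order, only fairly obvious variants collide, which is consistent with this method being the strictest of the options.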
Refine also has its own expression language for manipulating data programmatically. It is called GREL (Google Refine Expression Language), but I did not get a chance to explore its possibilities much beyond the usual splitting and combining of columns. There are a lot of comparisons between OpenRefine and Excel or Google Sheets, but where Refine is better than Excel is batch editing and working with inconsistent data. This has been just a very basic overview of the tool, and one can learn a lot more by going through the tutorial mentioned earlier for a more comprehensive guide.
Other Resources to check out
Processing and P5 look very similar and the names are often used interchangeably. So, it's important to understand the difference between the two.
Generally, the steps involved in data visualization with P5 include (these steps are based on my observation):
Figure 1 below shows a data visualization of volcanic eruptions over the past thousand years that I created with p5.js. Data source: Kaggle.com
Figure 1
P5 was originally created for creative coding. And when creative coding merges with data visualization, it makes P5 a powerful tool for data art. One example is shown below in Figure 2, Trees of translation, created by Baltazar Pérez. It visualizes human text-writing and translation processes.
Figure 2
Figure 3 (original article here) shows one such example, where an entire movie can serve as the input file. Programs like P5 can then read the colors in each frame and build them into a visualization like this one. There is no need to create text or numeric data files first. Even if the data is real-time (camera frames as input), P5 works well.
Figure 3
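The per-frame color reading described for Figure 3 boils down to averaging the pixels of each frame. The Python sketch below shows that reduction on tiny made-up frames. It is not p5.js code, it skips video decoding entirely, and the frame data and the `average_color` helper are hypothetical.

```python
def average_color(frame):
    """Return the mean (r, g, b) of a frame given as rows of (r, g, b) pixels."""
    pixels = [pixel for row in frame for pixel in row]
    count = len(pixels)
    return tuple(
        sum(pixel[channel] for pixel in pixels) // count for channel in range(3)
    )


# Two hypothetical 2x2 frames standing in for decoded video frames.
frames = [
    [[(255, 0, 0), (255, 0, 0)], [(0, 0, 0), (0, 0, 0)]],
    [[(0, 0, 255), (0, 0, 255)], [(0, 0, 255), (0, 0, 255)]],
]

# One color per frame; drawn side by side these form a color strip of the movie.
palette = [average_color(frame) for frame in frames]
print(palette)
```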
A network analysis tool
Download Gephi
Their tutorials
An interesting introductory blog
Slides for presentation
Introduction:
GEXF
GDF
GML
GraphML
Pajek NET
GraphViz DOT
CSV
UCINET DL
Tulip TPL
Netdraw VNA
Spreadsheet
Applications
Goal
Network
The researchers used Gephi to create this visualization. This is an ingredient recommendation system. The software helps recommend a complementary flavour which goes well with the ingredient you have in mind. The nodes, such as salt, water and lemon juice, are individual ingredients. A line between two ingredients signifies that they go well together.
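A network like this can be prepared for Gephi as a simple edge list. The Python sketch below writes ingredient pairings as a CSV with `Source`, `Target` and `Type` columns, which is the layout Gephi's spreadsheet import recognises for edges. The pairings are invented for illustration and are not taken from the researchers' dataset.

```python
import csv
import io

# Hypothetical pairings; each tuple means "these two ingredients go well together".
pairings = [("salt", "water"), ("salt", "lemon juice"), ("water", "lemon juice")]


def edge_list_csv(edges):
    """Write undirected edges as CSV text with Source, Target and Type columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type"])
    for source, target in edges:
        writer.writerow([source, target, "Undirected"])
    return out.getvalue()


edges_csv = edge_list_csv(pairings)
# The set of nodes is implied by the edges.
nodes = sorted({name for pair in pairings for name in pair})
print(edges_csv)
print(nodes)
```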
To understand more about the networks and different statistical operations provided by Gephi, please follow this blog
Some examples:
Gephi community on Twitter
Pros and Cons
But,
Some YouTube videos that helped me understand the tool better: 1, 2, 3, 4
Developed by Mike Bostock and team in 2011
D3.js is a JavaScript library that helps you bring data to life using HTML, SVG, and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.
Purpose of Use
Why Choose D3
Learning Resources
D3 Graph gallery
From where can we learn
Pros
Cons
The high degree of customization comes at the cost of writing code for each element present in the visualization. Screenshot of a code snippet for creating a simple bar chart in D3, using the D3.live online D3 platform.
When to use
When can one avoid using
Conclusion
Text as Information to be consumed vs Environment to think in
Explorable Explanations introduced 3 frameworks
Tangle JS is a library used for creating reactive documents.
is a Markdown implementation of Tangle JS that gives a quick look at its capabilities.
In this assignment each of you will select one of the following 18 tools, explore it as thoroughly as you can (download, install, try out, and use it to create something), and do a demonstration/overview of the tool for the rest of the class (20 minutes).
The objective is to introduce the tool to the class, and highlight its possibilities & limitations so the audience can make a well informed choice of available tools. We will do 4 tools per day starting from Thursday onwards. Advait will coordinate and assign the tools. This is a credited assignment.
In your documentation, include links to the resources you used (if any) in your presentations, capturing your personal insights about the tool and related resources.
There are more tools here and here. If you want to pick a tool that is not listed above, discuss with me.
For how to document your work, take a look at what the previous batch did with this assignment.