info-design-lab / DE705-Interactive-Data-Visualization

Documentation of the IDC M.Des course Interactive Data Visualization, 3-20 Sep 2019

Redesigning The Hindu Data Point Stories (2020) #4

Closed venkatrajam closed 3 years ago

venkatrajam commented 4 years ago

For this assignment, we'll use data stories from The Hindu Data Point.

Select a story that you like, study it carefully and redesign it. Specifically I want you to focus on understanding the data that powers the story, and how it is visually encoded to tell the intended story. Document your design process, capturing the following:

What is the story the author is trying to tell? What is the data he/she is using to tell the story? Describe its details -- type of data, extent of the data, dimensions of the data, gaps in the data, what data is essential and what is irrelevant. How is it encoded, what are the problems with it, and how did you attempt to improve it? You may choose to expand or curtail the scope of the data used in the story, or add an additional dataset to tell the story better. But do not deviate from the main intent of the original story. In other words, it is a redesign exercise, and hence I do not want you to tell a different, unrelated story.

While you should provide a link to the original story, it might be useful to capture and display inline, appropriate parts of the original visualization, and your own design iterations to produce a coherent documentation.

For reference, take a look at what the previous batch did with this assignment.

Abhi98krishna commented 4 years ago

-----Work in progress-----

Original Article by The Hindu

Facts

  1. Only three Peace laureates were from outside western Europe and North America until 1950.
  2. Gandhi was nominated on 5 occasions, but never awarded.
  3. Most Peace Nobels in the first part of the 20th century (until 1960) were secured by those from Western Europe and the USA.
  4. The first non-American and non-western European to win the prize did so in 1936, 35 years after the prize was instituted.
  5. The first African to win the prize did so in 1960.
  6. The first Asian won the Peace Prize in 1973, 72 years after the first award was given.

Claims

  1. Fact 3 may have led to this omission.
  2. Nominees outside Europe and USA weren’t given awards.

Original Visualization:

Original

Problems with the visualization:

Ideas:

Frame 4 Thoughts:

Prize with Nomination Thoughts:

trotro Thoughts:

Re-designed visualization:

Trial3 Thoughts:

venkatrajam commented 4 years ago

Good effort Abhijit. Annotation: The original viz uses annotations to highlight some salient points (first winner outside western Europe/Americas, first African, first Asian etc.) which you could retain in your viz too. Also it would be useful to add the numbers next to the bars -- while we know the relative quantities, we do not know how much these quantities are. Visual detection: You can add the y-axis scale on the right side too, and add some delineating space between, say, every 5 years or every decade so it is easy to track. Other than line space, you may also use other visual delineations such as lines, background colour etc.

tdeepikatiwari commented 4 years ago

-WIP-

How many GI's Does your state have? Article GI Documentation Data

The article is an educational article with the intent of communicating all GI's in the country. The visualization in the article tries to normalize GI's for each state against the area. It explains the concept of GI - a sign used on products that have a specific geographical origin and possess qualities or a reputation that are due to that origin. These products are split into 5 categories -

  1. Agricultural
  2. Handicrafts
  3. Food related
  4. Manufactured
  5. Natural
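
The normalization against area mentioned above can be sketched as a simple density calculation. This is only an illustration: the function name, the "per 10,000 sq km" unit and the two example states are assumptions, not figures from the article.

```python
def gi_per_area(gi_count, area_sq_km, per_sq_km=10_000):
    """Return the number of GIs per `per_sq_km` square kilometres."""
    if area_sq_km <= 0:
        raise ValueError("area must be positive")
    return gi_count * per_sq_km / area_sq_km


# Hypothetical figures, for illustration only (not real state data).
example_states = {
    "State A": (30, 150_000),  # (number of GIs, area in sq km)
    "State B": (12, 40_000),
}

density = {
    name: gi_per_area(count, area)
    for name, (count, area) in example_states.items()
}
```

In this made-up case State A has more GIs in absolute terms, but State B has the higher density once area is taken into account.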

Original Visualization by The Hindu Screenshot (664)

Issues with existing visualization - Visualization 1

Visualization 2 & 3

Scope for redesign All GI's need to be represented geographically to retain the link of geographic indication. Representing it in charts reduces the geographical significance.

Attempt 1

I started by trying to plot all GI's on a map of India in a symbol chart to see if any natural clusters emerged. I also added color and number of GI's as dimensions to help users identify clusters. However, after collecting the data, the GI's seemed to be spread fairly evenly across the country, with the exception of some states. The context of the types of GI's was also lost.

dataviz-01 - Copy dataviz-01 dataviz-01 - Copy (4) Additionally, the color code on the donut chart implies an order where there is none. The sorting by population is not evident to readers at first glance.

Attempt 2

dv-01

To Do -

  1. Visualize frequency of GI category with reference to population
  2. Visualize predominant categories of GI's
AkhilGuthula commented 4 years ago

Where India's mobile Internet speed ranks globally, which operator offers the fastest download speeds, and more

click here to access article What is the story the author is trying to tell? In this article, the author talks about India's internet speed and where it stands globally. The author uses the following table (Figure 1) to show where India stands among the BRICS nations and the countries with the fastest and slowest internet speeds. In the second part of the article, the author discusses the average download speed across circles by major operators over the last six-month period, shown in the following bar graph (Figure 2) and table (Figure 3). ArticlePics

Class Discussion:

Redesign:

Instead of providing a list of all the countries with their respective internet speeds, which might overload viewers with extra information, a color-coded world map that labels only the important countries could be more efficient, giving the viewer a basic understanding of internet speeds across the world. ag8jx-mobile-internet-speed-across-various-countries (1)

The author has put more emphasis on the 11 countries mentioned in the table (top 3, BRICS nations and last 3). Representing them instead on a bar graph, arranged by rank, would make it easier for the viewer to compare internet speeds between the countries just by looking at the heights of the respective bars.

I thought the percentage of the population subscribed to mobile internet in a specific country could have an impact on the internet speed in that country. The relationship between internet speed and the number of mobile internet subscribers could be an interesting visualization which might give more insights. But I couldn’t find any significant relation between these two attributes. As we measure internet speed in bandwidth, I wanted to make the visualization look like a band. That is the reason why I chose stream graphs to visualize how the number of internet users changed over time. Since it started from 0 and gradually increased, the visualization may not look like the band I expected.

TheHinduData_Article5

The bar graph represents the internet speeds of 4 major operators in India along with their position globally.

TheHinduData_Article6

In the article, the performance of various operators in various states of India was represented in a tabular format. Small multiples could be a better way to visualise how the operators are performing across India. But if the values are populated on the map, it might overload the viewer, which made me consider a Sunburst graph to represent the quantitative data.

TheHinduData_Article8

In the Sunburst graph, I have color coded the 4 different operators, the state gets the same color of the operator who provides the fastest internet. The speed of the internet is also color coded, darkest color being the fastest and the lightest color being the slowest.

TheHinduData_Article7

sugandha-123 commented 4 years ago

Initial study: The article.

The article that I picked up was Hunting in pairs: a look at the best bowling partnerships in Test cricket.

The article was written on the occasion of English pace bowler Stuart Broad becoming the seventh bowler and second Englishman, after pacer James Anderson, to pick up 500 Test wickets.

The article discusses the best performers in 3 areas.

1. Bowler pairs with 500 wickets or more.

Screenshot 2020-09-30 at 9 40 24 AM

According to the article:

2. Best combined strike rate (for minimum 200 wickets).

Screenshot 2020-09-30 at 9 42 13 AM Screenshot 2020-09-30 at 9 43 14 AM

3. Bowling pairs with most wickets / game (for minimum 200 wickets).

Screenshot 2020-09-30 at 9 44 42 AM Screenshot 2020-09-30 at 9 45 59 AM
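
The two calculated measures used in categories 2 and 3 can be sketched as below. A bowling strike rate is the number of balls bowled per wicket taken; how the article combines it for a pair is not stated here, so pooling both bowlers' balls and wickets is an assumption, as are the function names and the sample figures in the usage note.

```python
def combined_strike_rate(balls_a, wickets_a, balls_b, wickets_b):
    """Balls bowled per wicket for a pair, pooling both bowlers (assumed method)."""
    total_wickets = wickets_a + wickets_b
    if total_wickets == 0:
        raise ValueError("the pair has taken no wickets")
    return (balls_a + balls_b) / total_wickets


def wickets_per_game(wickets_a, wickets_b, matches_together):
    """Average wickets the pair took per match they played together."""
    if matches_together <= 0:
        raise ValueError("matches_together must be positive")
    return (wickets_a + wickets_b) / matches_together
```

For example, with made-up figures, `combined_strike_rate(6000, 100, 5000, 100)` gives 55 balls per wicket, and `wickets_per_game(100, 100, 25)` gives 8 wickets per game. A lower strike rate is better, while a higher wickets-per-game value is better, which matters when both are shown on one chart.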

Solution: An alternative lens.

From the data that I had, and the calculated fields, it was possible to extract the following columns / attributes.

Iteration 1.

In my first iteration, I decided to pick the bowling partnerships that were mentioned in the article. For me, they represented the "Best performers" in various categories. I also included the best performing partnership of Muttiah Muralitharan, because he is the highest wicket-taker in Test matches. I decided to keep them all in one chart and see where they lie with their stats.

I started with placing their information in a table.

Screenshot 2020-09-30 at 10 05 58 AM

Next, I highlighted the boxes that displayed the main reason for the pair being in the table itself. (The reasons / categories of their best performance.)

Screenshot 2020-09-30 at 10 05 46 AM

Iteration 2.

In my second iteration, I took the players mentioned above and tried visualising them on the basis of the wickets taken and their combined strike rate.

Screenshot 2020-09-30 at 10 09 38 AM

Iteration 3.

In the third iteration, I added the player country to the labels, and also included the wickets taken per game as another attribute.

Screenshot 2020-09-30 at 10 10 01 AM

Iteration 4.

From the previous iterations, I understood it is certainly difficult to represent the data for sports persons, in this case, cricket. The graphic that I created does display the information, but does it essentially allow the reader to understand it clearly?

From the feedback given to me, I realised that instead of showing the information about the pairs highlighted in the article, I could just take the data available for the pairs that excelled in one particular way. For example, the pairs that took 500 wickets or more. Also, instead of visualising the information via a graphic, I could highlight a few things in the table itself (as I had attempted to do in iteration 1).

The data I had for the pairs with 500 wickets and more:

Screenshot 2020-09-30 at 10 34 29 AM

The table after selecting important attributes and rounding off values:

Screenshot 2020-09-30 at 11 04 05 AM

I highlighted the highest and lowest values from the table.

Screenshot 2020-09-30 at 11 30 32 AM

Iteration 5.

After a few more adjustments, I looked for insights from the table and came up with this infographic.

Untitled-2

rajsreekanth commented 4 years ago

-WIP- Five states including Tamil Nadu recorded over 100 custodial deaths but zero police convictions between 2001-18 Link to the article

The Story The article was written shortly after the death of a father-son duo from Tamil Nadu, allegedly due to custodial violence. The incident sparked anger across the country, with even celebrities and politicians demanding justice.

The article takes a look at the data between 2001 and 2018, on the custodial deaths and the number of policemen convicted in those cases. While calls for a fair probe are growing, differences in these numbers are alarming. Most of these deaths were attributed to reasons other than custodial torture, such as suicide and death in hospitals during treatment. The article puts its focus on five states including Tamil Nadu, where the father and son died while in custody. The data shows that there were no police convictions between 2001 and 2018 in these states.

The following are the visualizations that came with the article.

Screenshot 2020-09-30 at 10 16 15 AM Screenshot 2020-09-30 at 10 16 49 AM

The idea of creating an impact by showing the alarming difference between the numbers was somehow lost in these visuals. The harmless circular form could have been replaced with a sharp angular graph.

Data Data from 2001 to 2018, from all the states in India, includes the following:

Approach I was more interested in showing the difference between the number of custodial deaths, cases registered, policemen charge-sheeted and policemen convicted, without showing the actual numbers on the visualization. The numbers can be highlighted in the article, and the reader would be able to make the connection.

Rough sketches

Screenshot 2020-09-22 at 4 46 47 PM

The important feedback I received was that having different colors makes it look like a stacked graph, whereas using the same color with some transparency would make the areas look like they are overlapping.

Another idea was to show the state-wise number of custodial deaths and the number of policemen convicted over the years, on a scatter plot. The size of the circle represents the numbers.

Screenshot 2020-09-22 at 4 32 28 PM

I was only able to find the data between 2001 and 2012 for this and decided to focus on the first concept.

Refined Concepts

C1

I made an attempt to create a visualization based on the second concept, with the available data. opt_05 It turns out that this concept works better when it is interactive rather than static, since there are so many overlapping layers. It conveys the impact of the story; however, it doesn't convey the data clearly.

nishitanirmal commented 4 years ago

What Percentage of people prefer to speak Hindi across States?

Original article

The Narrative On May 31, 2019, the center released a draft of the National Education Policy which included a controversial clause. The clause mandated the teaching of Hindi in schools across non-Hindi speaking states. The draft drew sharp criticism from different political circles in many non-Hindi speaking states, especially Tamilnadu. Soon after, the government issued a modified draft which left out the controversial clause. In this article, the author uses visualizations to explore the usage of the Hindi language through all the states in India.

The Data used The author has used 2011 language census data pertaining to:

  1. The number of Hindi speakers in each state
  2. The number of native Hindi speakers against all Hindi speakers* in each state
  3. The number of people in every state that chose Hindi and English among their top 3 languages respectively
  4. State Population Data

*all Hindi speakers - the assumption in the census is that 'All Hindi speakers' can be calculated by summing up the people who use Hindi as their 1st, 2nd and 3rd language. 4th language and beyond are not included.
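
The definition of 'All Hindi speakers' above can be sketched as a small calculation. The function name, the returned keys and the sample numbers in the usage note are hypothetical; only the rule of summing 1st, 2nd and 3rd language speakers comes from the description above.

```python
def hindi_shares(first_lang, second_lang, third_lang, state_population):
    """Percentages of native and all Hindi speakers in a state.

    'All' is the sum of people reporting Hindi as their 1st, 2nd or 3rd
    language; 'native' is taken here to mean 1st-language speakers.
    """
    if state_population <= 0:
        raise ValueError("state_population must be positive")
    all_speakers = first_lang + second_lang + third_lang
    return {
        "native_pct": 100 * first_lang / state_population,
        "all_pct": 100 * all_speakers / state_population,
    }
```

With made-up counts, `hindi_shares(2_000_000, 1_500_000, 500_000, 10_000_000)` returns a native share of 20% and an all-speakers share of 40%. Recomputing a state's percentage this way from the raw census counts is one way to cross-check a published figure.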

The data for the first three points can be extracted from here The dataset that I prepared can be found here

Comments on the data used Initially, it seemed that the narrative was about how non-Hindi speaking states felt outraged at being forced to teach Hindi. For those reasons, the data I wanted to look at was each state's highest performing language against the performance of Hindi in that state. But later, I realized that the narrative was about analyzing Hindi as a common nationwide language, and considering other alternatives for the same. In that case, the data used in the article seemed fitting.

Comments on the Dataset

  1. The authors seem to have miscalculated Gujarat's total Hindi speakers. They have mentioned it as '15.02%', but after calculating it many times, I believe that it is 43.62%. All the other numbers used are right.
  2. Census data contains a lot of ambiguous titles that make it difficult to analyze. For example, the title 'Distribution of the 99-non-scheduled languages - India/States/Union territories' is confusing because it could be referring to either the 'total' speakers or the 'native' speakers.

Visualizations

1. Map titled 'Statewise Split' The 2011 census found that 43% of India’s population speaks Hindi. It is the highest spoken language in India. This map shows that the number, though large, is concentrated in a few of the central states.

The map below is a Choropleth that shows the percentage of the population in each state that speaks Hindi. Encoding: A gradient using 10 bins of color to represent the population. (Each subsequent bin stands for a 10% increase)

chart 1

Encoding problem 1: Initially, I thought that 10 bins were unnecessary because people would not care for that amount of detail in the data. I thought it better to just use 'low, medium, and high' as the bins of 33% each as shown below.

Untitled-Artwork-1

However, there were 2 issues with reducing the bins:

  1. In a country as diverse as India, even 15 - 20% of the statewise population speaking Hindi could not be labeled as 'low'.
  2. States like Meghalaya at 13.9% were clubbed with states like Tamilnadu at 2.11%. This clubbing both misses out on Tamilnadu's especially low percentage and the fact that Meghalaya does have a significant amount of Hindi speakers.

I tried 4 and 5 bins respectively, but they each led to similarly misleading clubbings. So I decided that 10 bins were best.
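
The clubbing problem can be checked with a minimal sketch of equal-width binning over 0 to 100%. The function name is hypothetical, and equal-width bins are an assumption about how the bins were drawn; the two percentages are the ones quoted above.

```python
def bin_index(pct, n_bins):
    """Return the 0-based equal-width bin for a percentage between 0 and 100."""
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    width = 100 / n_bins
    # Clamp so that exactly 100% falls in the last bin.
    return min(int(pct // width), n_bins - 1)


MEGHALAYA = 13.9
TAMIL_NADU = 2.11

# For each bin count, do the two states land in the same bin?
same_bin = {
    n: bin_index(MEGHALAYA, n) == bin_index(TAMIL_NADU, n)
    for n in (3, 4, 5, 10)
}
```

Under these assumptions the two states share the lowest bin for 3, 4 and 5 bins, and are only separated once there are 10 bins.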

Encoding problem 2: The Hindu Choropleth map mainly shows the 'area' of the state. However, in this visualization, it wasn't the area that mattered but the population of each state. As a tangent, I also tried to make a map that disregards both area and population and gives all states an 'equal' status. Trial 1 shown below:

Untitled-Artwork-2

However, this did not really look like the shape of India, and the class feedback was that it did not make sense to even show this information geographically. So I proceeded with making a tiled Dorling Cartogram, using hexagons. Each hexagon = 1 million people. The point was to show the state population by number of hexagons, and the percentage of Hindi speakers in each state using 10 bins. (The Gujarat miscalculation has been corrected)
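
The "1 hexagon = 1 million people" rule can be sketched as below. The rounding rule and the minimum of one hexagon per state are assumptions made for illustration, and this is not a description of how Tilegrams itself allocates tiles.

```python
PEOPLE_PER_HEX = 1_000_000


def hexagon_count(population):
    """Number of hexagons for a state, rounding to the nearest million.

    Every state gets at least one hexagon so that small states stay
    visible (an assumption, not a rule stated in the write-up).
    """
    if population < 0:
        raise ValueError("population cannot be negative")
    rounded = (population + PEOPLE_PER_HEX // 2) // PEOPLE_PER_HEX
    return max(1, rounded)
```

For instance, a hypothetical state of 60.4 million people would get 60 hexagons, while one with 300,000 people would still get a single hexagon.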

Frame 1 cropped (I'm aware that the labels are not well placed or legible, I am working on fixing that)

2. Scatter Plot titled 'Native vs Non-native speakers' The chart below plots the percentage of native Hindi speakers against All Hindi speakers. Since 'All Hindi speakers' includes native speakers, the title used is misleading. Note: The labeling of both axes is wrongly switched.

Encoding: The position of each circle represents the percentage of 'native' and 'all' Hindi speakers respectively.

Chart 2

Encoding Problem 1 The visualization should highlight the percentage of Native speakers in each state, but it should also allow a comparison between states. But due to the 'position' encoding, the focus is more on the clustering of the states. Because the state names are not shown upfront, the comparison is difficult.

Encoding Problem 2 Because of the labeling of both Axes, i.e., both deal with 'Hindi speakers' with '1st choice' and '3rd choice', and there is no visual logic to remember which is which, the viewer takes on a large cognitive load to remember how to read the chart. Eg: A. Downward = lower total speakers B. Leftward = lower native speakers C. Up and left = Higher total speakers and among them lower native speakers D. Down and right = Lower total speakers but high native speakers among them This encoding, though passable, puts a lot of cognitive load on the user and volunteers very little information.

My attempt to improve it: Frame 2 cropped

  1. All state names are shown upfront, ranked from highest to lowest total Hindi speakers.
  2. The native vs subsidiary and the total speakers can be compared within the state as well as between states.
  3. There is an 'overview' of countrywide native and total speakers that emerges. It shows a general trend (with some exceptions) - the lower the total Hindi speakers, the lower the percentage of native speakers among them.

3. Scatter plot titled 'An alternative means of communication' The chart plots the percentage of total Hindi speakers vs total English speakers in each state. (Point to Remember: Total speakers = speakers with the language in their 'top 3 choices') Note: The labeling of both axes has been wrongly switched.

Encoding is the same as the previous scatter plot.

Chart 3

Problem 1 There is a difference in the range and granularity of the scales of the X-axis and Y-axis. This is fine, however, there is no visual difference/markers between them and at first glance, this difference is not noted. Since 'position' is the main encoding here, the viewer will subconsciously forget to take into account that the Y-axis only goes up to 45, and read both distances equally.

Problem 2 The point of the data is to show both, the difference in the ratio of Hindi to English within a state and also between states. The visualization only shows the distance between states, and due to the somewhat even distribution of the states, not many insights can be gained.

My attempt at improving this: Frame 3 cropped

  1. Both scales are equal, so the English to Hindi ratio within each state is more clear.
  2. Arranging the states by 'most to least' English or Hindi speakers provided no insights, so I arranged the states alphabetically so they can be found more easily. (Viewers are encouraged to read the chart starting at the bottom so that their eye can trace the scale better)
  3. Highest bars for both languages are marked and the percentage is revealed. The 'lowest' bars in each language are subtly highlighted using borders.
  4. There are transparent red lines leading from the state name to its corresponding bar, to guide the viewer's eye to the correct bar.

Tools used: Tableau, Figma and Tilegrams - a good open-source tool for tiled maps. For details on how to make a tilegrams map compatible with Tableau, read this.

FINAL OUTPUT

For the redesign, I used the same title and mostly the same text from the Hindu Article. I may have edited the text slightly. Please click and zoom to see the text clearly.

Frame 1

divoojilly commented 4 years ago

Where does India really stand amongst the BRICS countries in terms of Gender Gap?

The article that I chose was: Where does India stand in the Global Gender Gap Index?

In the graphs that they used to represent India in the context of the world, I saw some major flaws. Existing SS

  1. There is no standard metric to measure these countries.
  2. The same countries aren't measured throughout (inconsistent).
  3. Biasing the data by trying to show India in poor light (if it is really bad anyway, the data would show it anyway, right?)

My first few ideas: Idea 1

  1. Using all BRICS Countries + Iceland and Yemen as the reference points (as they have the highest and the lowest scores respectively).
  2. Also showing the population of women in each of these countries to 'show' how many women are still left behind.

Implementation: World Rank Amongst BRICS (Attempt 1) Alright... This didn't go as well as I thought!

  1. The countries became too small and overlapped a lot.
  2. Why did I try to use the population as a metric? Was it because of my own bias?

Next iteration: Iteration 2 Created a dot chart and showed how the country's score has changed since 2018.

Choropeth Also created a choropleth to show differences in countries visually.

  1. The before and after is clearer here.
  2. All the data isn't on one single line so it does not look crowded (no matter how close the countries are).
  3. The outcome is not biased.

Representation: Representation

For the final outcome, I would be choosing the following layout: WhatsApp Image 2020-10-01 at 03 34 50 (3)

I've created a wordplay on "Dilli Door Hai", a common phrase used to signify that there is still a long way to go. Apparently, one of the Mughal emperors used it while traveling to Delhi. Here it signifies that India still has a long way to go. I have also kept a mostly 'generic' feminine colour palette.

Final Outcome: What all have I included?

  1. The highest and lowest score (Iceland and Yemen) and scores of all the BRICS countries.
  2. The four parameters to calculate the score + the final score.
  3. A short description of what this graph is about.
  4. A common start and endpoint (which was absent in the Hindu Data Point article).
Artboard 1 copy 2@2x

Please zoom in to read through the details :-)

richavagrawal commented 3 years ago

Gender disparity in early education

The Hindu article chosen is available here.

What is the story the author is trying to tell? Gender disparity in early education

  1. Starts with a claim -

Students in private schools performed better in various tasks than those enrolled in government schools and anganwadis, according to the Annual Status of Education Report (Rural) 2019 - This should be backed by the statement in the ASER referring to the method of instruction and not the data of the final results (as by that method of reasoning one could also claim the results are biased because of the gender ratio and the differing capabilities of the genders)

  1. Inconsistencies that make the argument hard to comprehend -

Untitled-Artwork (8)

The grouping of age groups and usage of disorganized bar charts to represent part of a whole.

Percentage of girls and boys in government and private schools

Initial Ideation to effectively combine these:

unnamed (15)

unnamed (16)

Possibility of using a Spider chart to visualise better -

Explored Spider charts using different scales and parameters -

Process

Decided to go ahead with a scale from 70 to 100 percentage for the completion of a particular activity. Chose six activities, two from each cognitive, language and numerical skills
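
The effect of the 70 to 100 scale can be sketched as a mapping from a completion percentage to a position along a spoke of the spider chart. The function name and the sample values in the usage note are hypothetical.

```python
def radial_fraction(value, lo=70.0, hi=100.0):
    """Map a percentage to a 0..1 position along a spider-chart spoke."""
    if not lo <= value <= hi:
        raise ValueError("value lies outside the chosen scale")
    return (value - lo) / (hi - lo)
```

On this scale, made-up values of 85% and 80% sit at 0.5 and roughly 0.33 of the spoke, whereas on a 0 to 100 scale they would sit at 0.85 and 0.80. The truncated scale makes small gaps easier to see, but it also visually magnifies them, so the axis range is worth labelling clearly.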

  1. Using the mother’s literacy as the only reason for whether or not girls are admitted to private schools seemed unfair. However, I decided to include it anyway as it was a part of the narrative. The issues with the existing visualization were the flipped axes of the dependent and independent variables and the inability to compare each parameter of a category due to the use of a stacked bar graph.

Mother's education graph

Final Visualization

Spider chart for the difference in performance, i.e., boys performing better than girls; pie chart for the reason for that difference, i.e., the differing percentages of private and govt school education for girls and boys; and line graph for the reason for differing admission to private and govt schools, i.e., the education of the mother.

Iteration 1_Iteration 1

Comments -

Could create 3 spider charts to show in greater detail the performance in all 3 subjects of cognitive, language and numerical skills. Data is available in ASER

jon-swn commented 3 years ago

How have climbers fared in the test of Mount Everest?

The original article can be found here

What story is the article trying to tell? The story talks about the successes and failures of those who have attempted to summit Mount Everest since 1953 (figure 1). The problem with this visualization is that it uses a double y-axis, which makes the graph hard to read. You can see the dip in the number of attempts in 2014-2015 because of the avalanche, which incidentally was also the period with the highest number of deaths. Data Viz_Existing Chart figure 1 (Source: Hindu)

The next part also shows a visualization of the main causes of death using a tree diagram (figure 2), but it does not give any valuable information. Death Cause figure 2 (Source: Hindu)

The article also shows the data in terms of countries(figure 3) and also between genders (figure 4).

In figure 3 (Source: Hindu) the rate of success is plotted along with the number of failures and successes. Nepal is obviously leading because of its proximity to Everest. Russia, with very few attempts, has a very high success rate. This chart was one of the better ones out of all of them. fig 3-4_Final Viz 2_Final Viz 2

--REDESIGN--

First Attempt: Data Viz 1 Attempt 1_Line 1970-2020 I plotted the attempts of male and female climbers in one graph to show the difference in the numbers. Feedback: The feedback was that using different encodings for male and female climbers, and different colors for the people who summited, added extra cognitive load on the reader.

2nd Attempt Data Viz_Viz1 Redesign

I made another visualization of the causes of death on the mountain. The y-axis is the height at which death occurred, the x-axis is time, and the size of the shapes shows whether they summited or not. Similar articles have mentioned that people have been more successful through the years but the death rate has not changed much, hovering under 1%. We have better equipment now to climb the mountain and also to predict the weather. So why is the death rate not decreasing even more? As I analyzed the causes of death I decided to bin a few categories and divided them into Internal and External: "Internal" being caused by sickness and illness, and "External" being caused by harsh weather conditions, falls, and avalanches. There were some categories like unknown or disappearance, for which I used pink and black respectively. The visualization shows that more recently people have been dying due to internal factors rather than external factors. This could be due to the fact that climbing Mt. Everest has become more of a tourist attraction where anyone can pay to climb the mountain without much proper training. The big circles show the people who died in an avalanche in 2014. Viz 2 Data Viz 1 Attempt 1_Death_Death
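
The Internal/External binning described above can be sketched as a lookup table. The raw cause labels are hypothetical, since the dataset's actual wording is not shown here; only the grouping itself follows the description.

```python
from collections import Counter

# Hypothetical raw labels mapped to the bins described above.
CAUSE_TO_BIN = {
    "illness": "Internal",
    "sickness": "Internal",
    "weather": "External",
    "fall": "External",
    "avalanche": "External",
    "disappearance": "Disappearance",
}


def bin_cause(cause):
    """Return the bin for a raw cause label; unrecognised labels are 'Unknown'."""
    return CAUSE_TO_BIN.get(cause.strip().lower(), "Unknown")


def count_bins(causes):
    """Count how many deaths fall into each bin."""
    return Counter(bin_cause(cause) for cause in causes)
```

Keeping the mapping in one table makes it easy to show the same binning in the chart legend, and to regroup categories later without touching the rest of the analysis.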

Feedback: I had to do the binning within the legend as well. There was some difficulty in recognizing whether shapes overlapped or not. Maybe use two different charts for summit and non-summit.

Visualization II 2nd Attempt Data Viz_ SIde by side_Side by side

Single Visualization Data Viz_Final Viz 2

I tried using different charts for those who summited and those who did not, but the focus of the story changed. I wanted to highlight the type of death on the mountain. I used 4 bins, 2 of which are similar: falls (darker blue) could be caused by faulty technique and equipment, whereas avalanches (lighter blue) were unpredictable. I combined disappearance with unknown and other as black.

Final Design Data Viz Final -08

Tools Used: Tableau & Adobe Illustrator Export your Tableau worksheet as a PDF and then you can edit it in Illustrator as an SVG or EPS.

Noopurkumarikashyap commented 3 years ago

Where does your state stand on the India Innovation Index?

Original Article by Varun Krishnan, here.

Introduction to the topic:

The India Innovation Index examines the innovation capabilities and performance of the Indian States and UT's. It is measured as an average of Enablers (innovation inputs) and Performance (innovation outputs).
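
As a minimal sketch of the definition above, the index score is the simple average of the two pillar scores. The function name and the sample values in the usage note are hypothetical, and any weighting inside each pillar is not covered.

```python
def innovation_score(enablers, performance):
    """Average of the Enablers (input) and Performance (output) scores."""
    return (enablers + performance) / 2
```

For example, made-up pillar scores of 30 and 20 give an index score of 25.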

About the Article:

This article answers two questions:

  1. What is the rank of a particular state?
  2. Why that specific rank?

Story of the article:

The writer first presents the bigger picture. He talks about the Global Innovation Index and India's comparison with other emerging nations. Then a more focused comparison is made among the states of India based on enablers (innovation inputs) and performance (innovation outputs), providing grounds for the state ranks in the India Innovation Index.

Observed structure of the content:

  1. Global Innovation Index 2019 + 52nd global rank of India.
  2. Comparison of BRICS nations (smaller picture).
  3. Relative comparison across states.
  4. Input-output gap( reasons for a particular rank).

As per the article, points 1 and 2 above form the secondary information, and points 3 and 4 together form the primary information.

Observed objectives of the Data visualisation in the article:

  1. Summarise Global Innovation Index of India
  2. Summarise Indian Innovation rank
  3. Explanation of state rank with Input-Output gap
  4. Tells a story (GII -> III -> higher performance of southern states -> I/O gap)

Classification based on Intent:

  1. Narrative( it tells a story)
  2. Explorative( it allows readers to explore and compare data)

Data:

Source for Global Innovation Index is here and the source for Indian Innovation Index is here. The required data were extracted and cleaned for the purpose of use in this project.

Types of data-

  1. States and countries are nominal.
  2. State and country rank is ordinal.
  3. Global and Indian Innovation Index scores are ratio.

Identified Problems:

The data visualizations of the article are shown in Figures 1, 2, and 3. The identified problems are listed below.

  1. The global picture does not have a comparison with the highest and lowest rank holding countries. This comparison can provide a better global picture as per the content of the article.
  2. None of the three visualizations provide an easy comparison of state ranks.
  3. Saturation in Figure 2 does not clearly define the state rank or score.
  4. There is no relevant relation between the geographical location of a state and its Indian Innovation rank as per the content of the article. State rank is hard to compare without numerical data.
  5. The three categories of the states and UTs are not required as per the content of the article.
  6. The Input-Output gap is not clearly visible in Figure 3. It tells more about the ratio of two quantities and less about the difference between them.
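The distinction in the last point can be sketched with two hypothetical states (the scores are placeholders, not values from the index): both have the same output-to-input ratio, yet very different gaps, so a chart that emphasises the ratio hides the difference.

```python
# Hypothetical (enablers, performance) scores; not taken from the index
states = {
    "State A": (40.0, 20.0),
    "State B": (10.0, 5.0),
}


def io_gap(enablers: float, performance: float) -> float:
    """Input-output gap as the difference between the two scores."""
    return enablers - performance


def io_ratio(enablers: float, performance: float) -> float:
    """Output relative to input, the quantity a ratio-style reading emphasises."""
    return performance / enablers


for name, (enablers, performance) in states.items():
    print(name,
          "gap:", io_gap(enablers, performance),
          "ratio:", io_ratio(enablers, performance))
```

Encoding the gap directly (for example as the length of a line between the two scores) would make the difference between the two states visible.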

iii1 Figure 1

iii2 Figure 2

iii3 Figure 3

Redesign:

Ideations:

data viz

Final Visualisations

DV_assignment2222c

rishi4git commented 3 years ago

70 Years of Pending Cases in INDIA

Original article: 77 cases filed in the 1950s still pending in courts across India. Link

Story of article: The Hindu article begins by mentioning a conviction in a case filed 35 years ago. It then presents, in a table, the pending cases filed since the 1950s, grouped by decade.

The article also highlights how significantly pending cases have increased since 2010. "Out of the nearly 3 crore cases pending, 2.6 crore were filed after 2010," the article says.

Lastly, the article points out that pending cases in Uttar Pradesh are significantly higher than in other states. "Nearly one in every four pending cases across the country are from Uttar Pradesh (73.1 lakh)," the article says.
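The two quoted figures can be turned into shares with a few lines of arithmetic. This is only a sketch: it treats "nearly 3 crore" as exactly 3 crore, so the results are approximate.

```python
CRORE = 10_000_000  # 1 crore = 10 million
LAKH = 100_000      # 1 lakh = 100 thousand

total_pending = 3 * CRORE        # "nearly 3 crore", rounded (assumption)
filed_after_2010 = 2.6 * CRORE   # figure quoted in the article
uttar_pradesh = 73.1 * LAKH      # figure quoted in the article

share_after_2010 = filed_after_2010 / total_pending
share_up = uttar_pradesh / total_pending

print(f"Filed after 2010: {share_after_2010:.0%}")
print(f"Uttar Pradesh: {share_up:.0%}")
```

Under that rounding, cases filed after 2010 come to roughly 87% of the total and Uttar Pradesh to roughly 24%, consistent with the article's "nearly one in every four".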

*The only data available in visualized form was the following table: Screenshot 2020-10-04 at 11 32 1

The focus of my visualization was to:

  1. Depict the journey of pending cases.
  2. Highlight the differences in pending cases across states to bring out the outliers.
  3. Show how significantly the cases have increased since 2010.

10 states with the largest number of pending cases

Screenshot 2020-10-04 at 11 34 20 PM

Traveling back to the pending cases - View link for Interactive prototype https://public.flourish.studio/visualisation/3912141/

Highlighting the oldest pending cases till date

Screenshot 2020-10-04 at 11 24 44 PM

The size of the bubble here represents the total number of pending cases in that particular state till date.

Increase in pending cases since 2010 Sheet 1

Tools used: Flourish and Tableau Desktop. Data source: National Judicial Data Grid

raaghavlaxman commented 3 years ago

What is the share of death sentences among sexual offence cases?

Link to the article here.

The data is presented in the Hindu article in tabular form, as percentages, with colour (saturation) used to indicate the value of the percentages. hindu datapoints1

Initial redesign idea, as a stacked bar graph

death sentence stats-04

dikshasingh13 commented 3 years ago

Domestic violence complaints at a 10-year high during COVID-19 lockdown

Link to the original article

The narrative: The authors of this article discuss the statistics on domestic violence in India. They report that during the first phases of the COVID-19 lockdown, Indian women filed more domestic violence complaints than were recorded in a similar period in the last 10 years. They focus on the alarming rise in the number of complaints and the state-wise numbers. They also point out that even this spike might just be the tip of the iceberg, since 86% of the women who experience domestic violence in India do not seek help. Finally, they bring out a haunting statistic: even among the women who sought help, only 7% reached the relevant authorities, while the majority talked to their families.

Since the data on domestic violence during the lockdown is not available, my focus was on the last two sections:

Buried in silence: About 86% of women who experienced violence never sought help, and 77% of the victims did not even mention the incident(s) to anyone. The table shows that women who were subjected to both physical and sexual violence seek help relatively more than those who suffer from only one form of abuse. Capture

Under-reporting: Among the 14.3% of victims who sought help, only 7% reached out to relevant authorities: the police, doctors, lawyers, or social service organizations. But more than 90% of the victims sought help only from their immediate family. Capture1
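Because the second percentage applies only to the women who sought help, the two figures compound. A minimal sketch of that arithmetic, assuming the 7% is a share of the 14.3% exactly as the article states it:

```python
sought_help = 0.143          # share of victims who sought help (from the article)
reached_authorities = 0.07   # share of help-seekers who reached authorities

# Share of ALL victims who reached the relevant authorities
share_of_all_victims = sought_help * reached_authorities

print(f"{share_of_all_victims:.1%} of all victims reached relevant authorities")
```

In other words, roughly one victim in a hundred reached the police, doctors, lawyers, or social service organizations, which is the perspective a combined visualization can bring out.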

Problems:

Interventions: I have combined the two sections of the story mentioned above since they are attributes of the same dataset. Moreover, combining the two puts the numbers into perspective and gives readers a clearer sense of their impact.

Ideation: unnamed

The Final Visualization:

Assignment 2 (1)

raaghavishan commented 3 years ago

----WIP----

Which districts have more number of C-Section deliveries?

Article

This story analyzes which districts have exceeded the limit on C-Section deliveries determined by the WHO. Though C-Section deliveries reduce delivery-related mortality, the WHO insists that C-Section deliveries in a particular region should not exceed 15%. But the analysis in 2016 shows that the southern states of India have exceeded the limit by a large margin. In the central, northern, and northeastern regions, the WHO limit is not exceeded by much, but the percentage of C-Sections in private hospitals is higher than in public ones. Capture Sheet 1

arinjitdas commented 3 years ago

How Has the State of Democracy in India Changed Since 2008?

Original Story: Data | How has the state of democracy in India changed since 2008?
Background: The Economist Intelligence Unit has been publishing an annual Democracy Index (with the exception of 2009) in which it assigns 167 nations of the world a Democracy Index score that measures the state of their democracy. The index is an aggregation of 5 parameters: Electoral Process and Pluralism, Functioning of Government, Political Culture, Political Participation and Civil Liberties. The Hindu Data Point article above uses this data as the source to weave a narrative about the state of India's democracy since 2008.
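To make the aggregation concrete, here is a minimal sketch that assumes the overall score is a simple mean of the five parameter scores on a 0-10 scale. The averaging rule is an assumption for illustration, and the values are placeholders, not actual EIU figures.

```python
from statistics import mean

# Placeholder parameter scores (0-10 scale); NOT actual EIU figures
parameter_scores = {
    "electoral_process": 8.0,
    "functioning_of_government": 7.0,
    "political_culture": 5.0,
    "political_participation": 7.0,
    "civil_liberties": 6.0,
}

# Assumed aggregation: unweighted mean of the five parameters
overall_score = mean(list(parameter_scores.values()))
print(round(overall_score, 2))
```

Under this assumption, a fall in a single parameter such as Civil Liberties moves the overall score by only a fifth of that fall, which is one reason to show the parameters separately rather than only the overall score.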


The Narrative of the Data Story

The article attempts to illustrate the change in India's democracy since 2008, particularly highlighting India's decline on the index, especially in certain parameters such as Civil Liberties. The data story is told exclusively with the help of three tables and accompanying text, using color hue and saturation in the tables to encode improvements or declines and the number of countries doing better or worse than India. For example, the fewer the countries doing worse than India, the deeper the saturation of red, implying that this does not bode well for us.

Table 1: India's Scores In 2019

Screen Shot 2020-09-22 at 4 13 30 PM


Table 2: Change in Scores since 2014

Screen Shot 2020-09-22 at 4 22 48 PM


Table 3: Comparison of Changes in 2008-2014 and 2014-2019

Screen Shot 2020-09-22 at 4 22 48 PM

Comments on Current Data Visualization and Story:

  1. Table 1 shows our scores for 2019 and how many countries have done better or worse. That may give us an idea of how India has fared against other countries, but it isn't instantly comprehensible since these are ordinal ranks. A table does not do the best job of easily communicating this information. Moreover, since data for more countries is not provided, we are not really sure how good or bad India's score of 6.90/10 is.
  2. Tables 2 and 3 could be merged into one graphic if the number of countries that have done better or worse than India is omitted. Instead, changes can be shown compared to a country similar in its political dynamics in the same time period.
  3. The article considers two eras, 2008 to 2014 and then 2014 to 2019. Two different governments were in power in these eras, under vastly different conditions. The data story in no way tries to highlight or indicate this.
  4. There is no indication of whether we are considered a flawed democracy or a full democracy (which is highlighted by the tables in the EIU's Democracy Index).

Comments on the Dataset

  1. It would have been nice to see where exactly India lies on the spectrum of all the countries ranked from best to worst.
  2. It would also help to know how other countries that Indians are familiar with (Bangladesh, Pakistan, etc.) fare on the Index and Rankings. This would give us an idea of how bad is 'bad' and how good is 'good.'
  3. We currently do not see the trends in the intervening years. Table 3 gives the impression of direct jumps from 2008 to 2014 and direct declines from 2014 to 2019, though there may have been rises and falls in between.

Initial Sketches

My initial ideas addressed the lack of representation of the ordinal data of the countries' ranks and the absence of anchor countries in the spectrum that would give readers an idea of what the Democracy Index scores mean compared to good democracies and authoritarian regimes. The second visualization, for example, would allow people to compare India's progress over the years against the US, giving them an idea of how well or poorly we have performed against a democracy similar to ours.

IMG_0521

IMG_0519

IMG_0520

I did not go with the third visualization of rank changes since accommodating such a large visualization of changes in ranks of so many different countries would draw too much focus towards itself and take away from the primary focus on India's performance through the years.

Instead, I chose to focus on the first two ideas and also include India's trends along different parameters over the years (and not just a simple trend line visualization of their overall Index score.)

Data Compilation

I first compiled the ranks of India and countries such as Norway, Germany, China, etc. in 2019. I then scoured the available indices through the years for the scores of India and the US across the different parameters for the second part of the visualization.

Reasons for comparison against the US:

Rank Data

Screen Shot 2020-10-26 at 2 13 15 PM Screen Shot 2020-10-26 at 2 13 01 PM


Performance of India and USA through the Years

Screen Shot 2020-10-26 at 2 12 36 PM

Categorization Data for the Visualization

Screen Shot 2020-10-26 at 2 11 57 PM


I then generated graphs of India and USA's performance through the years with Datawrapper. These were exported as PNG and used as an underlay to trace over in Illustrator. Illustrator provided more control over highlights, annotations and font sizing for better readability in the final design of the graphs. One such example of a simple graph from Datawrapper is below.

Screen Shot 2020-10-26 at 2 30 03 PM


First Iteration

In the first iteration, I represented India on a continuum of dots that represented the ordinal rank data of the countries and arranged the graphs of India's and the US's trends vertically below that. The problem with this visualization was that using circular dots to represent all 167 countries hurt legibility, and the vertical arrangement of graphs made the second component of the visualization difficult to read.

Screen Shot 2020-10-26 at 2 41 15 PM

Component 2 Smaller

Final Visualization

The final visualization fixes the issues pertaining to the exclusion of other countries that would have given readers anchors for the scoring out of 10 and contextualized the ranks within the different categories of regimes. Another major change was the inclusion of annotations, especially in changes of power between the INC and NDA.

The visualization is divided into two components:

  1. Rank data of India compared to 2008 and 2014 on a spectrum indicating ranks of other countries
  2. Performance of India across various parameters with the US for comparison
Assignment 1 Final Half

Design Changes in First Component (Rank Data)

Design Changes in Second Component (Trend Comparison)

Final Thoughts

The visualization exercise was an insightful one, though I have tried to be a little more direct about the INC/NDA split, which can also be gleaned from the visualization: it shows that we have done significantly worse since 2014. While the original author may have chosen to remain apolitical by not highlighting this contrast, I have chosen not to.

Lastly, this visualization may be made even better as an interactive visualization in which individual points in the trend graphs could have popups of key events instead of annotations as they currently do. Similarly, the users could hover over the ranks and see which countries lie at the selected rank.

A higher resolution version of the visualization may be accessed here.