info-design-lab / Plaksha_Dataviz

Assignment 3: Redesigning The Hindu Data Point Stories #3

Open venkatrajam opened 2 years ago

venkatrajam commented 2 years ago

For this assignment, we'll use data stories from The Hindu Data Point.

Select a story that you like, study it carefully and redesign it. Specifically I want you to focus on understanding the data that powers the story, and how it is visually encoded to tell the intended story. Document your design process, capturing the following:

What is the story the author is trying to tell? What data is he/she using to tell the story? Describe its details -- type of data, extent of the data, dimensions of the data, gaps in the data, what data is essential and what is irrelevant. How is it encoded, what are the problems with the encoding, and how did you attempt to improve it? You may choose to expand or curtail the scope of the data used in the story, or add an additional dataset to tell the story better. But do not deviate from the main intent of the original story. In other words, it is a redesign exercise, and hence I do not want you to tell a different, unrelated story.

While you should provide a link to the original story, it might be useful to capture and display inline, appropriate parts of the original visualization, and your own design iterations to produce a coherent documentation.

For reference, see here and here for what other students did with this assignment.

sehaj1001 commented 2 years ago

Name: Sehajpreet Kaur Original Article: https://www.thehindu.com/data/data-only-1-in-4-teachers-in-india-trained-to-teach-online-classes/article61441065.ece?homepage=true

Given the transition to online learning during COVID, the article brings to light the dismal number of teachers in India who are prepared to take online classes as measured by whether they can operate a computer for teaching. The data compares the readiness of teachers across dimensions of Management of Schools (Government, Government-aided, Local body, and Private unaided), Education Level (Pre-primary, Primary, Upper Primary, Secondary, and Higher Secondary), and State. The data used is quantitative data capturing the percentage of teachers who are trained to teach with the help of a computer.

Both visualizations used in the article to put across this story are heat map charts. There is a strong focus on the exact numbers rather than on patterns or comparisons across dimensions, so important insights, such as the discrepancy between the training of government and private school teachers, get lost. The visualizations are also tedious to interpret since there are no aggregate metrics. In the one place where an average is shown, it is incorrectly labelled 'total', which is misleading.

The first visualization was as follows:

I have recreated this using a simple bar chart which first conveys the comparison across Management and Education Level and then also provides numbers for reference. I have also added an average line within each Management sector to understand which Education Level needs to be more focused on to bring it up to par.
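The average line mentioned above amounts to a per-group mean. A minimal Python sketch, assuming the data is available as (management, education level, percentage) rows; the numbers are made-up placeholders rather than values from the article, and `group_means` is a hypothetical helper name:

```python
from collections import defaultdict

# Placeholder rows: (management, education_level, pct_trained).
# These values are invented for illustration only.
rows = [
    ("Government", "Primary", 10.0),
    ("Government", "Secondary", 20.0),
    ("Private unaided", "Primary", 30.0),
    ("Private unaided", "Secondary", 50.0),
]

def group_means(rows):
    """Return the unweighted mean percentage for each management type."""
    grouped = defaultdict(list)
    for management, _level, pct in rows:
        grouped[management].append(pct)
    return {m: sum(v) / len(v) for m, v in grouped.items()}

means = group_means(rows)
for management, mean in means.items():
    print(f"{management}: average {mean:.1f}%")
```

Note that this is an unweighted mean of percentages. It matches the true overall percentage only if every education level has a similar number of teachers; otherwise a mean weighted by teacher counts would be more accurate.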

The second visualization was as follows:

I chose to recreate this using two different charts. The first is a map of India colour-coded state-wise based on the percentage of teachers trained to use a computer across all schools in India. The idea behind this chart is to give the reader an overall picture of which states or regions of the country are performing better without much focus on actual numbers. Such a chart is more likely to capture the attention of the reader than a dry table of numbers. It also helps to capture geographic patterns across the country.

The second is again a bar chart that helps compare numbers across states for Government, Government-aided, and Private Schools. The average line gives an immediate picture of Private Schools performing better than the rest. The lengths of the bars for the same state can also be compared since the charts share the Y-axis, which helps show exceptions to this trend in states such as Gujarat, where Government school teachers are better trained.

While the headline of this article was catchy, the shock did not come through in the visualizations and had to be explained in detail through the text accompanying them. The above visualizations assist in this interpretation much better as opposed to the tables.

rishav-gupta commented 2 years ago

Name: Rishav Gupta Original article: https://www.thehindu.com/data/terrorism-related-civilian-deaths-in-jk-cross-two-year-high-in-october-2021/article37233791.ece Dataset: https://www.satp.org/datasheet-terrorist-attack/fatalities/india-jammukashmir

The article talks about how the number of civilians killed in terrorism-related incidents in Jammu and Kashmir crossed a two-year high towards the end of 2021 after a relatively calm period. The first dataset consists of quantitative, time series data describing the number of incidents of killing and the associated fatalities across three groups (Civilians, Security Forces, and Terrorists). The second dataset is more detailed quantitative data giving the number of fatalities across districts over the years.

The article uses two visualizations to depict this story -- the first being a column chart showing the number of killings and fatalities and the second being a heat map table with the district wise share of fatalities.

The first visualization was as follows:

While the chart is annotated well, the only pattern coming across well is the one highlighted. It is tough to interpret the actual numbers and infer the relation between the incidents of killings and fatalities among the different groups. The different Y-axes also add to the difficulty of interpretation.

I decided to represent this data with the help of two visualizations. The first one is focused on the distribution of fatalities amongst the groups as a percentage of the total (relative comparison) along with the absolute numbers to understand the trend over the years. The second one is focused on the exact numbers for an in-depth analysis using bar charts. This brings to light interesting insights such as how the number of civilian deaths in the beginning of 2022 is already nearing the total in the last few years.
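The relative comparison described here converts each year's absolute counts into shares of that year's total. A small Python sketch of that step, assuming one dictionary of counts per year; the counts are placeholders and not figures from the SATP dataset, and `shares` is a hypothetical helper:

```python
def shares(counts):
    """Convert absolute counts per group into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        # Avoid division by zero for a year with no recorded fatalities.
        return {group: 0.0 for group in counts}
    return {group: 100.0 * n / total for group, n in counts.items()}

# Placeholder counts for a single year (illustration only).
example_year = {"Civilians": 20, "Security Forces": 30, "Terrorists": 50}
pct = shares(example_year)
print(pct)
```

Showing the absolute numbers alongside the shares matters because a group's share can rise in a year when its absolute count falls, if the other groups fall faster.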

The second visualization was as follows:

The heat map seems like an overload of numbers and it is hard to find out the districts with high numbers. It is also difficult to represent trends over time in a table while working with time series data as in our case.

I chose to represent this data with a line chart which clearly captures the trend across the years. It is immediately evident that killings in South Kashmir have shot up drastically (after 2016), which did not come across in the table. I have also included a bar chart for the last 5 years along with this to explore the numbers in detail.

Finally, I created an extra visualization which brings out the overall picture of the total number of killing incidents and fatalities using a simple bar chart.

sarthakvarora commented 2 years ago

Name: Sarthak Arora Hindu Article: https://www.thehindu.com/data/data-excess-rain-extreme-heat-hurts-mango-lemon-production/article65366780.ece?homepage=true Dataset Link: https://mangifera.res.in/indianstatus.php Report Access: https://datastudio.google.com/reporting/0aeefb16-d8bf-4531-b32e-2b5c1cc8f66e

As per the Hindu article, there was a significant drop in mango and lemon production due to extreme rains and heat. The corresponding visualisations depict production figures and changes over the years. I relied on this official Mango production database, but it does not have data for 2017 onwards, which is the timeframe the article covers.

In the article, the author not only talks about the drop in mango production, but also tries to dig into the reasons why production fell and what measures could improve these conditions. They use data on changes in pricing during the time period for different mango varieties. They also use data from a survey of farmers on what measures are needed to help in making the right cropping decisions and achieving better price realisation.

0. Dataset

The data exists in a tabular format, with some modifications to the nominal metrics - Production and Area so that they are more readable. The Area is measured in ‘000 ha, Production in ‘000 Metric Tons and Productivity in MT/ha. For the first and third table there is a categorical column on which the values are aggregated - namely the fruit names, and the state names. The second table has a column for datetime in a year format. For me, the values for % of Total Fruit Area and % of Total Fruit Production were irrelevant.

The data was copied to a Google Sheet which was then connected to Google Data Studio. To enable the date field comparison, the year had to be converted into a date. The state names also had to be converted into the abbreviated convention required for visualization, like "IN-KA" for Karnataka. I did this using VLOOKUP against a Wikipedia page. I also created a pivot table to enable better visualizations:
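The same two clean-up steps can be sketched outside a spreadsheet. The Python below is an illustration under stated assumptions, not the VLOOKUP formula actually used: the lookup table is deliberately partial, the function names are hypothetical, and mapping a label such as "2008-09" to its starting year is an assumption about how the source labels should be read.

```python
from datetime import date

# Partial lookup of ISO 3166-2:IN codes; extend for the remaining states.
STATE_CODES = {
    "Karnataka": "IN-KA",
    "Maharashtra": "IN-MH",
    "Uttar Pradesh": "IN-UP",
}

def year_to_date(year):
    """Represent a year label as 1 January of its (starting) year."""
    # Assumption: labels like "2008-09" refer to the period starting in 2008.
    return date(int(str(year)[:4]), 1, 1)

def state_code(name):
    """Look up a state's code, returning None when it is not in the table."""
    return STATE_CODES.get(name.strip())

print(year_to_date("1991"), state_code("Karnataka"))
```

Returning `None` for an unknown name, rather than raising, makes it easy to list every state that still needs a mapping before building the map chart.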

The dataset has three tables, one for production for different fruits, one for year-on-year mango production, and one for statewise mango production. I ended up creating a comprehensive dashboard to create a narrative for the production:

1. Comparing fruits

To build a narrative, we first compare the performance between different fruits to see how mangoes compare to the general growth/drop in production figures. This is done using two metrics - Production and Area, and a derived metric - Productivity (Production/Area).
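As a quick check on the derived metric: because Area is recorded in '000 ha and Production in '000 Metric Tons, the thousands cancel and the ratio comes out directly in MT/ha. A minimal Python sketch with placeholder inputs (the function name and numbers are illustrative, not taken from the dataset):

```python
def productivity(production_thousand_mt, area_thousand_ha):
    """Productivity in MT/ha; the '000 factors in both inputs cancel out."""
    if area_thousand_ha <= 0:
        raise ValueError("area must be positive")
    return production_thousand_mt / area_thousand_ha

# Placeholder values: 18,000 thousand MT grown on 2,000 thousand ha.
example = productivity(18000, 2000)
print(f"{example:.1f} MT/ha")
```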

Author's Visualization:

My Visualization:

Story:

Here I have only taken the top four fruits in terms of production to compare the year-on-year change in production for different fruits.
The yellow and orange colour scheme connects with the viewer, as it signals that the visualisations are about mangoes.
Adding all the prerequisites regarding the metrics used in the report helps create consistency, and ensures there is no loss in information.
Inverting the graphs and adding visual cues for the fruits makes the communication clearer, instead of the viewer looking for labels of each fruit.
The area and productivity charts show clearly how Mangoes are the biggest area consumers but give the lowest productivity among the top four fruits.
A donut chart to depict productivity compares the values well, and a treemap for area data gives an idea of the proportion of area that each fruit consumes. Using an arrow for YoY productivity brings focus to the primary fruit we are analyzing - Mangoes.

2. Year on Year Production of Mangoes

Once we know how Mangoes fared in comparison to other fruits, we can double click on it and compare this performance over different years to gauge how the mango production changed over the years.

My Visualization:

Story:

A bubble chart is the best way to show all three metrics on the same chart. I decided to show area and productivity on the X and Y axes because productivity is dependent on area, while production was used for the bubble size. I would have used 15 different shades of yellow if it were possible. This data supports useful inferences, since we can see how in 1991 the yield was higher with less land available, and how in 2008-09 the yield was bad despite the area available.
This was again depicted in a bar graph, where I have used two Y axes, one for area and another for productivity to show how the rise in area doesn't always correlate with productivity, and it easily pinpoints the years with bad mango production.

3. State-wise Production

Once we know what years saw the best and worst performance for mango productivity with respect to the mostly consistent production, we can double click on it further and see which Indian states were responsible for the rise and fall in mango productivity over the past years.

Author's Visualization:

My Visualization:

Story:

The visualisation used by the author of the data source is a 3D pie chart and a bar chart split over each state for both metrics, production and area, which does not make sense given the vast number of bars/pie slices, which reduces readability.
Also, since we have already dived into production, we don't need to graph it again. If we display the visualisations for area and productivity (a more important variable for understanding efficiency), we can get an idea of the production, while the converse might not always be true.
I have thus created a new metric for change in productivity over the past year (which would ideally have been the timeframe of the article) and mapped it with the value of productivity, with a threshold line at 0% change. Here we can make out that the top three productivity states did not see a change, while the fourth top state MP saw a 7.5% drop in productivity.
The size of productivity is important, since if productivity is small, a small change in productivity might seem big in %s, and would ruin the implications. In our graph Himachal saw the highest increase in productivity but since the initial value was small, it is not a significant difference.
For the state data, visual representation on a map works best. To depict productivity, I use a heatmap. And to give more insights into productivity values, I use the filled map to depict area, which is more intuitive to depict area availability.
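The point about small base values can be checked with a short calculation. A Python sketch, assuming productivity values for two consecutive years are available per state; the values are placeholders in MT/ha and `pct_change` is a hypothetical helper:

```python
def pct_change(previous, current):
    """Percentage change from previous to current; None if previous is 0."""
    if previous == 0:
        return None
    return 100.0 * (current - previous) / previous

# The same absolute gain of 1 MT/ha on two different bases (placeholders).
low_base = pct_change(2.0, 3.0)
high_base = pct_change(10.0, 11.0)
print(low_base, high_base)
```

The identical absolute gain reads as a much larger percentage on the small base, which is why plotting the percentage change against the productivity level itself keeps the comparison honest.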

akash-chowdhary commented 2 years ago

Name: Akash Chowdhary Original Article: https://www.thehindu.com/data/data-no-sexual-violence-survivor-contacted-a-lawyer-only-47-took-police-help-in-2019-21/article65419734.ece

The article talks about the source of sexual violence among married women in India based on the data collected by the National Family Health Survey. The author tries to bring out the origins of sexual violence, who are the most affected, and commonalities among survivors.

The intent of the article and the visualizations is to leave the reader with an impression that survivors do not seek help in marital rape. It also brings to light the fact no or minimal legal recourse is sought in these cases.

Dataset:

The visualizations are presented for two data series - the source of crime and the source of help. The data is in ratio form and represented in the form of tree maps. The ratio of the datasets in both the visualizations adds up to more than 100%. The data was reconstructed from the tree map.
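Whether the shares add up to 100% affects which chart types are appropriate, since treemaps and pie charts imply parts of a whole. A minimal Python check, using placeholder shares rather than the values reconstructed from the article; the likely cause of a total above 100% (respondents naming more than one category) is an assumption, not something stated in the source:

```python
def is_part_to_whole(percentages, tolerance=0.5):
    """True when the shares sum to 100% within a rounding tolerance."""
    return abs(sum(percentages.values()) - 100.0) <= tolerance

# Placeholder shares for illustration only.
multi_response = {"A": 80.0, "B": 15.0, "C": 10.0}   # sums to 105
single_response = {"A": 70.0, "B": 20.0, "C": 10.0}  # sums to 100

print(is_part_to_whole(multi_response), is_part_to_whole(single_response))
```

When the check fails, a bar chart with a common baseline is a safer encoding than an area-based part-to-whole chart.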

Visualization 1: Source of crime

This visualization depicts the source of sexual violence. Since the chart was interactive, a static image does not show the absolute values of the data. It is also noticeable that the labels are not readable for lower percentage values, which makes comparison difficult. The colors have no significance. The box depicting brothers and other relatives has too many labels to interpret - it can be replaced with "others".

I have used a cluster bar chart to bring out the comparison between different sources. It also highlights the fact that current husbands are the largest perpetrators of sexual violence. The data labels help with readability. I have used a single color pattern and replaced multilabel data with others as a category.

Visualization 2: Source of help

This visualization depicts the source of help sought in case of sexual violence. The author intends to point out that legal recourse was sought in very few cases. However, the values could be rounded off to improve readability and reduce cognitive load. It is also noticeable that the values add up to more than 100%.

I have used a clustered column chart and chose not to represent the data values. The intent is to allow the reader to interpret through comparison and use length and relative axis values.

Visualization 3: Summary of the story

I also chose to use a cluster bar chart with data labels to summarize the story - 95% of perpetrators were current or former husbands and 90% of survivors did not seek any form of help.

sundaramgupta commented 2 years ago

Name: Sundaram Gupta

The story: https://www.thehindu.com/data/data-no-sexual-violence-survivor-contacted-a-lawyer-only-47-took-police-help-in-2019-21/article65419734.ece?homepage=true

Link to the actual report of the survey NFHS-5: http://rchiips.org/nfhs/NFHS-5Reports/NFHS-5_INDIA_REPORT.pdf

Overview: The author shows that a large share of women who experience sexual violence identify their husbands as the perpetrators. Of these, only a small number seek help, and that too from family and friends rather than from the police or through legal channels.
Most of the women are afraid to say ‘No’ to their husbands when asked for sexual intercourse because they fear that their husbands will get angry and will either stop supporting them financially (given that only a small number of married women are employed compared to men), reprimand them, or force them.

The Data: The data is taken from NFHS-5 (National Family Health Survey), 2019-21. NFHS is a large-scale, multi-round survey conducted on a representative sample of households throughout India. NFHS administered three types of questionnaires: the Household Questionnaire, the Woman’s Questionnaire, and the Village Questionnaire.

Visualization Used: Here, they have used a treemap to depict the source of crime and the source of help. A treemap might not be a good choice for data like this, for the following reasons:

Labels are not visible properly without hovering
Absence of a baseline
Comparing areas is difficult

Redesigning: I have used bar charts to display, in addition to the above-mentioned information, how the percentage of women facing violence increases or decreases with the number of children they have and with whether they live in urban or rural areas.

Aviral0 commented 2 years ago

Name: Aviral Jain Article: Can teenager Carlos Alcaraz challenge the ageing champions in 2022 French Open?

The article featured in The Hindu on May 13, 2022.

What is the article all about?

The article talks about the age of Grand Slam title winners in the Open Era. The author makes the point that the age of Grand Slam winners has been increasing with time; for example, Nadal (36), Federer (40) and Djokovic (35) have won 80% of all the Grand Slam titles since 2003, and for a decade now, no player under 21 years of age has won a Grand Slam. Therefore, the author is trying to convey through visuals that it is becoming almost impossible for young players to win titles, which was not the case up until 2006.

The author uses the depiction below to give an idea of how the age of title winners has shifted over the years.

The chart depicts the age of male Grand Slam winners in the Open Era. The higher the circle, the older the winner.

Issues with this visualisation:

• Poor choice of chart: the shift in age does not come out clearly. My impression is still that players below 24 years of age have fared better than the 24+ age group over the years, and there is no way to tell unless I manually count the dots

• The colour bar for age isn’t helping much: the colours do not immediately give an idea of the age, so one has to refer to the y-axis ticks

• Also, since this article wants to bring out that “older players are doing better” the colour for older players should have been darker, currently the shade of the colour contradicts the author’s case

• Lastly, tracking the age of only the title winners in isolation is rather incomplete; one should also look at the age of all the players who participated, starting for example from the round of 32 onwards. The hypothesis to test is whether participation by young players has itself dropped, which would explain why fewer of them win titles and why older players prevail at the top

• If a macro analysis of the popularity of tennis over the years could be included along with the age of title winners, that would make it more interesting, as it would confirm or rule out the possibility that the age shift is due to teenagers no longer participating in tennis

Re-designing while not tampering with the essence

I believe that plotting the participation rate along with the no of winners is the best way to bring home the point that the age of title winners is shifting to higher age groups. I have created a dataset on my own for demonstration purposes only. The procedure is as follows:

I have taken the age distribution from round of 32 onwards for each grand slam
There are 32 players in round of 32, but I have assumed there is a pool of 60 unique players who have participated in different grand slams
I have taken the period from 1990-2013 to keep the graph neat
Since the dataset on age distribution of participants in all grand slams over the years is not available, I have randomly assigned no of players in 4 age groups (18-22, 23-26, 27-30, 30+) such that the sum is 60 (see below)
I took the age of grand slam winners from this dataset
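One way to generate such a demonstration dataset is sketched below in Python. It is an assumption about how the random assignment could be done, not the method actually used: each of the 60 players is placed in one of the four age groups uniformly at random, so the group counts always sum to 60. The function name and the `seed` parameter are illustrative.

```python
import random

AGE_GROUPS = ["18-22", "23-26", "27-30", "30+"]

def random_allocation(total=60, groups=AGE_GROUPS, seed=None):
    """Assign `total` players to groups uniformly at random."""
    rng = random.Random(seed)  # local generator so a seed makes runs repeatable
    counts = {g: 0 for g in groups}
    for _ in range(total):
        counts[rng.choice(groups)] += 1
    return counts

allocation = random_allocation(seed=42)
print(allocation, sum(allocation.values()))
```

Fixing the seed keeps the synthetic participation numbers stable between runs, so the chart does not change every time the data is regenerated. Since the allocation is random, any pattern in participation is an artefact of the simulation and should not be read as evidence.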

Finally, I wanted to use a combo graph, with a bar chart depicting the no of participants and a scatter plot showing the no of title winners, but fun fact! You cannot edit combo charts in Office 365 Excel. So I had to improvise. I plotted the no of participants using a bar chart and wrote the no of title winners over each bar.

Observations:

One can see the participation and no of title winners in a 5 year period
We can see that participation has been mostly consistent
Title winners in the younger age group have indeed fallen; for example, in the 1990s there were 12 title winners, while in 2010-2013 there are none. The most consistent age group has been 23-26
Post 2015, the 30+ age group has a remarkable number of title winners. Referring to the original graph from the article, there were 14 title winners in the 30+ age group, which had never happened before in that age group, but one can be sure it is not due to a lack of participation from younger players

It adds to the greatness of Nadal, Federer and Djokovic who have defied age and continue to dominate at the top. That's it for now !

pragun445 commented 2 years ago

Name - Pragun Aggarwal

Link to the article - https://www.thehindu.com/data/data-no-sexual-violence-survivor-contacted-a-lawyer-only-47-took-police-help-in-2019-21/article65419734.ece?homepage=true

The main crux of the article is that a large number of Indian women undergo sexual assault in one way or another. Most of the crimes go unreported because a majority of the survivors are scared that their husbands might stop supporting them financially, and so do not file cases or report to the local police. The article also brings to light the mentality of a patriarchal society in which men are seen as dominant over women.

Visualisation 1:

The visualisation is a tree chart depicting the percentage breakdown of the people who forced the women into sexual acts. The problems with the tree chart are:

No defined labels are there to support the information.
Comparing areas is difficult because of the absence of designated areas for the graphs.

Redesigning:

I have redesigned the visualisation with the help of bar and column charts. The following visualisations stand testimony to it.

lohkna007 commented 2 years ago

Name: Gaurav Lohkna Assignment #3 Original Story: How rising space debris will impact ISRO’s budget

What is the story the author is trying to tell?

Vignesh Radhakrishnan is attempting to show the reader how space debris is affecting the ISRO budget and how it will impede future space missions in India.

Also, the author tries to press the issue of how the US and USSR created the space junk mess and blamed India for its limited involvement in creating space junk from anti-satellite weapon strikes (ASAT) back in 2019.

What data are they using to tell the story?

Several graphs are used in the story including one that depicts the space race, another with debris, and another showing the cost of path corrections. In spite of the author's attempt to give a clear impression of data, it has some design flaws that must be corrected in order to tell a more convincing story with a clear visual representation of data.

space race

Debris/objects in space

Country-wise debris contribution

Number of corrections

How is the data encoded?

space race: The chart shows two lines: yellow, the payloads in a year, and blue, the launches in a year.

Debris/objects in space: This chart illustrates the increase in different types of space debris over the years, with different colors indicating different types of debris.

Country-wise contributions: In this visualization, boxes represent the total number of debris in space by country.

The number of corrections: This chart shows a trend of increasing numbers of collision avoidance maneuvers performed by the Indian Space Research Organization, ISRO, to bypass orbital debris.

What are the problems with the encoding?

Despite the fact that all the charts are correct, it can be seen from the graphs above that they are not intuitive to interpret and require the viewer to put effort into them in order to get information for a particular year.

space race: In this chart, the axes are not labeled and the lines are difficult to read until the user hovers their mouse over a particular line.

Debris/objects in space: The legends for the different lines are scattered all over the graph instead of being in one place, and the x-axis is not properly labeled, so it is hard to read the value for a given year.

Country-wise contributions: There is an inconsistency in the visualization since it has displayed numbers for only the US, USSR, and China, not for all countries, and users have to work hard to understand the total debris created by countries other than the US, USSR, and China.

Cost of correction: Although this chart is clear to interpret, it needs labeled axes.

How did I attempt to improve it?

As I mentioned, the data is correct, and what needs improving is how it has been presented in these charts. In most cases the charts are missing properly labeled axes, and the legends are positioned poorly in all of them.

Space Race: Instead of line plots, I used numbers inside bubbles to compare the number of launches with the number of payloads carried in a particular year. It can be seen that payloads and launches had a linear relationship. However, after 2018 this changed drastically as the efficiency of the space vehicles carrying payloads improved.

Collision avoidance correction: Here I have again used a simple label with the value inside a bubble so that a viewer can easily interpret what the data is saying. We can see a linear pattern in the number of correction maneuvers performed with each passing year.

Country-wise debris contribution: This pie chart shows the percentage of contribution with numbers alongside each space country/agency and it is now very clear rather than having just numbers and figuring out which country has produced the most space junk/debris.

ESA: European Space Agency, TBD: To be decided, ITSO: International Telecommunications Satellite Organization

Insights:

As the space race chart shows, there used to be a linear relationship between the number of launches and the number of payloads; however, after 2018 the number of payloads has grown to several times the number of launches. This may be due to the efficiency of SpaceX’s reusable space vehicle.
The collision avoidance path correction chart shows that in the near future India (ISRO) might need to incorporate more corrections as the junk in space will keep on increasing. Hence, ultimately, ISRO might have to allocate more budget for fuel and hence reduce the payload capacity of the vehicle.
The chart of country-specific debris contributions shows that mainly three countries have produced more junk than the rest. It can also be seen that the stronger the country, in terms of both military presence and weaponry, the heavier its space junk contribution.

datababa1 commented 2 years ago

Name - Apratim Chandra Singh

Title of Article - 2022 Uttar Pradesh polls: BJP’s performance in the seats won in 2017

Story the author is trying to tell

The article compares the parties in the Uttar Pradesh Assembly Elections of 2017 and 2022 by contrasting the relative performance of each party and its allies across two metrics, viz. seats led and vote share %. In a sense, the author is trying to show how each of the participating parties has fared across the two assembly elections and hence offer a view into the dynamics that played out over the 5 years

What data has been used to tell the story

Instead of using any graphical representation, the author simply communicates the story using data in tabular form

Examples are given below:

1) Seats led Data

2) Vote Share %

How is it encoded & problems with it

Since the data is encoded as tables, the comparison amongst the parties is not clear. Although the representation itself is simple, humans are visual beings, so adding a layer of visualisation will help to present the narrative succinctly

Also, since the representation is in a tabular format, several readers might skip it, and many of them may not follow the intended story

The other aspect that is lacking is clear headlines for the data that would allow the reader to form a coherent message.

My attempt to improve the visual representation

Since the composition of alliances changes with every assembly election, I have limited the graphical representation to the 2022 Assembly Election as the central focal point

1) Seats Led Data

I have converted the tabular data into a bar chart so that it gives a comparative view of the seats across alliances. I have also colour-coded the bars by alliance colour, giving the user a visual cue

2) Vote Share%

Since vote share is a part of a whole expressed as a percentage, I represent the metric as a pie chart, using the same colour combination as the parties to give the reader a visual cue

Observations

One can see that transforming the tabular data into a visualisation allows the reader to understand the content of the article in a much more visual manner. It also allows the user to make comparisons through the graphs and charts instead of through the numbers alone

abhishekanandiimr commented 2 years ago

Name: Abhishek Anand Article: How many Indians eat meat? Article Link: https://www.thehindu.com/data/data-how-many-indians-eat-meat/article65299234.ece?homepage=true Referred data: National Family Health Survey-5 (http://rchiips.org/nfhs/) and https://data.gov.in/search?title=National%20Family%20Health%20Survey-5

What is the focus of the article?

Based on data from the National Family Health Survey-5, this article examines Indians' non-vegetarian dietary patterns. It examines the state-by-state consumption pattern of non-vegetarians. According to the article, in over half of the 30 States/ UTs analysed, more than 90% of the population consumed fish or chicken or meat daily or weekly or occasionally. In 25 of them, the figure was more than 50%. In none of the States/UTs was the share less than 20%.

Data: We have the following data as per the article's reference (partial data is shown).

Visualization used

This article only utilises one sort of visualisation: a map with different percent ranges of colour coding. The range has been divided into five buckets, resulting in five distinct colours. When we hover our cursor over the map, we can observe the percentage of people who consume fish, chicken, or meat on a daily, weekly, or occasional basis.

Issue with Visualization

The article uses only one type of visualization for each category of non-vegetarian food, and that visualization has only one dimension.
It doesn't consider the frequency of consumption of non-vegetarian food.
It doesn't show how men's and women's consumption differs.
It doesn't compare how each category contributes to overall non-vegetarian consumption.
It has no analysis of region-wise consumption patterns.

Redesigned Visualization

To rebuild the visualisation we need to look at the data, which records how frequently people eat non-vegetarian food: daily, weekly, occasionally, or never. This is a crucial parameter to consider because it indicates how often the populace consumes non-vegetarian meals. The data also lets us ask how consumption patterns differ between men and women.

So we'll start by looking at different types of consumption patterns in the general population, as well as in men and women.

Consumption patterns of the total population

Men's Consumption Pattern

Women's Consumption Pattern

Now we'll look at the consumption patterns by state. We'll start with a bubble plot on a map, where the size of each bubble is the percent of the population that consumes non-vegetarian food. This figure improves on the map provided in the article because the relative size of the bubbles conveys that percentage at a glance.

Consumption patterns by State/UT

Consumption patterns by percent of total population, percent of men population, and percent of women population by State/UT

Consumption patterns by region, as well as different forms of consumption

State/UT by region

Conclusion:

The data is visualised in a new way that highlights many features of the data and analyses the population in different patterns before moving on to state-by-state consumption patterns and finally to region-by-region analysis.
According to the data, only 4 states have more than 90% of the population consuming non-vegetarian food, 9 states have more than 80%, and 23 states have more than 50%.
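The threshold counts in the conclusion can be reproduced with a short tally. The sketch below is a minimal illustration and not the author's actual workflow: the function name `count_above` and the three sample shares are hypothetical placeholders, and the real NFHS-5 state values would need to be loaded in their place.

```python
def count_above(shares, thresholds):
    """Count how many states have a share strictly above each threshold.

    shares: dict mapping state name -> percent of population (0-100).
    thresholds: iterable of percent cut-offs, e.g. [90, 80, 50].
    """
    return {t: sum(1 for value in shares.values() if value > t)
            for t in thresholds}


# Hypothetical placeholder values, NOT figures from the survey.
sample_shares = {"State A": 95.0, "State B": 85.0, "State C": 40.0}
print(count_above(sample_shares, [90, 80, 50]))
```

Note that the counts are cumulative: a state above 90% is also counted in the "above 80%" and "above 50%" groups, which is how the figures in the conclusion appear to be stated.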

jaisal1497 commented 2 years ago

Name- Jaisal Chaudhary

Article- COVID-19 cases surge in rural India even as vaccination rates are lower than urban areas

Article link- https://www.thehindu.com/data/data-covid-19-cases-surge-in-rural-india-even-as-vaccination-rates-are-lower-than-urban-areas/article34607195.ece?homepage=true

Story the author is trying to tell

The author compares the percentage share of rural cases with that of urban cases. According to the article, when the second wave of COVID-19 cases in India was on the wane, the share of cases in rural districts had started to increase. However, this rural surge was not uniform across all States and was more pronounced in Uttar Pradesh and Maharashtra. The rural-urban asymmetry also exists in vaccination coverage, with the rural population having relatively lower vaccination numbers in each State analyzed. The author claims that though rural infrastructure to administer vaccines is in place, the supply of doses is still skewed towards urban areas. There is no data in the visualizations or the article to support the claim of a supply shortage in rural areas.

The author compares the data using a highlight table for all 4 visualizations.

Figure 1:

The above table displays the share of rural cases (%) across 8 states from January 2021 to May 2021.

Thoughts on this visualization:

In a highlight table/heat map the person reading the chart is at liberty to interpret 10% 'redder' or 'darker' to their own satisfaction. On top of that is the problem of people's differing abilities to discern colour and shade to begin with.
The values are very close to each other in some cases and the color gradient does not change drastically (given it's only 5 months of data), so the colors are either difficult to discern or add no value. In summary, the trend is not clear from a single glance at the numbers.

Proposed redesign:

Iteration 1:

To make the trend clear at a glance, I decided to create a single line chart to display the data from the highlight table.

iteration1

Problems:

The graph looks cluttered and busy, and adding labels makes it very difficult to read.
Not all the states follow a similar trend, which makes it look even more complicated. Hence, it makes little sense to visualize them in a single graph.

Iteration 2:

I decided to create a line graph for each of the states. This made the trends clearer, but it increases the vertical scroll. The graphs could be arranged in a grid-like pattern for compactness; however, I could not do that with Tableau.

fig1

Another approach I tried was to group states with similar trends together. States like Gujarat, Karnataka and W.B followed a similar pattern, but this still left 3 groups. Individual graphs look much cleaner.

Colors are added to each State just for distinction, they do not carry any relevant information.
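The grid arrangement mentioned in Iteration 2 amounts to assigning each state's panel a row and a column. The following is a rough sketch in Python rather than Tableau; `grid_positions`, the three-column choice, and the "State 6" to "State 8" labels are assumptions for illustration, since the write-up names only five of the eight states.

```python
import math


def grid_positions(labels, ncols=3):
    """Assign each panel a (row, col) slot, filling the grid row by row."""
    nrows = math.ceil(len(labels) / ncols)
    positions = {label: divmod(index, ncols)
                 for index, label in enumerate(labels)}
    return nrows, positions


states = ["Uttar Pradesh", "Maharashtra", "Gujarat", "Karnataka",
          "West Bengal", "State 6", "State 7", "State 8"]  # last three are placeholders
nrows, positions = grid_positions(states, ncols=3)
print(nrows, positions["Karnataka"])
```

With eight panels and three columns this gives three rows, which cuts the vertical scroll compared with eight stacked charts. In a plotting library such as matplotlib, the same row and column counts would typically be passed to `plt.subplots(nrows, ncols)`.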

Figure 2 & 3:

This visualization captures the number of people who are half vaccinated per 100 people in rural and urban areas. This highlight table also contains a column of difference between these two columns. Similarly Figure 3 captures the number of people who are fully vaccinated per 100 people in rural and urban areas.

Thoughts on this visualization:

In the redesign I have decided to include only the fully vaccinated numbers. The reasons are as follows:

Fully vaccinated stats are a much stronger indicator of number of covid cases.
There is a strong correlation between number of first doses and second doses. Even the trend is exactly the same for all the states in the two tables. Hence two different tables are not adding much value.

Considerations for the redesign are similar to those mentioned above when selecting a line chart:

The highlight table does not tell a clear story when it comes to negative numbers, which are encoded via colors that do not span a wide range.

I have used an overlapping bar chart to visualize the differences between urban and rural in this case.
As the number of vaccinations in rural areas is always lower than in urban areas, it acts as a subset and is displayed by the thinner bars inside.

With this the viewer can quantify the difference based on the vertical height of the bars.

Note: In this case the area of the bars does not signify anything.

Figure 4:

This visualization captures the number of vaccination sites per 1 lakh people, in urban and rural areas. The author has used a similar highlight table to signify the difference.

Redesign:

The data is mostly positive here, and in 5 of 9 cases the difference in magnitude is less than 1.

I could have used a chart similar to the one above, but because the difference is so small in many cases the bar heights would barely differ, so I felt overlapping bars would be a poor choice.

To tackle that, I have used a horizontal bi-directional bar chart with a symmetric x-axis. Where the difference between rural and urban areas is small, the two bars are almost equal in length.

fig3
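The bi-directional layout described above can be sketched in plain text: rural values grow leftwards from a centre line, urban values grow rightwards, and both sides share one scale so the axis stays symmetric. This is only an illustrative sketch; `diverging_rows`, the bar width, and the sample numbers are hypothetical and are not the figures from the article.

```python
def diverging_rows(data, width=20):
    """Render {state: (rural, urban)} as text bars around a centre line.

    Both sides are scaled by the same maximum, so equal values produce
    bars of equal length on either side of the '|' axis.
    """
    limit = max(max(rural, urban) for rural, urban in data.values()) or 1.0
    rows = []
    for state, (rural, urban) in data.items():
        left = round(rural / limit * width)    # rural bar, drawn leftwards
        right = round(urban / limit * width)   # urban bar, drawn rightwards
        rows.append(f"{state:<12}{'#' * left:>{width}}|{'#' * right:<{width}}")
    return rows


# Hypothetical vaccination sites per 1 lakh people: (rural, urban).
sample = {"State A": (2.0, 4.0), "State B": (3.1, 3.3)}
for row in diverging_rows(sample):
    print(row)
```

Because both directions use the same scale, a state where the rural and urban figures are close shows two bars of nearly the same length, which is the effect the redesign relies on.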

JazBern commented 2 years ago

Name : Jasmine Bernard Article : India transitions from money purse to digital wallets

Story: The article discusses how the pandemic has accelerated India’s transition to a digital economy. From over 70% of point-of-sale (PoS) transactions done using cash in 2019, the share almost halved to 37% in 2021. Notably, the share of digital wallets in the mix has improved drastically from just 5% of PoS payments to 25% in the same period. The RBI’s newly introduced digital payments index also shows that digital payments surged in India, especially during the pandemic. However, the currency in circulation as a % of the GDP has crossed pre-demonetisation levels in India and is the highest among the economies compared.

Visualization 1 :

The chart shows the % share of point-of-sale payments done using different modes.

Issues with the visualization:

All the payment modes are coded in bright colours and it would be difficult for the viewer to focus attention on digital/mobile wallets.
With the legend at the top, the viewer has to look back at it every time to check which colour indicates which payment mode.

Redesigned visualization:

Darker shade is used to highlight digital/mobile wallets and lighter shades are used for the rest of the modes so that digital/mobile wallets get the attention.
Payment modes relevant to the story are stacked at the bottom and values are in bold font, making comparisons easier for the viewer.
To help the viewer distinguish the modes/stacks, the legend is moved to the right of the chart and arranged in the order in which the modes are stacked.
Buy now, Pay later is not a very common mode and hence was counted under the mode 'Others'

Visualization 2:

The chart is to show how India and China stand out for their preference for digital wallets.

Redesigned Visualization:

To show how India and China stand out for their preference for digital wallets, it is better to show the percentage use of digital wallets in the other countries. A colour-coded world map with the percentage use of digital wallets in important economies gives the viewer a basic understanding of digital wallet usage across the world and of how India and China are leading. The percentage use of other payment modes is ignored, as the focus is on how India and China are the top users of digital wallets.

Visualization 3:

Redesigned Visualization:

A simple line chart would be easier for the viewer to understand the increase in a continuous value over time.

Visualization 4:

The chart shows the cash in circulation to GDP ratio among select economies. While digital payments have increased in most economies, cash circulation has risen significantly after the pandemic. In India especially, the ratio peaked in 2020 and was the highest among the economies compared

Redesigned Visualization:

The horizontal background lines which could be confusing to the viewer have been replaced by vertical background lines.
There hasn't been much change in cash circulation between the years 2012 and 2019 and hence data labels are added only for these two years.
Data labels are added to years 2019 and 2020 to highlight the increase in cash circulation during the pandemic.
Just like in visualization 1, the legend has been moved to the right of the chart to make it easier for the viewer to read.

aravindbhaskar41 commented 2 years ago

Name: Aravind Bhaskar Article: How many Indians eat meat?

The article was featured in The Hindu on April 7, 2022.

What is the story the author is trying to tell?

In this article, the author discusses some cross-country debates and official and unofficial drives targeting vendors selling meat in India. The author highlights events like the South Delhi Municipal Corporation mayor requesting the civic body commissioner to close meat shops during Navratri. The author also highlights other unofficial drives against non-vegetarian vendors. The author then explores the data from the National Family Health Survey-5 (2019-20) and concludes that most of the population in the Indian States consumes non-vegetarian food (egg, fish, meat/chicken) on a daily, weekly or occasional basis.

How is it encoded and what are the problems with it

The author uses different gradient maps for each item (egg, fish, meat/chicken) causing the user to look through different maps to understand the consumption of these items in a single state
The exact % of the population consuming each item is not visible on the heat map
The heat map creates a divide among the states and can harm the unity of the nation
The author has to explicitly mention the number of states with < 75 and < 50 percentage consumption because the visualization does not show it on its own

How I attempted to improve it without deviating from the main intent

Meat Consumption

Combined the consumption patterns of the items (egg, fish, meat/chicken) into a single chart for easy comparison across the three items for a single state
Different colouring to highlight states with < 75 and < 50 percentage consumption
The chart makes sure not to create a divide among the states but instead highlights the fact that most Indian states are majority non-veg consumers
Used a three-color scheme, which is easily differentiable, throughout the visualization

yasirulhadi commented 2 years ago

Name: Yasir ul Hadi
Article: Only 8% of children in rural areas studied online regularly in August 2021
Article link: https://www.thehindu.com/data/only-8-of-children-in-rural-areas-and-25-of-children-in-urban-areas-studied-online-regularly-in-august/article36403488.ece?homepage=true

What the author is saying:

The article depicts the impact of COVID-19 on students and how the whole learning process has changed in the past 1.5 years. The data is presented in tabular form, which makes it difficult to identify the underlying patterns.

The school survey covered nearly 1400 underprivileged children in August 2021 across 15 states and UTs and below are the results of the article.

The major problems for children who didn't study online regularly were the lack of online material or the unavailability of a device. As many as 43% of parents in rural areas said no online material was sent by the school, while 36% said their children did not have their own smartphone. Among those children who studied online, the majority of them said that they faced connectivity issues and found online classes difficult to follow.

It also shows a large learning loss in students' reading and arithmetic ability when results from 2014 to 2020 are compared. According to the latest report by ASER in rural Karnataka, the share of Class 5 students enrolled in government schools who could read Class 2-level texts came down from 47.6% in 2018 to 32.8% in 2020. Similarly, the share of such students who could do subtraction decreased from 52.5% in the same period.
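The size of the reading drop quoted above can be stated in two ways: as a difference in percentage points and as a relative decline. The short sketch below works both out from the two figures in the text (47.6% and 32.8%); the helper name `share_change` is an assumption introduced here for illustration.

```python
def share_change(before, after):
    """Return (percentage-point drop, relative drop in %) rounded to 1 decimal."""
    points = round(before - after, 1)
    relative = round((before - after) / before * 100, 1)
    return points, relative


# Reading-ability shares for 2018 and 2020 as quoted in the text.
points, relative = share_change(47.6, 32.8)
print(points, relative)  # 14.8 31.1
```

Labelling a chart with the 14.8 percentage-point drop, or with the roughly 31% relative decline, tells the reader which comparison is intended, since the two numbers are easy to confuse.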

Redesigned Representation: Used vertical bar chart representation to show each data category with frequency distribution of all children studying in different ways in August 2021

Used line chart to understand range, cluster, and minimum/maximum of students who studied regularly, time to time and did not study at all.

Used stacked column horizontal bar chart to display the relative percentage or proportion of the learning roadblocks suffered by the students during online learning

Used clustered column horizontal bar chart to display the drop of reading ability in the students from 2014 to 2020

Used clustered column horizontal bar chart to display the drop of arithmetic ability in the students from 2014 to 2020

ranjan8manish commented 2 years ago

Name: Manish Ranjan Article: How regional parties in India are funded? The article featured in The Hindu on December 07, 2021.

Overview: The article talks about how regional political parties of India were funded during FY20. It says that the total income for the selected regional parties in FY20 amounted to ₹803.24 crore, of which 55% was from unknown sources. ‘Income from unknown sources’ refers to those donations that are made without the details of the donors. These include donations made through electoral bonds, sale of coupons, miscellaneous income and voluntary contributions. On the other hand, when parties provide details of the donors to the Election Commission of India, such donations are referred to as ‘income from known sources’. The author used the chart below to represent this data:

Since the above chart did not show the break-up of income from different sources, I used the following pie-chart to show the break-up of income in detail from different sources:

In the next chart the author plots the total income of select regional parties in FY20 against the share from unknown sources using a scatter plot.

Issue with above visualization The problem with this visualization is that it is very difficult to map each circle with X-Y axis. Also, which party a particular circle represents cannot be determined just by seeing the chart. One has to hover over each circle to find out the name of the party and how much income it received from unknown sources.

So, I used a pair of bar-graphs for each regional party to clearly show their total income and what was the percentage of their income from unknown sources:

In the next chart the author depicts the share of income from unknown sources, under various routes, received by the regional parties in FY20. The author uses the below chart:

Issue with above visualization: The problem with the visualization is that it does not clearly show the break-up of income from unknown sources under various routes. So, I used a horizontal bar chart to represent the above data:

Summary: The data represented by the author through various charts was not informative enough. So, I used different visualizations in place of the charts used by the author to represent the data in an effective and informative way. Using informative visualization, I have tried story-telling in a lucid manner.

m-saiteja commented 2 years ago

Name : Sai Teja Muliki Article Link: https://www.thehindu.com/data/data-how-many-indians-own-a-fridge-ac-or-a-washing-machine-a-state-wise-split/article65526597.ece

Story author is trying to convey:

The author has depicted a mobility map for the various types of automobiles owned by Indians, broken down by state. Bicycle ownership was quite high in the country's eastern, northern, and north eastern regions, with Punjab leading the way in the two-wheeler category. People in North India and a few north-eastern states are known to own a larger percentage of automobiles.

Authors Visualizations: bicycle bike cars all_vehicles

Problems with the story: It's difficult to remember the color codes. Since multiple gradients of the same hue are utilized, it is difficult to locate the highest number or the numbers to group together.

Improvisation: I tried plotting it as a simple bar graph, which instantly shows which states had the largest numbers. Without having to search for the numbers, it's also simple to keep track of the states that match the pattern.

improvised_bicycles imporvised_bikes improvised_cars improvised_all_vehicles

Conclusion: Though the author's visualizations can be understood by someone from India, they might be difficult for an outsider to understand. Also, the color coding is not well explained and the numbers are hard to compare, as they have to be looked up state by state on the geo map. To solve this, I plotted a simple bar graph where the states' data can be easily compared and conclusions drawn.

josephbenofficial commented 2 years ago

Joseph Ben

Original Dataset

Pros of this visualization:

Looks good
Lots of countries mentioned, hence a lot to compare

Cons:

Information overload

My solution:

There is no need to mention all the countries. I feel there is only a need to mention the top 5-7. This will be enough to give the user an understanding of how much countries are dependent on Russia for their fossil fuel needs.

There can be three graphs to give more clarity. The problem with this is that it will take up more space in the newspaper.

04kaushal commented 2 years ago

Name – Kaushal Kishore

Article Name - Only 8% of children in rural areas studied online regularly in August Article Link - https://www.thehindu.com/data/only-8-of-children-in-rural-areas-and-25-of-children-in-urban-areas-studied-online-regularly-in-august/article36403488.ece?homepage=true

Tool Used – Tableau

Purpose of the article -

The author is trying to highlight the ill impact of online learning and how it has led to a deterioration in learning outcomes. It shows how some students were able to cope with the online mode of education while learning remains inaccessible for most. The data has been collected through two surveys: School Children’s Online and Offline Learning (SCHOOL) and Annual Status of Education Report (ASER).

Correction Zero –

Here the author is trying to convey a story based on data, and the discussion covers several aspects of the survey. Before digging deeper, I wanted to show the most basic requirements for the online mode of education. I have used maps to show smartphone penetration and internet penetration in India. For smartphone penetration I have shown only the states with the highest and lowest figures, while for internet penetration I have shown all the states.

Smartphone penetration -

0_1

Internet Penetration

0_2 0_3

Fig 1: % of children who were studying in different ways in August

Actual Approach-

Author has opted for tabular representation dividing the table into rural & urban sectors and components into regularly & sometimes. The data is not intuitive & we cannot directly register the story it wants to convey

My Approach-

1_1 1_2

• As we are showing the data in percentages and for both rural & urban areas, we opted for a stacked bar chart to demonstrate the proportions properly.
• We observe from the first chart that the majority of children studied at home without help in both rural & urban areas, and this was a "sometimes" activity.
• Among regular activities, we observe that private tuition fared highest in both rural & urban areas.

Fig 2: % of children who (did what)

Actual Approach-

Author has opted for a tabular representation here and represented separately for both rural & urban areas.

My Approach-

2_1 2_2

• Since we have only 3 broad categories & the values are percentages, we chose to show them via separate pie charts for rural & urban.
• Among the rural children only 28% studied regularly, while the others either did not study or studied occasionally.
• Among the urban children 47% studied regularly.

Fig 3: Main reason why they did not study online regularly in households that had smartphone

Actual approach-

Author has represented the explanations in a tabular way

My Approach-

3_1 3_2

• As these were sentence-based options, I have tried representing them through a tree map, which assigns a larger area to a larger proportion.
• For children from rural areas the main reason was “no online material was being sent by school”, while in urban areas the main reason was “Child did not have their own smartphone”.

Fig 4: Experience among children who studied online

Actual Approach-

Again, a tabular approach was followed by author to demonstrate experience.

My Approach –

4_1 4_2

• As these were sentence-based responses and had to be shown in proportionate form, I have tried representing them through a bubble chart where bigger bubbles represent higher proportions and vice versa.
• Most of the children (65%) had faced connectivity issues, and the same issue was also the most common one among urban children (57%).

Fig 5: Representing Learning ability (Reading & Arithmetic)

Actual Approach-

To demonstrate learning abilities author has again used tabular representation & heat map has been used to demonstrate the maximum & minimum components. The tabular approach coupled with heat-map makes it a cluttered representation.

My Approach-

Reading 5_1 5_2

Arithmetic

5_3 5_4

• Since this is time-series data, we have used a line chart for representation. This also helps in comparing the trends. The data has been plotted separately for urban & rural.
• In the reading abilities chart we observe a consistent fall for private schools, while the data for government schools improved until the pre-pandemic era but witnessed a sharp fall during the pandemic.
• In the arithmetic data for grade 5 students we observe a continuous fall for private schools, while government schools remained roughly constant (decreased marginally) from 2016 to 2018 and then continued to fall. Overall, private schools performed better here.
• In the arithmetic data for grade 8 students we observe a constant trend for government schools, while private schools had a continuously increasing trend with a rapid increase around 2020.

utkg26 commented 2 years ago

Name: Utkarsh Garg
Article Title: From paracetamol to stroke medicine: Check how much medicine prices increased this year
Article Link: https://www.thehindu.com/data/data-from-paracetamol-to-stroke-medicine-check-how-much-medicine-prices-increased-this-year/article65311448.ece

The article talked about how the NPPA [National Pharmaceutical Pricing Authority] increased the ceiling prices of essential medicines by about 11% in March 2022.
The price rise has had an uneven impact. While the rise was relatively low for cheaper medicines, the more expensive medicines have recorded a phenomenal rise.
The article focused on the increase in WPI and retail inflation, on the basis of which the NPPA sets the prices of medicines. There is a sharp hike this year due to the hike in WPI.

Visuals Used in the article:

1. WPI

2. Retail and Medicine Inflation

3. A searchable medicine list

Observations on existing infographics:

The article talks about the correlation between WPI and medicine inflation, but that relation is not immediately apparent to readers
Retail inflation is mentioned to show that, despite retail inflation being under control, medicine inflation has increased more, with no strong correlation between the two.
A searchable table is given, but since the intent is to show the impact of the price rise and the uneven spread of the hike, a searchable database is not needed as the infographic

Revisualised:

Inflation Index

Medicine Inflation Index

The idea is that all the inflation indexes are plotted on the same graph, so that the reader can see how the trend is moving and what it means for the NPPA to hike prices due to an increase in WPI and medicine inflation.

Impact of uneven price rise (instead of the searchable database)

The idea is to show that some medicine prices have been increased by 6k rupees while some have been increased by Rs 1/2, thus creating an imbalance.
This graphic also conveys the impact of the price rise by employing different colors and sizes for the increase.

blessondavis commented 2 years ago

Only 1 in 4 teachers in India trained to teach online classes

Link to the Article: https://www.thehindu.com/data/data-only-1-in-4-teachers-in-india-trained-to-teach-online-classes/article61441065.ece

The story:

The COVID-19 pandemic forced schools to stop physical classes and shift to online teaching. This move has highlighted two concerns: how well teachers are trained to take classes online; and the pupil-teacher ratio, which determines the quality of education. According to UDISE 2019-20, only one in four teachers in India was trained to use a computer for teaching. The share of such teachers was even lower in government schools. Also, there were wide disparities among States, with Gujarat training 57% of its teachers while M.P. trained only 9% of them. While most States had an acceptable pupil-teacher ratio, the ratio was above the recommended value in some States, especially in higher education. While a high pupil-teacher ratio is a concern even during physical classes, its importance has only increased with online teaching.

About the visualisation:

The table lists the % of teachers trained to teach with a computer in India. Only 15% of teachers were trained to teach using a computer in schools managed by the government. Though the share was relatively high among private schools, only one-third of teachers who teach higher grades were trained to teach online. The types of management listed in the table cover 95% of the teachers in the country.

Few limitations of tabular data for this:

The goal of the table is to compare the type of schools and the percentage of teachers not trained to teach on online platforms.
The secondary goal was to compare different stages like primary, secondary, and higher secondary. When there are many columns with different numbers, it isn't easy to visualise the important takeaways.

Here is my visualisation:

Showing the percentage in Progress Rings helps you visualise what is the proportion without looking at the per cent.
The small graphic to show "1 in 4" helps everyone get it in one glance making it effortless.
I have presented only the total percentages to make it easier for readers to remember the numbers.

devashreepatel commented 2 years ago

Name- Devashree Patel
Article- https://www.thehindu.com/data/data-over-25-rural-households-defecate-in-the-open-in-contrast-to-swachh-bharat-data/article65422754.ece
Data- The dataset from the National Family Health Survey - 5 (2019-21) has been used to visualize the story that the author is depicting.

Story: The author wants to convey that even though all villages in India's 36 states and union territories were proclaimed open defecation-free (ODF) on October 2, 2019, recently released data from the National Family Health Survey (NFHS-5) shows that none of the 30 states surveyed was actually free of open defecation.

The author has also shown how access to toilets varied widely based on caste and wealth.

1. Miles to go:

Through the above graph the author shows the share of households that are not ODF or have no toilet facility available. As per the graph, this share has decreased over the years, but there is still a long way to go.

2. State-wise share:

In the above graph the author presents the details of share of households that were declared ODF by the Swachh Bharat Mission (Grameen) and share that were actually ODF. In this graph, the author has made a state wise comparison.

3. Variations on the basis of caste:

In the above graph, the author depicts access to toilets among households across select castes. Access was much lower among the SC and the ST households.

My Visualization:

I believe that the graph depicting state-by-state shares of real and declared ODF percentages is not easily interpretable at a glance. Understanding the graph may take some time.
The dot plot graph does not give a clear understanding of the data.

In my view, the following is the redesigned visualization of the state-wise share graph:

The split bars graph clearly shows the percentage of the actual and declared ODF for all the states.

taniadaw commented 2 years ago

Name: Tania Dawra
Article Title: Only 8% of children in rural areas studied online regularly in August
Article link: https://www.thehindu.com/data/only-8-of-children-in-rural-areas-and-25-of-children-in-urban-areas-studied-online-regularly-in-august/article36403488.ece?homepage=true
Additional data link: https://counterviewfiles.files.wordpress.com/2021/09/locked-out-emergency-report-on-school-education-6-sept-2021.pdf

The article notes that physical classes have been suspended in Indian schools for nearly 1.5 years. While some students were able to study online, most were unable to do so. The data was gathered by two surveys: School Children's Online and Offline Learning (SCHOOL) and Annual Status of Education Report (ASER). In August 2021, only 8% of children in rural areas and 25% of children in urban areas regularly studied online.

Addition: The following visualization helps us understand the coverage of the survey :

Many states were not surveyed, as can be seen. However, the survey encompasses regions from the north, south, east, and west, and hence can be considered indicative of subcontinent patterns.

We consider each of the metrics presented in the article:

Different Modes of Study: Original Visualization: My Visualization: The idea is to clearly showcase the two categories, urban and rural, along with the percentage of children using each mode of study either sometimes or regularly. A bar chart seemed appropriate because it can hold many categories effectively.
Study Patterns across India: Original Visualization: My Visualization: Here too, a bar chart was the right choice for comparing the percentage of children by study pattern. We can see that while 47% of children in urban areas studied regularly, only 28% of children in rural areas did.

We further explore the reasons behind children not studying.

Reasons for children not studying: Original Visualization: My Visualization:

This visualization tries to capture how much each factor mattered in rural versus urban areas. As these factors remain the grey areas/roadblocks in learning, the color was chosen accordingly. I also observed that the values did not add up to 100 percent, which suggests possible errors in the reported data; error bars were added to reflect this.
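A quick way to check whether a group's shares add up to 100 is sketched below. The reason names and percentages are illustrative placeholders, not figures from the survey. Note also that questions allowing multiple answers can legitimately total more than 100, so a gap is a prompt to check the questionnaire rather than proof of an error.

```python
def share_gap(shares, expected_total=100.0):
    """Return how far the shares are from the expected total (positive = over)."""
    return round(sum(shares.values()) - expected_total, 2)


# Placeholder values for illustration only (not taken from the survey).
rural_reasons = {"reason_a": 40.0, "reason_b": 35.0, "reason_c": 30.0}

gap = share_gap(rural_reasons)
print(f"Shares differ from 100 by {gap:+.1f} points")
```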

Experience among children who studied online: Original Visualization:

My Visualization: We can see that network problems are a major issue not only in rural areas but also in urban areas. Children in urban areas watch live classes more than children in rural areas do. This may be because communication infrastructure and mobile usage are better in urban areas than in rural areas.

The article elaborates that even those children that were online had difficulty following the curriculum and experienced network challenges. As a result, the percentage of children who could read and calculate fell below pre-pandemic levels.

Reading Ability: Original Visualization: My Visualization: The scatter graph captures the trend across time well and makes it clear that reading ability has declined among Class 5 and Class 8 students.
Arithmetic Ability: Original Visualization:

My Visualization: Here too, the scatter graph captures the trend across time well and makes it clear that arithmetic ability has declined among Class 5 and Class 8 students.

Addition: A summary visualization helps to further enhance the key findings:

Thus, we can see that, overall, online education remains a fictional educational system. With issues leading to a bad experience, constant challenges, and a decline in literacy levels, it is no wonder that parents prefer the traditional physical classroom approach and want schools to reopen soon.

SmrutiShirodkar commented 2 years ago

Name: Smruti Shirodkar Article: Over 50% children in 30 States and UTs were anaemic in 2019-20 Source: The Hindu on Dec 11, 2021

The Story: The article compares the National Family Health Survey-4, 2015-16 (NFHS-4) with the National Family Health Survey-5, 2019-21 (NFHS-5) and shows that the number of anaemic children aged 6-59 months increased by 8.5% in 2019-20. Moreover, the proportion increased in 29 states/UTs compared to NFHS-4.

Description of the data and the visualisations: The article is based on data from two National Family Health Surveys and uses a table, a bubble chart, and maps to show the data across different segments: overall prevalence, the rural-urban comparison, state-wise prevalence, and the change from NFHS-4 to NFHS-5.

Issues with the visualisations

Though the data points covered in the article support valid arguments, the article fails to leave a strong impression. Different graphs and charts could have been used for better visual retention. More detailed observations on each visual are provided below:

Visual 1: Issues:
- A chart that displays only the numbers, without any comparison or visual sense of scale, does not give a solid portrayal of the situation.
- The visual should also display the base population count (children aged 6-59 months) for better analysis.
Visual 2: Issues:
- A bubble chart can be used to show correlation; however, for a comparison between urban and rural numbers, the purpose of the chart is somewhat lost.
Visual 3: Issues:
- According to the map, Ladakh (though a UT) shows the highest number of cases while Kerala has the lowest. The map shows the data quite well across all the states; however, it would have been better to choose the data range so that Gujarat stands out clearly for its high number of cases. Correlation data would have been more effective in showing the effect in the north and west compared with other regions.
Visual 4: Issues:
- It would have been better for the graph to display the numbers for all the states to support a more solid analysis, or at least for all the states that show a strong signal in the comparison. A map with numbers shown on hover would have worked better to present all the numbers. Better still would be clickable links on the map that expand on the data for each state, such as number of children, rural vs urban, and change since NFHS-4.

Here are my visualisations:

The first visual can be changed to the one below to show an effective comparison between urban and rural regions across NFHS-4 and NFHS-5.
The graphs below are additions to the visuals that:

- display the distribution of the cases amongst children in age-wise segment:

Prevalence of anaemia-2

- display the distribution of states with least prevalent cases and most prevalent cases: State-wise prevalence of anaemia

Observations: Certain data points were not available, so the corresponding changes could not be made. Adding those data points might bring more clarity to the purpose of the article.

Ramsai9ch commented 2 years ago

Name: RAMSAI REDDY CHAMAKURA

LINK: https://www.thehindu.com/data/data-advantage-bjp-in-bipolar-contests/article65230650.ece?homepage=true

Title: Bipolar contests benefit the BJP in Uttar Pradesh polls

What is the story the author is trying to tell?

The author is trying to explain how UP assembly contests turn into Bipolar Contests and how much the BJP benefitted from it.

The table lists the share of seats involved in uncompetitive (<1 effective party), bipolar (2), triangular (3), fragmented (4), and multiparty (>=5) contests since 2002. In 2002, a majority of the seats (37%) witnessed a three-cornered fight. Until 2017, the political landscape of the U.P., by and large, remained a triangular or multiparty contest. But in 2022, over 70% of the seats saw a bipolar contest, with the share of seats with four parties in the fray reducing to a little over 1%

The author used a Table to show how the constituencies turned into bipolar constituencies.

Drawbacks:

It is difficult to interpret changes over the years from a table; there are better options for visualizing the change.
The author used color coding, but it is not clear what thresholds determine the colors.
The author left some boxes empty, which gives the impression that some values are missing, even though the remaining values add up to 100.

We can get rid of these drawbacks by using different kinds of plots:

From the above graphs, we can observe how the trend of contests changed during the years concerning the number of parties clashing prominently for a seat and in an election year how the contests of each type are happening.

The author also tries to show how the BJP benefitted from the change in the contests. Here too, the author used a table.

The table lists the BJP’s strike rate (wins/seats contested) in uncompetitive, bipolar, triangular, fragmented, and multiparty contests.
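The strike rate defined above (wins divided by seats contested) can be sketched as a small helper. The function name is hypothetical. Returning `None` when no seats were contested is an assumption about how an empty table cell might arise, not something the article confirms.

```python
def strike_rate(wins, contested):
    """Strike rate as a percentage: wins / seats contested.

    Returns None when no seats were contested, so the caller can label the
    cell explicitly (for example "no contests") instead of leaving it blank.
    """
    if contested == 0:
        return None
    return 100.0 * wins / contested


# Illustrative counts only, not taken from the article's table.
print(strike_rate(42, 50))
print(strike_rate(0, 0))
```

Making the "no contests" case explicit addresses the drawback of empty boxes, since a reader can then tell a missing value apart from a zero.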

Drawbacks:

It is difficult to interpret changes over the years from a table; there are better options for visualizing this kind of data.
The author used color coding, but it is not clear what thresholds determine the colors.
The author left some boxes empty, which gives the impression that some values are missing, even though the remaining values add up to 100.

We can overcome these drawbacks by using bar charts,

With the above graphs, we can interpret how the BJP's fortune changed with the change in the number of prominent parties in the contest.

I have tried to show how the BJP's strike rate changed across the years for each kind of contest (uncompetitive, bipolar, triangular, fragmented, and multiparty), and how its strike rate compared across contest types within each election year.

With these kinds of visualization, users can easily interpret the changes in the political scenarios in terms of the type of contest and how the BJP raised its rank in the last 5 elections.

Observations: In the years in which the BJP won the election, i.e. 2017 and 2022, its strike rate is far higher than in previous years. These figures show how the BJP has taken advantage of the increase in bipolar contests. In 2017, 36% of contests were bipolar, and the BJP won 84% of those seats; in 2022, the share of bipolar contests rose to 71%, and the BJP won 71% of those seats. An increase in bipolar contests also means that many parties are fading and unable to make an impact in elections.

Riikon commented 2 years ago

Name: Arun Sar Article Link: https://www.thehindu.com/data/data-behind-the-numbers-perspective-on-the-400-billion-export-mark/article65381409.ece?homepage=true

Story the author(s) is conveying: That Indian exports have reached a record high of $400 billion. This reflects soaring commodity prices in international markets due to the pandemic and the Russia-Ukraine war, rather than India's increased capacity to participate in the global supply chain. This is supported by the fact that imports have also reached $600 billion, resulting in a widened trade deficit in FY22, the highest in two decades, and Forex reserves declining to a nine-month low.

Author(s) visualization:

Data Source: Source stated as CMIE.

                                Graph 1 - Graph Comparing Exports vs Imports in Absolute terms

                                        Graph 2 - Graph showing Widening Trade Deficit

                                       Graph 3 - Graph showing Export as a share of GDP

Author Graph 3 - Export as a share of GDP

                                         Graph 4 - Graph showing reducing Forex Reserves

Issue with this visualization: The graph only considers more recent data and not the data from 2001 onwards like the other graphs have used.

Problem with the Story: The problem with the story is in the analysis itself. The analysis doesn't go deep enough to see whether there are patterns that can be interpreted in a different way. Also, the article ignores some obviously related data. For example, an article using a dip in Forex reserves (which are maintained in USD terms) to support its claims should also consider the INR vs USD conversion rate to see if there is a trend that either validates or invalidates the story the author is trying to convey. Apart from that, the story doesn't answer some obvious questions one might have. For example, in Graph 2 we see that the trade deficit was equally high in 2013-14; is there a reason we see this repeat in 2022, or are the two episodes unrelated?

Improvisation:

                                     Graph 5 - Comparing Imports and Exports in Absolute Terms

My Graph 1 - Absolute Terms

This graph clearly shows that India's imports and exports have diverged thrice in the past, so we need to investigate if there is a singular reason behind it or if these events happened for completely different underlying reasons. To further understand this, data of various components of India's imports was collected and visualized using a pie chart.

                                  Graph 6 - Pie Chart showing Components of Indian Imports

The pie chart shows that the largest share of Indian imports is made up of oil and minerals. Given that these are commodities which India consumes and does not export back after adding value to them (unlike, for example, automobile parts), an increase in the price of these commodities will lead to an increased import bill. This happens primarily due to external conditions which India doesn't have much control over. The following graph plots Brent Crude prices historically.

                                            Graph 7 - Historical Brent Crude Data

Brent Crude Historical Marked

The above visualization shows that Crude Oil Prices peaked during the 2008 US Housing Market Crash and corrected significantly during Covid Pandemic owing to the world wide lockdowns. It's important to note however that low Crude Oil Prices during pandemic didn't help India's Forex reserves as India, under lockdown, wasn't importing much Crude Oil.

The above graph also shows that Crude Oil prices remained elevated from 2012 to 2015 and then peaked again in 2022, due to the Russia-Ukraine war leading to sanctions on Russia. It's interesting to note that the divergence shown by the author in Graph 1 and the increase in India's trade deficit in Graph 2 show a strong correlation with increased Crude Oil prices. Also, India has been importing more and more Crude Oil by volume YoY, which is correlated with the size of India's GDP. The following graph visualizes this increase.

                                    Graph 8 - Graph showing volume wise Crude Imports by India

Crude Imports in Volumes

Now, analyzing a decade of Forex reserves shows that Indian Forex reserves have been growing in both Rupee and Dollar terms. Hence, it's important to note that the current dip in reserves due to trade deficits has come after reaching a record high.

                                Graph 9 - India's Forex Reserves in INR & USD

Forex Reserves

However, it's important to note that INR has been secularly falling against USD since 2011. This fall would support an increase in Forex reserves by effectively reducing the gap between imports and exports and hence keep the trade deficit within certain limits. Further investigation needs to be done to understand why the rupee has been falling since 2011 and how much it has contributed to increased exports and decreased imports in dollar terms.

                                 Graph 10 - Secular Decline of INR since 2011

Secular Decline of INR

Analyzing Exports Component Wise:

A component-wise analysis of the top 10 categories exported by India also paints a different picture from what the author is trying to say. Four categories (Pharmaceuticals; Electrical and Electronic Equipment; Iron and Steel; and Machinery, Nuclear Reactors and Boilers) show a clear trend of increased exports. These are in line with other data points: Indian pharma (value-added generic formulations from imported API) contributes more than 20% of the world's generic pharma imports; as of 2019, India has become a net exporter of mobile phones; and given the supply chain disruptions that Covid-19 caused, many countries and corporates are more inclined to pursue a 'China Plus One' strategy, of which India (especially the chemical industry) is a beneficiary.

Export of Pharma products Graph

Export of Electrical and Electronic Equipment Graph

Export of Iron and Steel Graph

Export of Machinery, nuclear reactors and boiler Graph

Conclusion: It can be concluded that the article didn't involve a deep enough analysis. A deeper analysis could have painted a very different picture.

Links to Data Sources Used:
Forex Reserves: https://www.rbi.org.in/scripts/WSSViewDetail.aspx?PARAM1=2&TYPE=Section
Crude Imports (in Volume): https://www.ppac.gov.in/content/212_1_ImportExport.aspx (PPAC = Petroleum Planning and Analysis Cell)
Category of Indian exports and imports: https://tradingeconomics.com/india/
Historical Brent Crude Price: www.investing.com

sahilshaheen commented 2 years ago

Original article: Pandemic impact: Marked decline in Maths, Science scores among rural, SC/ST students

Introduction

In this article, the authors analyse the 2018 and 2021 NAS (National Achievement Survey) data to explore the impact of the pandemic on education levels across the country and aim to make the point that it had a more pronounced effect on rural as well as SC/ST populations.

The first part of the article plots the difference of marks across different grades and subjects grouped by different social indicators. The author uses facet grids to plot the data for each group on a different column. The following social indicators are used by the author:

Gender
Rural vs. urban
Social category (caste)

In the second part of the article, the authors use a set of choropleth maps to explore the state-wise breakdown of the data and try to see the impact of social indicators across states.

Improvements

1. Grouped bar charts instead of facet grids

I felt that this choice is important for two reasons:

The primary comparison is between the social categories rather than the grades/subjects. We want to see how the Grade 3 Science marks difference is more pronounced for SC/ST when compared to General more than how Science and Math education suffered more (though that is important as well).
This helps us see where the trends are similar to the conditions of different groups (General > OBC > SC > ST).

Original facet grid Hindu facet grid

Grouped bar chart

2. Slope graphs instead of choropleth maps

The choice of choropleth maps can only be justified by stating the ease of looking up the data for a particular state. Apart from that, the use of choropleth maps isn't justified as the location of the states and the difference in marks cannot be correlated. The difference in marks rather depends on the government and policies.

I have decided to focus on the time aspect instead of the spatial aspect for the same reasons. In this regard, slope graphs are really useful, as the slopes help us gauge which states suffered most and which, surprisingly, recorded an increase in marks.

There are still issues with a slope chart. Because the variance of the marks is not too high, there'll be a lot of overlap of slopes and therefore zero readability if we choose to include all the states in the slope chart. We could use an interactive graph to try and solve the issue by using hover mechanisms or we can choose to include only a subset of the data like I have done here.
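One possible way to pick the subset is to keep only the states with the largest absolute change between the two survey years. The sketch below assumes the marks are available as two dictionaries keyed by state; the function name and the sample values are hypothetical placeholders, not NAS figures.

```python
def largest_changes(before, after, k=5):
    """Return the k (state, change) pairs with the largest absolute change."""
    changes = {state: after[state] - before[state]
               for state in before if state in after}
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)[:k]


# Placeholder marks for illustration only.
marks_2018 = {"State A": 60, "State B": 55, "State C": 70}
marks_2021 = {"State A": 58, "State B": 45, "State C": 73}

for state, change in largest_changes(marks_2018, marks_2021, k=2):
    print(state, change)
```

Ranking by absolute change keeps both the steepest declines and any unexpected increases, which are the slopes a reader is most likely to look for.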

Original choropleth map Hindu choropleth map

Slope graph

Note: Since I could not obtain the dataset in raw form, the slope graph is only for Grade 3 Language and not average across subjects

Lakshna1295 commented 2 years ago

Name : Lakshimi Naraayani Link: https://www.thehindu.com/data/data-how-many-indians-own-a-fridge-ac-or-a-washing-machine-a-state-wise-split/article65526597.ece

The maps show the share of households in each State that owns a household good. The overall share of Indians who own the commodity is also mentioned.

The story the author wants to convey: The author's visualizations depict the state-by-state percent distribution of equipment like washing machines, televisions, and air conditioners, as well as the total number of people who own all three. Depending on who owns the property, different inferences are formed. Only 16% of houses have a television, refrigerator, or washing machine, indicating that these items are still considered luxury items.

Author's visualizations: MicrosoftTeams-image (8)

MicrosoftTeams-image (9)

MicrosoftTeams-image (11)

MicrosoftTeams-image (12)

MicrosoftTeams-image (10)

Problems with the story:

Since multiple gradients of different hues are used, it is difficult to relate the colors to one another.
There is no mention of the thresholds used for grouping states under the same color, which leaves the grouping without a stated basis.
There is only a high-level mention of what the color coding is for.

My visualizations:

Improvisation: I tried making a basic bar graph out of it, which shows the states that have the highest % of people owning the specific appliances. It is also straightforward to keep track of the states that match the pattern without looking for the numbers.

tanyaahuja147 commented 2 years ago

Link to the dataset: https://rsf.org/en/index
Link to Hindu Data Point: https://www.thehindu.com/data/data-the-worrying-state-of-press-freedom-in-india/article65384769.ece?homepage=true
The article talks about the fact that India, the largest democracy in the world, is slipping in the press freedom ranking year after year. The following chart shows India in the same group as countries like China, Russia, and North Korea. These countries are not even democracies, and yet India is being compared to them.

In the above graph, ranks (x-axis) and freedom scores are continuous data; here, ranks are presented as intervals. The country names are nominal data. The graph shows prominent countries in each cluster and does a good job of highlighting the relationship between the score range and the rank range. Another feature to compare is the average difference in scores within each cluster: the countries ranked 60-89 are very close to each other in score and are comparable, but in the 150-180 rank group India and North Korea may be in the same cluster yet have a huge gap in score, and are thus not comparable. This protects a country from being wrongly perceived as being as bad as, or as good as, another country just because their ranks are close. One point of improvement is that the black lines pointing out individual countries' scores are not clear: which horizontal line is each one pointing to? That exact horizontal line could have been drawn in a darker hue. The next graph is the one I would like to redesign:

The freedom index of a country is measured on 5 parameters: Political Context, Legal Framework, Economic Context, Sociocultural Context, and Safety. An average of all these scores is taken to decide the final score. The above graph uses a bar graph to highlight the ranks. The grey markings on the graph are unnecessary, give no information at all, and create a lot of visual clutter. The point the author is trying to convey is how alarming it is that India's rank is going down. To convey that more powerfully, the scores could have been shown in descending order.
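As a minimal sketch of the two ideas above (averaging the five parameter scores, then listing entries in descending order of score), assuming a plain unweighted mean as described in this comment. The key names and the sample scores are hypothetical and are not taken from the RSF dataset.

```python
INDICATORS = ("political", "legal", "economic", "sociocultural", "safety")


def overall_score(scores):
    """Unweighted mean of the five parameter scores."""
    return sum(scores[name] for name in INDICATORS) / len(INDICATORS)


def order_by_score(entries):
    """Return (label, overall score) pairs, highest score first."""
    ranked = [(label, overall_score(s)) for label, s in entries.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Placeholder scores for illustration only.
sample = {
    "Entry 1": dict(zip(INDICATORS, (50.0, 40.0, 30.0, 20.0, 10.0))),
    "Entry 2": dict(zip(INDICATORS, (60.0, 60.0, 60.0, 60.0, 60.0))),
}
for label, score in order_by_score(sample):
    print(label, score)
```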

arunimamor commented 2 years ago

Name: ARUNIMA MOR

LINK: https://www.thehindu.com/data/data-indian-shuttlers-overcame-poor-head-to-head-records-and-higher-ranked-players-to-win-thomas-cup/article65426374.ece?homepage=true

Title: Indian shuttlers overcame poor head-to-head records and higher ranked players to win Thomas cup

What is the story the author is trying to tell? The author is trying to emphasize how India won the Thomas Cup for the very first time by beating the 14-time champions Indonesia, even with poor head-to-head records.

Visual 1 : The author started by showing the difference in head to head records and difference in ranks.

Observation :

In my view, this graph is an apt representation, as it stresses how stark the difference was between the ranks of the Indian players and their opponents, and how poor the past records were, by displaying the difference in head-to-head wins.
The color coding highlights the difference properly.

Visual 2 : Then he starts explaining by pinpointing the fact that last time India was able to qualify for semi-finals was 14 years back.

Observation :

It takes a lot of effort for the user to go through the color coding and understand the labels.
It's not quite evident from the visual that this is the first time India has won the Thomas Cup. It would take some time for the user to arrive at that conclusion.

My Suggestion : India's Performance over last 55 years -

Indonesia's Performance over last 55 years -

Visual 3 & 4: The author then goes to talk about ranking of the players country-wise through tabular data

Observation :

User will have to spend a lot of time analyzing this data.

My Suggestion :

pawanreddy-u commented 2 years ago

Name: Pawan Reddy Ulindala Article title: How rising space debris will impact ISRO’s budget Article link: https://www.thehindu.com/data/data-how-rising-space-debris-will-impact-isros-budget/article65289373.ece?homepage=true

This article talks about ISRO's increasing budget spend on avoiding potential collisions with space debris. Graph-1 The graph below shows the exponential rise in the number of rocket launches and the number of payloads carried by these rockets.

The above chart can be improved by having two separate bar graphs, one with the number of launches in a year and the other with the number of payloads in a year.

Graph-2 The below graph talks about the Country-wise contributions of space debris.

I feel that the above graph gives a fair idea that majority of the space debris is contributed by two countries namely USA and Russia. The data of budgetary spends (direct and indirect spends) can be included in a graph to let the reader know about the current spends by countries on space debris which were populated mainly by Russia and USA. Given that this is sensitive information, I couldn’t find a source to include this data as part of a graph.

Graph-3 The below chart shows the increasing collision avoidance manoeuvres carried out by India to bypass orbital debris.

While this graph captures the said information, it fails to capture the rise in the count of objects in space debris. A graph of the ratio of 'collision avoidance manoeuvres' to 'active space debris' for each year will help us find the trend and even forecast the future count of 'collision avoidance manoeuvres'.
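The suggested ratio could be computed per year roughly as follows. Scaling to "per 1,000 tracked objects" is an assumption made here for readability, and the function name and the counts are hypothetical placeholders rather than ISRO figures.

```python
def manoeuvres_per_thousand_objects(manoeuvres, tracked_objects):
    """Yearly ratio of avoidance manoeuvres per 1,000 tracked debris objects.

    Years with a missing or zero object count are skipped.
    """
    return {
        year: 1000.0 * manoeuvres[year] / tracked_objects[year]
        for year in sorted(manoeuvres)
        if tracked_objects.get(year)
    }


# Placeholder counts for illustration only.
manoeuvre_counts = {2020: 10, 2021: 20}
object_counts = {2020: 20000, 2021: 25000}

print(manoeuvres_per_thousand_objects(manoeuvre_counts, object_counts))
```

A rising ratio would suggest that manoeuvres are growing faster than the debris population itself, which is the trend the comparison is meant to expose.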

uneetkumarsingh commented 2 years ago

Story: India, China help lift Russia’s post-war crude oil exports https://www.thehindu.com/data/data-india-china-help-lift-russias-post-war-crude-oil-exports/article65533006.ece?homepage=true

Story the author is trying to tell: In the aftermath of the Russia-Ukraine war, there has been discussion about the financing of the Russian war effort. European geopolitics has centred around curtailing Europe's energy dependence on Russia. However, despite sanctions and calls to boycott Russian oil, Russian oil exports have increased YoY after the war. The author attempts to put forth the role of India and China in enabling this growth in Russian oil exports.

Data Description: This is secondary research. The author has not dealt directly with primary data and has instead reproduced graphs from the quoted sources. The quoted sources have used time-series data to show the YoY export comparison. They have restricted themselves to the last two years of data. The data is univariate, as it refers only to crude oil exports.

Gap in the Data: Given that story is trying to understand India and China's import behaviour, it would have been beneficial to bring in India and China specific timeseries data.

How is it encoded? Russia's overall crude export trajectory is compared through time-series plots.

Below graph then shows how Russian oil import to Non-G7 and Non-EU Countries changed over time.

Below is a table to show India's import bucket:

Problems with it: The plot of non-EU and non-G7 countries in no way reflects or captures the role of India and China in particular. India's import bucket has been shown as a table, which could be visualised instead.

How to improve it? The story needs India- and China-specific data. It can be shown how the crude oil bucket composition for India and China changes in terms of source country over time. To highlight the increasing support India and China are providing to Russia in terms of financing the war, it would have made sense to track how Russia's own crude oil export bucket composition has changed over time in terms of destination country. In both these cases, a percent stacked area chart would have conveyed the story in a better way!
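A percent stacked area chart needs each period's values converted to shares of that period's total. A minimal sketch of that step is given below; the function name, the source names, and the import values are hypothetical placeholders.

```python
def to_percent_shares(values_by_source):
    """Convert absolute values for one period into percentage shares."""
    total = sum(values_by_source.values())
    if total == 0:
        return {source: 0.0 for source in values_by_source}
    return {source: 100.0 * value / total
            for source, value in values_by_source.items()}


# Placeholder import values per period, for illustration only.
imports_by_period = {
    "Period 1": {"Source X": 30, "Source Y": 70},
    "Period 2": {"Source X": 50, "Source Y": 50},
}
shares_by_period = {period: to_percent_shares(values)
                    for period, values in imports_by_period.items()}
print(shares_by_period)
```

Each period's shares sum to 100, so the stacked areas fill the chart height and only the composition changes over time.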

While Graph-1 proves that Russian Oil Export is increasing and Graph-2 is showing that exports to non-EU and non-G7 countries have increased even more, it is nowhere proved that this upsurge is led by India and China.

Improvised Chart: To make it more granular, I collected data from the Ministry of Commerce website and showed how the composition of India's crude oil bill has changed over time. Russia's share in India's crude oil bill has remained marginal till date. Even though it has increased, it still constitutes a marginal share of India's crude oil bill (2.1%). Composition of Crude Oil Import Bill of India

Harleen8-Bagga commented 2 years ago

Name: Harleen Kaur Bagga Article Title: Domestic violence complaints received in the past five months reach a 21-year high Article link:https://www.thehindu.com/data/data-domestic-violence-complaints-received-in-past-five-months-reach-a-21-year-high/article34877182.ece

The number of domestic abuse complaints received in the last five months is highlighted in the article. The report compares the frequency of complaints filed in various states across three different dimensions: "Never sought help and never told anyone"; "Never sought help but told someone"; and "Sought help." A smooth line graph showing domestic abuse complaints received over the last 21 years is one of the visualisations used here.

While this illustration emphasizes the amount, it does not effectively convey the comparison of these figures across time. I created a twin plot with years on the x-axis, % of complaints on the left y-axis, and total number of complaints on the right y-axis. download1

The total complaints reported in a State vs the number of complaints received per one-million women between January and May 2021 are being displayed using scatter plots. As we hover over the dots in scatter-plot we come to know which state it depicts.
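The per-million figure on one axis of the scatter plot is a simple normalisation of the raw count by the number of women in the state. A small sketch, with a hypothetical function name and placeholder numbers rather than the article's data:

```python
def complaints_per_million(complaints, women_population):
    """Complaints received per one million women."""
    if women_population <= 0:
        raise ValueError("women_population must be positive")
    return complaints * 1_000_000 / women_population


# Placeholder numbers for illustration only.
print(complaints_per_million(500, 25_000_000))
```

Normalising in this way lets a small state with few complaints be compared fairly against a large state with many.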

I tried to make it more understandable by using a line graph with the relevant states on the x-axis and the complaints registered in each state on the y-axis. download3 In the third and fourth tables (the table depicting silent victims and the table depicting the small share of women), important insights such as comparisons across multiple dimensions for different states are absent, because there is a heavy focus on exact numbers rather than on patterns or comparisons across dimensions. I attempted to represent the same data for each state using a bar plot of all three dimensions.

download2

download4

Apharna commented 2 years ago

Name: Apharna M L Article: Import-reliant India’s edible oil prices skyrocket, average costs near ₹200/litre (https://www.thehindu.com/data/data-import-reliant-indias-edible-oil-prices-skyrocket-average-costs-near-200litre/article65466538.ece?homepage=true)

This article by Jasmin Nihalani, dated May 27, 2022, dives into the soaring prices of edible oil in India and India's dependence on imports for consumption. The article is framed around the central government's decision to allow duty-free import of 20 lakh tonnes of crude soyabean and crude sunflower oils.

The prices of sunflower oil in retail markets have surged by ₹39 per litre in the past three months. In 2019-20, 56% of the domestic edible oil demand was met through imports. However, Russia’s invasion of Ukraine and Indonesia’s policy flip-flops have left India scrambling for alternatives. The local prices of edible oil are vulnerable to changes in international markets given India’s high reliance on imports.

The article uses 4 graphs to elucidate its 2 key points : The reliance on imports & soaring prices. The 4 graphics are:

Preferred Oil
Price Variation
% of edible oil demand
Indian import of Vegetable oils in 21-22

The key data sources used for all the graphs are FCA, MOSPI, and CMIE.

The key pain point with most of these graphics is that they sacrifice legibility for aesthetics.

The first graph on preferred oils in the article is listed below:

The key problem is the mismatch between the legend and the graph. For example, given that vanaspati is at the top of the legend, the first field on the graph would be expected to be vanaspati, but it is actually the field "others". Also, the actual comparison between rural and urban usage is not pronounced in this graphic. To overcome these issues, I've plotted the data as a combined bar chart:

The second graph on the price changes is a heat map as shown below:

While it aesthetically conveys the sentiment behind the increasing prices of oil over the years, the graphic is data-poor and needs a lot of hovering to reveal the price difference between the current date and the time period mentioned (not the price of the oil on the given date). I have replaced this visualization with a line graph of the oil prices on each date, arranged in descending order, to indicate both the increase in price over the years and the price at various points in time.

The third graph gives the % of edible oil demand per source:

Although this graph is clear, moving the year to the y-axis would greatly improve its readability.

With its shrinking blue area, this graph shows the reduction in the % of demand met by domestic sources. It directly demonstrates the reliance on imports and thus exposes the potential flaws in the policy contributing to the price rise shown in Graph 2.

The fourth and final graph is the chart showing countries India depends on for import:

The graph, with no title or axis information, is confusing and does not convey India's dependency easily. A better way to visualize this information would be a pie chart, from which we can easily see which countries India depends on; this can help us assess and predict oil prices based on this dependency.

LogeshKG commented 2 years ago

Name: Logesh Kumar G Article: Where does India stand on the global hunger index? Link: https://www.thehindu.com/data/where-does-india-stand-on-the-global-hunger-index/article37140124.ece?homepage=true Source: The Hindu on Oct 23, 2021

The Story: The article reports that India's rank in the Global Hunger Index has risen numerically (i.e., worsened) since 2016.

Description of the data and the visualizations: The article uses the Global Hunger Index data to depict the deterioration of India's position: its rank number has been climbing since 2016, and India is now among the 31 nations where hunger has been classified as "serious". The charts used to display the trends are tables, line graphs, and scatter charts.

Original Description

Four indicators were used to compute the score — share of the population that is undernourished, share of children under five who are wasted (low weight for height), share of of children under five who are stunted ((low height for age), and the under-five moratlity rate. Among these, while wasting has increased compared to 2012, stunting and mortality have reduced. The results of the National Family Health Survey-5 (2019-20) also showed that in the majority of the States for which data were released, stunting and wasting increased compared to the 2015-16 survey round.

I found it filled with spelling errors, and the information is difficult to interpret from the plain paragraph.

Re-organised Description

The four indicators used to compute the scores are:

Share of the population that is undernourished
Share of children under five who are wasted (low weight for height)
Share of children under five who are stunted (low height for age)
The under-five mortality rate

Wasting - Increased compared to 2012; also increased in the majority of States compared to the 2015-16 survey round.
Stunting - Reduced compared to 2012; increased in the majority of States compared to the 2015-16 survey round.
Under-five mortality rate - Reduced compared to 2012.

Representing the tabular data as a simple side-by-side bar chart makes it easier to visualize.

Issues with the visualizations

Visual 1:

Issues: While the tabular data gives the picture of the increasing rank (downward trend) of India, it's difficult to interpret the proportion of rank to the number of countries analyzed.

Improvements: Mentioning the "rank / total number of countries" (e.g. 101/116) would give a slightly better picture. An additional column showing where India's rank stands as a percentage of the countries ranked would give a clearer representation.
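As a rough illustration of the extra column suggested here, the sketch below formats a rank as "rank/total" and adds the share of countries that rank better. The function name `rank_summary` and the choice of "share of countries ranked better" as the percentage are my own assumptions, not something the article defines; the 101/116 example is the one mentioned above.

```python
def rank_summary(rank: int, total: int) -> str:
    """Format a rank as 'rank/total' plus the share of countries ranked better.

    Hypothetical helper: the article does not define this percentage.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    # Countries ranked strictly better than this one, as a share of all ranked countries.
    share_ahead = (rank - 1) / total * 100
    return f"{rank}/{total} ({share_ahead:.1f}% of countries rank better)"


print(rank_summary(101, 116))
```

A column like this makes ranks comparable across years even when the number of countries analyzed changes.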

Visual 2:

Explains the percentage of each of the four indicators observed in the total target population using a line graph.

Overlapping of graphs could've been avoided by reducing the thickness of the line graph.

Visual 3:
Visual 4:

Issues: In both visual 3 & visual 4, it is difficult to interpret the change in percentage that the vertical axis is trying to emphasize. Visual 4 has the wrong title of "wasted" for the horizontal axis instead of "stunting".

Improvements: Side-by-side bar charts for years 2012 & 2015-16 for each of the states will be a better representation.

sarthak78 commented 2 years ago

Name: Sarthak Gupta Topic: How many Indians eat meat Article Link: https://www.thehindu.com/data/data-how-many-indians-eat-meat/article65299234.ece?homepage=true The main subject of the article:

The article focuses on how many Indians consume meat. The data has been taken from the National Family Health Survey-5, which shows how meat is consumed in different states of India. It states that, in over half of the 30 states/UTs, more than 90% of the population consumed fish or chicken or meat daily, weekly or occasionally. It also highlights that in 25 of the states the percentage of meat consumers is more than 50%, while in some states the share was less than 20%.

The visualization used: The author has used a map to visualize the percentage of meat consumers in every state, with a color scale to encode it. The range has been divided into 5 increasing bins from smallest to largest. When you hover over a state with the cursor, it shows the exact share of meat consumers.

The bins start from green and progress through increasingly dark shades of red.

Issues with the visualizations:

The visualization only highlights the number of meat consumers based on state-wise consumption but doesn't go to more detailed levels.
Comparison between genders
area wise consumption within states

I have first visualized the meat consumption data according to gender: men and women

The pie charts are used to depict the share of different types of meat consumed by the non-vegetarians in India.

Men's consumption pattern

Women's consumption pattern

Meat consumption patterns can also be visualized by state. In this visualization, I have used split bars to show how meat consumption differs by category across states in India, and what share of the total population consumes meat.

state_level_pattern

From the above visualization, we can also analyze the type of meat consumption per state which gives better insights into the consumption patterns of the population

Apoorva2908 commented 2 years ago

Name - Apoorva S Shekhawat

Title of Article - India’s press freedom ranking slips to 150, its lowest ever Link to the original article - https://www.thehindu.com/data/data-the-worrying-state-of-press-freedom-in-india/article65384769.ece?homepage=true Data Used : World press freedom ranking index issued by Reporters Without Borders Tool Used : MS Excel

### Story the author is trying to tell

Through this article, the author discusses how India's press freedom index ranking is at its lowest ever: India's ranking for 2022 was 150, 8 positions lower than last year.

Visualization 1:

Issues with the visualization:

It's difficult to understand at first glance
Random color coding
Can't see all the countries in this

Redesigned Visualization

I used a gradient color coding in which the lighter shades mark the lower (better) ranks, yellow and darker yellow mark the middle range, shades of brown mark the worst-ranked countries, and black areas have no data points.
All the countries can be seen at once in this visualization.
Tooltips or hovercards provide details when hovering over a country.
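The gradient described in this list can be sketched as a simple binning rule. This is only an illustration: the map itself was built in MS Excel, and the function name `rank_to_shade`, the equal-thirds cut-offs and the total of 180 ranked countries in the example are assumptions of mine rather than the thresholds actually used.

```python
from typing import Optional


def rank_to_shade(rank: Optional[int], n_countries: int) -> str:
    """Map a press-freedom rank to a shade family (hypothetical thresholds).

    Lower rank numbers are better, so they get lighter shades.
    """
    if rank is None:
        return "black"  # no data point for this area
    position = rank / n_countries  # close to 0 = best, 1.0 = worst
    if position <= 1 / 3:
        return "light"   # better-ranked countries
    if position <= 2 / 3:
        return "yellow"  # middle of the range
    return "brown"       # worst-ranked countries


# Example with India's 2022 rank of 150; 180 countries is an assumed total.
print(rank_to_shade(150, 180))
```

Keeping "no data" as a separate, clearly different color avoids confusing missing countries with either end of the scale.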

Visualization 2:

Issues with the Visualization:

The labels over the bars are difficult to read.
It is not clear whether a higher rank is better or worse.
The color is the same for all the categories, so they cannot be told apart.

Redesigned Visualization:

Used different colors for different categories.
Arranged the ranking across the categories and added a best-to-worst indicator to show that the lower the rank, the better.
Categories are written below the axis and hence are easily readable.

Additional visualizations and data:

Data resource : https://en.unesco.org/themes/safety-journalists/observatory?field_journalists_date_killed_value%5Bmin%5D%5Byear%5D=2020&field_journalists_date_killed_value%5Bmax%5D%5Byear%5D=2020&field_journalists_gender_value_i18n=All&field_journalists_nationality_tid_i18n=All&field_journalists_local_value_i18n=All&field_journalists_status_value_i18n=All&field_journalists_type_of_media_tid_i18n=All&field_journalists_judicial_tid=All&field_unesco_region_value_i18n=All

From the above visualization I observed that India performed worst in the category 'Safety of Journalists', so I looked up how many journalists are killed every year in India and got the data for 1995-2022 (till present) from the UNESCO site. The data also gave the major causes of death of the journalists, and I plotted those as well.

Observations: The killings of journalists have increased recently, with 35 journalists killed in the last 8 years out of a total of 54 journalists killed since 1995. Out of the 54 journalists, 70% (38) were murdered.
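The percentages in these observations can be checked with a few lines of arithmetic. The counts (54 killed since 1995, 38 murdered, 35 killed in the last 8 years) come from the text above; the helper name `share` is just for illustration.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole * 100


total_killed = 54         # journalists killed since 1995
murdered = 38             # of those, killed by murder
killed_last_8_years = 35  # killed in the most recent 8 years

print(f"Murdered: {share(murdered, total_killed):.0f}% of all killings")
print(f"Last 8 years: {share(killed_last_8_years, total_killed):.0f}% of all killings")
```

Stating the second figure as a share as well (roughly two thirds of all killings in under a third of the period) makes the recent increase easier to see than the raw counts alone.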

BRupani commented 2 years ago

Name: Bhawna Rupani Topic: Wealth increase among recontesting candidates of Dravidian political parties Source: https://www.thehindu.com/data/data-wealth-increase-among-recontesting-candidates-of-dravidian-political-parties/article34551188.ece?homepage=true

Visual Story: In the last five years, the average wealth of re-contesting candidates from the AIADMK increased by 171% compared to 48% of those from the DMK. A comparison of the wealth details of re-contesting candidates of the two major parties in Tamil Nadu, DMK and AIADMK, in the Assembly election 2021 compared to 2016 shows that the average assets of candidates from the AIADMK recorded a much higher increase than those from the DMK (who also increased their assets over the same period). The calculations were made from candidates’ affidavits and data compiled by Association for Democratic Reforms (ADR).

Party-wise split

More than 90% of candidates who re-contested in the Tamil Nadu Assembly election from both the DMK and the AIADMK had assets of more than ₹1 crore. According to the affidavits filed, the average assets of re-contesting candidates from the DMK was ₹13.55 crore while that of the AIADMK was ₹8.34 crore.

Increase from 2016

In the last five years, the average wealth of re-contesting candidates from the AIADMK increased by 171% compared to 48% of those from the DMK. The average assets of 9% of the AIADMK candidates grew by more than 500% in this period, while that of 45% of them more than doubled.

SC/ST candidates

The average assets of the SC/ST candidates who re-contested from the AIADMK rose by 289% compared to the previous election while they increased by 55% for the five SC/ST candidates who contested again from the DMK.

Women candidates

The average assets of the women candidates who re-contested from the AIADMK rose by 250% compared to the previous election while they increased by 43% for those from the DMK.

Highest increases

The table lists the five re-contesting MLAs whose absolute wealth increased the highest between 2016 and 2021.

MY VISUALIZATION:

Chart : Bar Chart x axis: Political Parties y axis: Increase/Decrease in assets

Chart : Income wise distribution x axis: Political Party y axes: Total annual income + increase/decrease in assets

Horizontal stacked chart x axis: percentage increase/decrease in assets y axis: districts of political parties in Tamil Nadu color coding: political parties

Constituency-wise Horizontal chart x axis: percentage increase/decrease in assets y axis: constituencies of political parties in Tamil Nadu

District-wise Horizontal chart x axis: percentage increase/decrease in assets y axis: districts of political parties in Tamil Nadu

arnav-tlf commented 2 years ago

Name : Arnav Sharma Article : Only 8% of children in rural areas studied online regularly in August

The article highlights the pattern observed in children's study behaviour as it has evolved nearly 1.5 years into the Covid-19 pandemic, with schools closed and online learning picking up pace exponentially. The authors contrast children in rural areas with children living in urban areas to show the difference in each group's pace of adopting technology for online learning. It also hints at the availability of resources among urban and rural children.

The author uses tables to present the raw data but no visualisations at all to make a visual impact on the reader. The percentages mentioned don't add up to 100, which is confusing in itself.

Screenshot 2022-06-20 102700

I have used pie charts to separately show the study behaviour among urban children and rural children. I have rescaled the percentages to total 100 after merging redundant columns. The pie charts make it easier to visualize the share of activities.
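The rescaling step mentioned here can be sketched as follows. The category labels and values are placeholders, not the article's figures, and the function name `rescale_to_100` is hypothetical; the idea is simply to divide each share by the current total so that the slices of the pie sum to 100.

```python
def rescale_to_100(shares: dict[str, float]) -> dict[str, float]:
    """Rescale a set of percentages proportionally so that they sum to 100."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must have a positive total")
    return {label: value / total * 100 for label, value in shares.items()}


# Placeholder values that add up to 80 rather than 100.
raw = {"category A": 20, "category B": 30, "category C": 30}
for label, value in rescale_to_100(raw).items():
    print(f"{label}: {value:.1f}%")
```

Proportional rescaling keeps the ratios between categories intact, but the rescaled values no longer match the published percentages, so the chart should say that the shares were normalised.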

Screenshot 2022-06-20 102552

Screenshot 2022-06-20 102851

Again, the share of study behaviour is not graphically represented by the authors.

Screenshot 2022-06-20 102914

I have used bar charts to plot percentages which here make it easy to compare study behaviour since the plots for rural and urban children are adjacent.

Screenshot 2022-06-20 103313

rishi456187 commented 2 years ago

Smart cities of India

Original Article link – https://www.thehindu.com/data/what-is-the-status-of-smart-city-projects-in-india/article28441952.ece

Summary of the story – The idea of developing 100 Indian cities so that they can be called smart cities is yet to be realized; the author uses the data to show how little of the approved funds has been utilized.

Critique of the current viz.

The first visualization uses a pie chart and doesn’t display the information behind the data. The color coding is not explained, and neither the units nor the percentages are shown in the chart. It is not intuitive to understand.

Now, in the second graph, the chart is not an accurate measure. It can be done in a better way

A better visualization. A funnel chart is a better way to represent the stages of work, it shows that only 917 projects have been completed so far, while the majority of the projects are still under progress.

The clustered bar chart shows the amount that is spent during the different stages of the projects.

Lastly, a funnel chart displays the journey of the funds; the color coding shows how little of the approved funds has been utilized.

saivenkatrammallela commented 2 years ago

Article: https://www.thehindu.com/data/data-pandemic-impact-marked-decline-in-maths-science-scores-among-rural-scst-students/article65479807.ece?homepage=true

The article tries to understand the impact of the pandemic on the performance of students undergoing school education. It compares the examination results of 2018/2019 with those of 2021. A group-wise comparison was made to understand the most affected groups and the reason behind the decline.

Scope for a change in visualisation:

The article used percentages throughout except in the marks table used to compare changes among different states and subjects. However, comparing change among different entities would be easier with percentages. A bar graph like the ones used throughout would have done the job and preserved consistency, but it would have to be done with pop-up windows given the huge number of entities being represented. Hence, if the table contained a column with the change represented as a percentage, it would have been a little easier to comprehend. The dataset is not accessible in numerical form, so I couldn't implement the change mentioned.
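Since the dataset was not available in numerical form, the following is only a sketch of how such a percentage-change column could be computed once the scores are available. The function name `percent_change` and the example scores (250 and 240) are hypothetical and not taken from the article.

```python
def percent_change(old: float, new: float) -> float:
    """Change from old to new, expressed as a percentage of old."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100


# Hypothetical average scores for one state and subject.
score_before = 250.0
score_after = 240.0
print(f"Change: {percent_change(score_before, score_after):+.1f}%")
```

A signed percentage makes declines in subjects or states with different baseline scores directly comparable within the table.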

krishna151 commented 2 years ago

Name: Krishna Rajagopal Article: https://www.thehindu.com/data/data-wealth-increase-among-recontesting-candidates-of-dravidian-political-parties/article34551188.ece?homepage=true Data : https://adrindia.org/content/analysis-assets-comparison-re-contesting-mlas-tamil-nadu-assembly-election-2021

Story the author is trying to tell: A comparison of the wealth details of re-contesting candidates of the two major parties in Tamil Nadu, DMK and AIADMK, in the Assembly election 2021 compared to 2016.

Data Description: The calculations were made from candidates’ affidavits and data compiled by Association for Democratic Reforms (ADR).

Gap in the Data: Given that the story is trying to understand the behaviour of re-contesting candidates of Dravidian political parties, it would have been beneficial to look at data on candidates switching between parties and the effect on asset accumulation.

How is it encoded? All data points are encoded in Tables.

Visualization 1:

Party-wise split

More than 90% of candidates who re-contested in the Tamil Nadu Assembly election from both the DMK and the AIADMK had assets of more than ₹1 crore. According to the affidavits filed, the average assets of re-contesting candidates from the DMK was ₹13.55 crore while that of the AIADMK was ₹8.34 crore.

Visualization 2:

Increase from 2016

In the last five years, the average wealth of re-contesting candidates from the AIADMK increased by 171% compared to 48% of those from the DMK.

Visualization 3:

Highest increases

The table lists the five re-contesting MLAs whose absolute wealth increased the highest between 2016 and 2021.

Problems with it: Tables don't intuitively reflect the political parties being discussed.

Improvised Charts:

To make it more granular, I collected data from the ADR website, cleaned it, and replaced the tables with charts. I used colors associated with the respective political parties to make the data representation more relatable.

Improvement for Viz1: Party-wise split

Increase in the Asset allocated between 2016 and 2021

Improvement for Viz 2: Percentage Increase from 2016

Improvement for Viz 3: Top 10 Asset accumulators for the given period

Santoshsrini commented 2 years ago

Name: Santosh Srinivas

Article: https://www.thehindu.com/data/indians-can-travel-visa-free-to-only-58-countries/article37139121.ece?homepage=true

Story: The author is trying to tell how the strength of the Indian Passport has weakened over the decade by indicating the decrease in its ranking index. The index ranks normal passports of countries according to the number of destinations its holders can visit without a visa or can avail themselves of a visa, a visitor's permit, or an electronic travel authority on arrival.

Data visualization 1:

This data is represented in the form of a table which contains each country's name, its ranking index and the number of foreign countries that can be visited with VISA free access. The type of data is tabular with dimension of 3 columns.

data1

Alternate visualization technique:

To signify the difference in the number of foreign countries with visa-free access for different ranks, a bar graph could be more visually meaningful. One can easily notice the differences.

data2

Data Visualization 2:

This data is represented in the form of a line graph, as it shows India's ranking index over time. I feel this is the best way of representing the data. Only a few data points are purposefully highlighted.

data3

Data Visualization 3:

The data is trying to tell the story of the current rank in 2021 and how it has changed since 2011. The visualization technique used is a scatter plot in a 2D plane. One potential disadvantage is that it looks a bit clustered and only 7 of the countries have been highlighted.

data4

Data Visualization 4:

The chart plots the total number of visa-free destinations for a country in 2021 against the change in such destinations for the country from 2011. The visualization technique used is a scatter plot in a 2D plane. One potential disadvantage is similar to the previous one.

data5

subratsaxena commented 2 years ago

Name: Subrat Mahim Saxena Article: Quality of jobs will decline further as Agnipath scheme lacks social security benefits Link: https://www.thehindu.com/data/data-quality-of-jobs-will-decline-further-as-agnipath-scheme-lacks-social-security-benefits/article65553159.ece

This article focuses on the share of vulnerable jobs among the salaried/employed workers in India. The statement made by the article is "Given that a majority of regular wage/salaried employees in India already lack social security benefits, the lack of pension or gratuity benefits for people recruited under the Agnipath scheme may deteriorate the quality of jobs further."

To support this statement the article provides the following data in tabular form as follows:

Vulnerable Job Share:

Gender Divide:

The above tables show data in three categories for three time periods. As you may observe, the change across categories is hard to see from the numbers because the delta is very low. Therefore, this table does not serve the purpose of emphasising the statement.

I have recreated visualizations for these data points as follows:

State Wise Share:

This graph seems unstructured, as there are no proper legends for the user to follow while studying it. It is not clear what the circle, star or cross signifies. There are some statements from which you can infer which one is which, but these statements are unclear and need to be more specific. The data is not available on the PLFS website directly.

The writer concludes that there hasn't been much change in the share of vulnerable jobs over the period of 4 years, but the figure of ~60% is still very high and a job scheme like Agnipath will contribute to it further. We should also consider that an increase in the share of vulnerable jobs will affect people with business interests in India or investments in future. The story seems incomplete without a proper conclusion.

01supriya commented 2 years ago

Name : Supriya Kumar Mishra Topic : Agnipath: With 34 lakh military pensioners, pensions form >50% of defence budget Article : https://www.thehindu.com/data/data-agnipath-34-lakh-military-pensioners-pensions-form-50-defence-budget/article65545319.ece?homepage=true Tools Used : MS Excel

The author's attempt to tell a story.

The author wants to discuss how India spends the majority of its defense budget on pensions for servicemen while allocating less to research and development of new defense equipment.

Visualization 1: Share of defense budget

The chart depicts the defense pension and R&D budgets as a percentage of total defense revenue expenditure over time. Currently, less than 5% of revenue expenditure is allocated to R&D, while more than 50% is spent on pensions.

Visualization 2: Smaller capital expenditure

The graph depicts revenue and capital expenditures as a percentage of the total defense budget over time. Capital expenditure, which is used to purchase equipment, vehicles, and aircraft, is receiving a smaller proportion of the budget.

Visualization 3: Growing in number

The graph compares Union government pensioners across various departments from January 2014 (blue) to March 2021 (red).

Improvised Visualization.

Visualization 4: Share of defense budget - Pension vs R&D as an Expenditure

This visualization conveys an impression of how much budget is allocated to R&D and emphasizes it on the right scale, while the left scale represents how much budget is allocated to pension. The graph depicts the defense pension and research and development budgets as a percentage of total defense revenue expenditure over time.

Visualization 5: Pension Budget vs Budget for Defense equipment.

The graph depicts revenue and capital expenditures as a percentage of the total defense budget over time. Capital expenditure, which is used to buy equipment, vehicles, and aircraft, is getting a smaller share of the budget. This graph shows how much of the defense budget is allocated to each component out of a total of 100 percent. It clearly demonstrates how pensions consume a large portion of the budget and how little attention is paid to purchasing equipment, vehicles, aircraft, and so on. This alone demonstrates the need for a scheme like Agnipath, which would reduce long-term pension dependency.

Visualization 6: Growing number of active pensioners across 3 Departments

The graph compares the number of Union government pensioners (in lakhs) across departments from January 2014 (blue) to March 2021 (red). As we can see, the gap for defense pensioners is growing, with over 10 lakh pensioners added in the last seven years. If this trend continues, the government might well face a financial burden in the upcoming years.