bsc-iitm / Data-Visualization-Design-CS4001

Graded Assignment 4 (May Term 2024): Redesigning The Hindu Data Point Stories #31

Open Jimmi-Kr opened 1 month ago

Jimmi-Kr commented 1 month ago

For this assignment, we'll use data stories from The Hindu Data Point. Use what you have learned in Week 4 & Week 5 for doing this assignment.

Select a story that you like, study it carefully, and redesign it. Specifically, we want you to focus on understanding the data that powers the story, and how it is visually encoded to tell the intended story. Document your design process, capturing the following:

You may choose to expand or curtail the scope of the data used in the story or add an additional dataset to tell the story better. But do not deviate from the main intent of the original story. In other words, it is a redesign exercise, and hence I do not want you to tell a different, unrelated story.

While you should provide a link to the original story, it might be useful to capture and display inline, appropriate parts of the original visualization, and your own design iterations to produce coherent documentation.

For reference, take a look at what the previous batches (2019, 2020, 2021, 2022) did with this assignment.

sahilrajpal121 commented 1 month ago

---WIP--- Name - Sahil Rajpal Roll - 21f1006804

Original Article: Share of Women across Employment Sectors (link)

Summary: The recently released Annual Survey of Unincorporated Sector 2022-23 reveals that the share of women owners and workers in unincorporated enterprises is relatively high in the southern states of India. The unincorporated sector encompasses a variety of jobs, from street vending to tailoring and car repair, which require different levels of capital and skill. This sector includes individual-operated or self-employed enterprises involving unpaid family members or paid workers. It excludes agricultural establishments, registered companies, and public sector/government companies.

Key Insights:

Visual Representation: Screenshot (735)

The provided chart is a scatter plot with circles representing states, differentiated by regions through colors. It categorizes women workers into unpaid family members, informal/formal hired workers, and working owners in various sectors of unincorporated enterprises. The southern states are positioned towards the right, indicating a higher share of women in the workforce.

Visual Critique:

  1. Current Chart Analysis:

    • The chart is a scatter plot with different sections indicating the share of women across various job types in unincorporated enterprises.
    • Each circle represents a state, with color coding to differentiate regions.
  2. Issues with the Current Chart:

    • It can be challenging to identify and compare specific states effectively due to overlapping circles.
    • The absence of labeling can make the chart less intuitive.

Redesign Iterations

PS: All charts below are interactive. Tooltips provide further information about the dataset

Iteration-1

In this attempt, I tested out Strip Plot on a mock, comparatively smaller dataset. image

Iteration-2

Here, I created Treemaps, one for each employment type, sorted by Percentage share of women.

image

Iteration-3

Furthermore, I visualized the story in the form of a Heatmap. image

Final Iteration

Finally, I improved upon the last iteration and presented the story using a Split Bar Chart

image

Ashutosh-tec commented 1 month ago

Name: Ashutosh Kumar Barmwal Roll: 21f1001709

Documentation

Original Story Title: Economic Confidence Slips: RBI Survey Reveals Growing Pessimism Amid Inflation Concerns

Redesign Documentation

Story the Author is Trying to Tell

The author highlights a recent decline in economic confidence among urban households in India. The focus is on four key areas:

  1. General Economic Climate: Tracking changes in public perception over time.
  2. Employment Situation: Assessing optimism or pessimism about job prospects.
  3. Price Levels: Understanding public perception of inflation.
  4. Income Levels: Evaluating changes in perceived income levels.

The narrative emphasizes that after a steady recovery post-COVID, confidence in the economy has recently dipped.

Data Used to Tell the Story

Data Details

Essential vs. Irrelevant Data

Visual Encoding and Problems

Current Encoding

  1. image

  2. image

  3. image

Improvements Attempted

  1. Clearer Segmentation: Use distinct colors for each key aspect (economic climate, employment, price levels, income levels) to reduce clutter.
  2. Consistent Color Coding: Implement a consistent color scheme to differentiate between positive and negative perceptions.
  3. Annotated Highlights: Add annotations to highlight key turning points or significant changes.

Redesign Process

  1. General Economic Climate & Employment Situation

    • Original: Single line chart.
    • Redesign:
      • Use two lines: one for "Improved" and one for "Worsened."
      • Annotate key events (e.g., significant economic policies, global events).
      • Highlight the recent decline in confidence.
  2. Price Levels & Income Levels

    • Original: Combined line chart.
    • Redesign:
      • Use distinct lines for "Increased" and "Decreased" perceptions.
      • Consistent color scheme (e.g., blue for increased, red for decreased).
      • Annotations to explain the persistent high perception of increased prices and income changes.

Redesigning Charts: Trying to get the data.

Thank You,

neeraj-iit commented 1 month ago

Name: Neeraj Yadav Roll No: 21f1005729

Main Story: The article provides insights into the cities that have the highest share of students scoring 650 or above in the NEET UG 2024 exams. It highlights the significance of these scores for securing admissions in government medical colleges and identifies top cities and centers contributing to these high scores.

Data Used:

  • Type of Data: Quantitative data on student scores from the NEET UG 2024 exams.
  • Extent of Data: The dataset includes scores from all candidates who appeared for NEET UG 2024, focusing on those scoring 650 and above.
  • Dimensions of the Data: The data includes variables such as candidate scores, cities, states, and specific educational centers.
  • Gaps in the Data: The article does not provide detailed demographic information or historical comparison data.
  • Relevance: Essential data includes candidate scores and their respective cities and centers. Irrelevant data might include unrelated demographic details not covered in the story.

Current Visual Encoding:

Chart 1: A scatter chart displaying the percentage of students scoring above 650 marks across different cities.

Fig 1

Table 2: A table listing the top centers with the highest share of candidates scoring above 650 marks.

Fig 2

Problems with Current Encoding:

Scatter Chart:

  • Cluttered Data Points: The scatter chart is densely populated, making it difficult to distinguish individual data points.
  • Color Gradient: The color gradient from 0 to 7.48% might not be intuitive for quick interpretation.

Table:

  • Limited Information: The table lists the top ten centers but does not provide additional context or comparisons.
  • Lack of Visual Appeal: The table is plain and could benefit from visual enhancements for better readability.

Redesigning the Visualization

Improvement Plan:

  • Simplify and Clarify: Create clearer, more intuitive charts that highlight key insights without overwhelming the viewer.
  • Use Effective Visual Elements: Utilize bar charts, heat maps, and annotated visualizations to emphasize important data points.
  • Enhance Readability: Ensure all visualizations have clear labels, legends, and titles.

Redesigned Visualizations:

  • Bar Chart: Displaying the top cities with the highest share of students scoring above 650 marks.
  • Heat Map: Showing the concentration of high scores across different states.
  • Annotated Visuals: Highlighting the top-performing centers and cities.

Redesigned Bar Chart:

Figure_1
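
As a minimal sketch of the calculation behind a bar chart like this one, the share of high scorers per city can be computed and ranked before plotting. The `CityResult` type, the function names, and the sample rows are all hypothetical placeholders for illustration; they are not taken from the article or from the actual redesign.

```python
from typing import NamedTuple


class CityResult(NamedTuple):
    city: str
    candidates: int    # total candidates who appeared in the city
    high_scorers: int  # candidates scoring 650 or above


def share_of_high_scorers(row: CityResult) -> float:
    """Return the percentage of a city's candidates scoring 650 or above."""
    return 100.0 * row.high_scorers / row.candidates


def top_cities(rows: list[CityResult], n: int) -> list[tuple[str, float]]:
    """Rank cities by share (descending) and keep the top n for the bar chart."""
    ranked = sorted(rows, key=share_of_high_scorers, reverse=True)
    return [(r.city, round(share_of_high_scorers(r), 2)) for r in ranked[:n]]


# Placeholder rows, NOT real NEET figures.
sample = [
    CityResult("City A", 20_000, 1_000),
    CityResult("City B", 5_000, 300),
    CityResult("City C", 40_000, 800),
]
top_two = top_cities(sample, 2)
print(top_two)
```

Ranking by share rather than by raw count keeps small cities with a high concentration of top scorers visible, which is the comparison the original story is about.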

Redesigned Heat Map:

Figure_2

Documentation

Original Story:

Link to the original story: NEET UG 2024: Data reveals top cities for high-scoring candidates

Redesign Documentation:

  • Bar Chart: The bar chart simplifies the data by focusing on the top cities, making it easier to compare their performance.
  • Heat Map: The heat map provides a clear visual representation of high score concentrations across states.
  • Annotations and Highlights: Annotations emphasize key data points, such as the highest-performing city, to draw the viewer's attention.

These redesigned visualizations aim to improve the clarity and storytelling of the data, making it more accessible and easier to interpret for the audience.

45sajal commented 1 month ago

Name: Sajal Dhingra Roll: 21f2001213

STORY TAKEN FOR REVIEW

Title: A green wealth tax in Budget 2024

Story which publisher is trying to convey

The new government in India is set to present its Budget 2024, addressing critical issues of unemployment and inequality. A proposed solution is a wealth tax-financed Indian Green Deal (IGD) aimed at tackling climate change, inequality, and joblessness. Rising inequality and the carbon footprint of the wealthiest 10% in India have contributed to increased emissions, driven by their consumption of carbon-intensive goods.

The IGD would focus on green energy, infrastructure, and the care economy (health and education), modeled after the 2020 Atmanirbhar package. It proposes spending 10% of GDP over ten years: 5% on infrastructure, 3% on the care economy, and 2% on green energy. This investment could create 38.6 million jobs, representing 8.2% of the labor force.

Funding the IGD would require a wealth tax of approximately 1.7%, which could decrease to 1.3% by 2032 due to the projected rise in wealth of the Indian elite. This approach aims to showcase India as a leader in climate action while addressing socioeconomic disparities.
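
The figures quoted in this summary can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the variable names are mine, and the last step assumes the revenue to be raised stays constant over the period, which the summary does not state.

```python
# Figures quoted in the summary above.
allocation_pct_gdp = {"infrastructure": 5, "care economy": 3, "green energy": 2}
jobs_created_millions = 38.6
jobs_share_of_labour_force = 0.082
tax_rate_start, tax_rate_2032 = 0.017, 0.013

# The three components should add up to the stated 10% of GDP.
total_pct_gdp = sum(allocation_pct_gdp.values())

# 38.6 million jobs being 8.2% of the labour force implies this labour force size.
implied_labour_force_millions = jobs_created_millions / jobs_share_of_labour_force

# ASSUMPTION (not from the article): the revenue to be raised stays constant,
# so rate * wealth is fixed and wealth must grow by rate_start / rate_2032.
implied_wealth_growth = tax_rate_start / tax_rate_2032

print(total_pct_gdp)
print(round(implied_labour_force_millions, 1))
print(round(implied_wealth_growth, 2))
```

Under that constant-revenue assumption, a drop in the rate from 1.7% to 1.3% corresponds to elite wealth growing by roughly 30%, and the jobs figure implies a labour force of roughly 471 million.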

Data used in the article

Expenditure Data

  • Type: Quantitative data on spending patterns across different commodities by the Indian elite and average Indians.
  • Extent: Current spending patterns.
  • Dimensions: Ratios of expenses on various commodities, differentiated between the Indian elite and average Indians.
  • Essential: Yes, to show the consumption patterns driving carbon emissions and justify the wealth tax.

Carbon Emission Data:

  • Type: Quantitative data on per capita carbon footprints.
  • Extent: Comparative analysis of the top 10% of Indian population vs. an average Indian and a first-world citizen.
  • Dimensions: Carbon emissions by consumption categories such as housing, industrial goods, transport, and clothing.
  • Essential: Yes, to link wealth inequality with environmental impact and justify the green aspect of the IGD.

Projected Wealth and Tax Rate Data:

  • Type: Quantitative projections of wealth growth among the Indian elite and the corresponding declining wealth tax rate.
  • Extent: Projections up to 2032.
  • Dimensions: Wealth in million crores and tax rate percentages.
  • Essential: Yes, to support the feasibility and sustainability of financing the IGD through a wealth tax.

Current encoding visuals

This chart shows the carbon emissions of the top 10% of the Indian population vs. an average Indian and a first-world citizen.

  • Light blue: first-world citizen
  • Dark blue: Indian elite (top 10%)
  • Red: average Indian

img_01_dvd_04

This chart shows the ratio of the expenditure by an Indian elite to an average Indian img_02_dvd_04

This chart shows projected rise in wealth of Indian elite. img_03_dvd_04

This chart shows the projected decline in Tax Rate img_04_dvd_04

Problems with visual encoding

1) The colors used in the chart are similar and not easily distinguishable for all viewers, especially those with color vision deficiencies.

2) The axes are not labeled beyond the general title, which can make interpretation more difficult.

3) The axis labels are missing, which can make it difficult to interpret the chart.

4) The bar chart could benefit from a more visually appealing design with varying colors or patterns.

Redesigning the visuals

Informative Title and Axis Labels: Providing a clear and descriptive title along with appropriately labeled axes ("Expenditure (Million Crores)" for the y-axis and "Year" for the x-axis) enhances the viewer's understanding of what the chart represents.

Gridlines and Data Points: Adding gridlines and marking data points helps in better visualizing the trends and specific values, making the data easier to interpret and analyze. img_05_dvd_04

Enhanced Clarity and Distinction: The use of distinct colors (light blue for first world citizens, dark blue for elite Indians, and red for average Indians) clearly differentiates the data series, making it easier to interpret the trends and comparisons.

Detailed Annotations and Gridlines: Adding gridlines improves readability and precision, while annotations on data points provide immediate reference values, making the data more comprehensible at a glance. img_06_dvd_04

bhumikaxyz commented 1 month ago

About Me

Name: Bhumika Taneja Roll Number: 21f1006329

Original Article : Diseases with higher burden in Asia and Africa lack research funding

What is the author trying to convey with this story?

The author highlights the significant disparity in research funding and attention between neglected tropical diseases (NTDs) and more prominent diseases like COVID-19, HIV/AIDS, tuberculosis, and malaria. Despite the massive burden these diseases place on impoverished populations in tropical and subtropical regions, they receive substantially less funding and resources. This underfunding perpetuates a cycle of poverty and disease, causing long-term disabilities, social stigma, and economic burdens that hinder development and deter investment in treatments. The article underscores the urgent need for increased funding and attention to NTDs to break this cycle and alleviate the suffering of millions.

Key Points:

What data is he/she using to tell the story?

The following charts are included in the original Hindu article.

Plot 1: Research Funding by Disease in 2022 Screenshot 2024-07-26 224734

Type of Data:

  • Categorical Data: Different diseases.
  • Quantitative Data: Research funding amounts for each disease in 2022.

Extent of the Data:

  • Temporal Extent: Data for the year 2022.
  • Financial Extent: Funding amounts ranging from a few million dollars to over $4 billion.

Dimensions of the Data:

  • Diseases: List of diseases receiving research funding.
  • Funding Amount: The specific amount of research funding allocated to each disease.

Gaps in the Data: The data only shows funding for 2022 without historical comparison.

Essential Data:

  • Funding Amounts: Crucial for understanding the level of research investment in each disease.
  • Disease List: Important to identify which diseases are prioritized in funding.

Plot 2: Research funding for different health technologies from 2007 to 2022. Screenshot 2024-07-26 224823

Type of Data:

  • Funding Data: Research funding for different health technologies (vaccines, drugs, biologics, diagnostics & diagnostic platforms, basic research) over the period from 2007 to 2022.
  • Statistical Data: The amount of funding allocated to various technologies related to disease research.

Extent of the Data:

  • Temporal Extent: Covers research funding data from 2007 to 2022.
  • Financial Extent: Funding amounts ranging from $0 to $5 billion.

Dimensions of the Data:

  • Temporal Dimension: Yearly data points from 2007 to 2022.
  • Financial Dimension: Funding amounts in billions of dollars.
  • Technological Dimension: Different categories of research: vaccines, drugs, biologics, diagnostics & diagnostic platforms, basic research.

Gaps in the Data:

  • Lack of Disease-Specific Funding: The graph does not break down the funding data by specific diseases, making it difficult to see how much is allocated to NTDs versus other diseases.
  • Lack of Geographical Data: The graph does not show how the funding is distributed geographically, which could be relevant to understanding global research priorities.

Essential Data:

  • Funding Trends: The trend in funding over time for different technologies is crucial for understanding research priorities and shifts.
  • Comparison Across Technologies: Showing funding amounts for different research technologies helps highlight disparities and areas of focus.

I will be redesigning the second plot for the purpose of this assignment.

Data Encoding and Potential Improvements

Current Encoding

Problems with the Current Encoding

Suggested Improvements

Redesigned Plot

image

Plot Details

  1. Single Line Chart: All research funding data (Vaccines, Drugs, Biologics, Diagnostics, Basic Research) are plotted on a single line chart.
  2. Color Coding: Each type of research is represented by a distinct color.
  3. Legends: A legend is provided in the top-left corner to identify the lines.
  4. Annotations: An annotation highlights the funding surge for vaccines in response to COVID-19 in 2020.

Encoding Improvements

  1. Distinct Colors: The plot uses distinguishable colors for each research type, making it easier to differentiate between the lines.
  2. Annotations: Key events, such as the funding surge for vaccines in 2020, are annotated for better context.
  3. Clear Legends: Legends help in identifying which line corresponds to which type of research.

SURAJARS commented 1 month ago

Name: Suraj ARS Roll Number: 21f1005229

Original Article : On unemployment in Indian States

Story of article in view of author

The article provides an analysis of unemployment in major Indian states, excluding Union Territories, using data from the Periodic Labour Force Survey (PLFS) of 2022-23. It focuses on individuals aged 15 and above and highlights the disparities in unemployment rates across different states. Goa has the highest unemployment rate at almost 10%, followed by other relatively wealthy states like Kerala, Haryana, and Punjab. The analysis reveals that states with a higher proportion of self-employment have lower unemployment rates, and more urbanized states tend to have higher unemployment rates due to fewer informal job opportunities. The link between education and unemployment is also explored, showing that states with a higher percentage of educated individuals, such as graduates, tend to have higher unemployment rates, possibly due to a mismatch between skills and job requirements or because graduates aspire to high-wage jobs that are not available in sufficient numbers.

Key Findings

  1. Goa has the highest unemployment rate at almost 10%.
  2. Other relatively wealthy states like Kerala, Haryana, and Punjab also have high unemployment rates.
  3. States with a higher proportion of self-employment have lower unemployment rates.
  4. More urbanized states tend to have higher unemployment rates due to fewer informal job opportunities.
  5. A link between education and unemployment is observed: states with a higher percentage of educated individuals, such as graduates, tend to have higher unemployment rates.

Charts present in The Hindu article

Chart 1: Unemployment across Indian States Unemployment across Indian States

  • Type of Data: Quantitative data on unemployment rates across Indian states, 2022-23.
  • Extent of Data: The dataset compares unemployment rates across Indian states for 2022-23.
  • Dimensions of the Data: Indian states and their unemployment rates in percent.
  • Gaps in the Data: The article shows unemployment rates only for 2022-23, without historical comparison data.
  • Essential data: Need to know education funding, number of universities, and policy schemes across Indian states.

Chart 2: Self Employment vs Unemployment Self Employment vs Unemployment

  • Type of Data: Scatter plot data of self-employment vs unemployment.
  • Extent of Data: The dataset compares self-employment and unemployment in 2022-23.
  • Dimensions of the Data: Unemployment rate and share of self-employment.
  • Gaps in the Data: The article compares self-employment and unemployment only for 2022-23, without historical comparison data.
  • Essential data: Need to know the various divisions of self-employment across Indian states.

Current Encoding

  • Horizontal Bar Graph: Each state is represented by a bar showing its unemployment rate.
  • Scatter Plot: Compares self-employment and unemployment; each dot represents a state.

Problems with the Current Encoding

  • Horizontal Representation: The states are laid out horizontally, which makes them hard to recognize.
  • Color Variation: A single base color is used, darker for higher rates and lighter for lower rates, so differences may not be quickly recognizable.
  • Different Colors: The scatter plot would need a different color for each state.

Suggested Improvements

  • Vertical Representation: Lay out the states vertically, making them easier to recognize.
  • Variety of Gradient Colors: Use a color range that is easy to distinguish and friendly to viewers with color vision deficiencies.

Redesigned Plot

Major Indian States

Plot Details

  • Pie Chart: This representation conveys the unemployment rates of the top five states; the remaining states fall under an "Others" category.
  • Color Coding: Each Indian state is shown in a distinct color.
  • Legends: A legend is provided in the top-right corner to identify each section.
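
The "top five plus Others" grouping described above can be sketched as follows. The function name `top_n_with_others` and the sample values are hypothetical and only illustrate the grouping step, not the actual PLFS figures. Note that unemployment rates are not parts of a whole, so the summed "Others" slice here is a drawing convenience rather than a meaningful rate.

```python
def top_n_with_others(values: dict[str, float], n: int = 5) -> dict[str, float]:
    """Keep the n largest entries and sum the remainder into 'Others'."""
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    grouped = dict(ranked[:n])
    rest = ranked[n:]
    if rest:  # only add the extra slice when something was left over
        grouped["Others"] = sum(v for _, v in rest)
    return grouped


# Placeholder values, NOT the real state-wise unemployment rates.
sample_rates = {
    "State A": 9.5, "State B": 7.0, "State C": 6.5, "State D": 6.0,
    "State E": 5.5, "State F": 3.0, "State G": 1.5,
}
slices = top_n_with_others(sample_rates, n=5)
print(slices)
```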

trxpti commented 1 month ago

Assignment 4

Name - Tripti Arya Roll Number: 21f1005935

Link to the Original Article: Nepal’s treacherous skies : With 741 plane crash deaths, country ranks 11 of 207 nations

The Story Behind the chosen article: Authors Vignesh Radhakrishnan and Jasmin Nihalani highlight Nepal's high plane crash fatalities despite low air traffic over the country's airspace. Nepal's mountainous terrain and rapidly changing weather make its airports notoriously dangerous. The recent Saurya Airlines crash, killing 18 people, brings the total fatalities to 741, ranking Nepal 11th out of 207 nations in fatalities per departures. Since 1996, Nepal has experienced 54 crashes, ranking 33rd globally. Despite being 78th in departures, Nepal's high fatality rate aligns it with countries like Nigeria and Pakistan. The article urges authorities to address these safety concerns to prevent further loss of life.

Key Findings

  1. Nepal has relatively low air traffic but a high number of fatalities from plane crashes.
  2. A recent Saurya Airlines crash resulted in 18 deaths, underscoring the ongoing issue.
  3. Nepal has had a total of 741 plane crash fatalities, ranking 11th out of 207 nations in fatalities per total departures.
  4. The country has experienced 54 plane crashes since 1996, ranking 33rd globally.
  5. The high fatality rate is due to Nepal's mountainous terrain and rapidly changing weather conditions.
  6. Nepal is similar to countries like Nigeria and Pakistan, which have low air traffic but high fatality rates.

Provided charts for better understanding

Chart 1: Nepal and other countries in terms of plane crashes

image

Chart 2: Nepal and other countries in terms of Total fatalities due to Plane crashes.

image

Description for Chart 1 and 2:

  1. Each chart is a treemap representing hierarchical data, where each rectangle corresponds to a country.
  2. The data points show the number of plane crashes recorded so far in each country.
  3. The size of each rectangle indicates the relative number of plane crashes or fatalities. Nepal's rectangle is colored red to highlight its high fatality rate.
  4. In conclusion, the charts give a visual comparison of plane crash fatalities across different countries.

Chart 3: Fatalities in crashes against departures by air carriers in different countries

image

Description for the above chart: This chart is a scatter plot of continuous quantitative data, showing the relationship between the number of fatalities and the number of departures in various countries.

  1. The x-axis represents the number of departures, while the y-axis represents the number of fatalities.
  2. Each dot represents a country, with Nepal highlighted in red (741 fatalities and 637,307 departures).
  3. Both axes use a logarithmic scale to accommodate a wide range of values.
  4. The chart shows that despite lower numbers of departures, some countries, including Nepal, have high fatalities, indicating a disproportionate rate of plane crash deaths relative to air traffic.
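
To make the "disproportionate rate" concrete, the two Nepal figures quoted above can be turned into a rate and into positions on the logarithmic axes. This is a small sketch; the per-100,000 normalization and the function name are my own choices for illustration, not something the article specifies.

```python
import math

# Figures for Nepal quoted in the chart description above.
NEPAL_FATALITIES = 741
NEPAL_DEPARTURES = 637_307


def fatalities_per_100k_departures(fatalities: int, departures: int) -> float:
    """Normalize fatalities by air traffic so countries of different size compare."""
    return 100_000 * fatalities / departures


nepal_rate = fatalities_per_100k_departures(NEPAL_FATALITIES, NEPAL_DEPARTURES)

# Where the Nepal dot sits when both axes use a base-10 logarithmic scale.
nepal_log_x = math.log10(NEPAL_DEPARTURES)
nepal_log_y = math.log10(NEPAL_FATALITIES)

print(round(nepal_rate, 1), round(nepal_log_x, 2), round(nepal_log_y, 2))
```

This works out to roughly 116 fatalities per 100,000 departures. On log axes, countries with the same rate fall on a straight diagonal line, which is why a dot sitting high relative to its horizontal position signals a high fatality rate.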

Chart 4: Number of air crashes and fatalities of each airline in Nepal

WhatsApp Image 2024-07-27 at 4 20 43 PM

Description for the above chart: This chart is a bar graph of quantitative and categorical data, representing the number of plane crashes and fatalities across various airlines in Nepal.

  1. The y-axis lists different Nepalese airlines.
  2. Each bar is divided into two segments: red representing the number of crashes and blue representing the number of fatalities.
  3. The chart allows for a visual comparison of crashes and fatalities among different airlines.
  4. Yeti Airlines and Tara Air have notably high numbers of crashes and fatalities compared to other airlines, indicating significant safety issues.

Problems and Improvements

  1. Treemap Issues:

    • Clutter: Small rectangles make it hard to compare lower values.
    • Detail Loss: Small countries, or those with fewer crashes/fatalities, might be invisible due to the color encoding.
  2. Scatter Plot Issues:

    • Color Encoding: All points other than the highlighted one share similar colors, making them hard to tell apart.
    • Overlapping Dots: Similar values lead to indistinguishable data points.
    • Logarithmic Scale: Can confuse viewers who are unfamiliar with it at first glance.
  3. Bar Graph Issues:

    • Segment Confusion: Red and blue segments may lack sufficient color contrast.
    • Comparison Difficulty: Segmented bars complicate value comparison.

Improvements:

  1. Treemap: Add a color gradient and tooltips for better distinction and visibility.
  2. Scatter Plot: Apply transparency and interactive elements to reduce overlap and enhance exploration.
  3. Bar Graph: Enhance color contrast between segments, sort airlines by fatalities, and add labels and annotations for clarity.

Redesign for given charts

1. Replacing the Treemap (showing fatalities due to plane crashes) with an Ordered Bar Chart image Instead of using a treemap, we can use an ordered bar chart to present the underlying data more effectively. This chart displays only the top 25 countries with the highest number of fatalities in plane crashes over the past few years. The ordered bar chart clearly shows the rankings and removes less significant data points from the visual, making it easier to interpret. The enhanced color gradient of the bars also makes the visual more eye-catching.

2. Bar graph showing fatalities relative to plane crashes image In my view, this bar graph is a better choice than the one in the article because it clearly illustrates the ratio of fatalities to airline crashes in Nepal's airspace over the past few years. The graph gives a clear and concise message, making it easier to understand the extent of the fatalities in relation to the crashes.
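
A minimal sketch of the ratio this bar graph is built on, assuming each airline comes with a crash count and a fatality count. The airline names, numbers, and the helper `fatalities_per_crash` are placeholders for illustration, not the article's data.

```python
def fatalities_per_crash(records: dict[str, tuple[int, int]]) -> list[tuple[str, float]]:
    """Map airline -> (crashes, fatalities) to a list sorted by fatalities per crash.

    Airlines with zero recorded crashes are skipped to avoid dividing by zero.
    """
    ratios = [
        (airline, fatalities / crashes)
        for airline, (crashes, fatalities) in records.items()
        if crashes > 0
    ]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


# Placeholder records, NOT real figures for Nepalese airlines.
sample_records = {
    "Airline A": (4, 60),
    "Airline B": (2, 50),
    "Airline C": (0, 0),
}
ranked = fatalities_per_crash(sample_records)
print(ranked)
```

Sorting by the ratio rather than by raw crash counts puts the airlines with the deadliest crashes first, which matches the intent of comparing fatalities in relation to crashes.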

Conclusion: The above article is crucial for raising awareness among authorities and stakeholders about Nepal's air accidents. Equally important, however, is presenting data visuals that stand out, create a more significant impact, and help drive improvements in aviation safety. With the help of effective data visualization, we can highlight critical issues and trends, encouraging action to enhance safety measures and prevent future tragedies.

Kirupa-Krishan commented 1 month ago

Name: Kirupa Krishan G

Roll_No: 21f1006352

Story Overview:

The article discusses the significant increase in the cost of a home-cooked vegetarian meal (thali) in Maharashtra over the last five years, compared with the relatively modest rise in the earnings of salaried and daily-wage labourers. The key point is the growing disparity between food costs and income, highlighting the strain on households, especially those with daily wages.

Source : Link

Data Used:

Type of Data:

Extent and Dimensions:

Gaps in the Data:

Data Details:

Data Encoding:

Problems and Improvements:

Problems:

Improvements:

Chart 1:

table1

Chart 2:

table2

Chart 3:

table3

Chart 4:

table4

Redesigned Charts:

Visualization of Price and Wage Changes(Interactive chart)

View the interactive chart on Flourish

Redesign Chart 1:

table_re1

Redesign Chart 2:

table_re2

Redesign Chart 3:

table_re3

Description of the Improved Visualizations:

  1. Scatter Plot: Cost of Commodities for 2 Thalis

    • Title: The table lists the commodities required to prepare two thalis and their retail prices in ₹.
    • Y-Axis Title: Item
    • X-Axis Title: Cost/Kg in Rupees
    • Legends:
      • Blue: 5 years ago
      • Purple: 1 year ago
      • Pink: March 2024
  2. Scatter Plot: Percentage Increase in Commodity Prices

    • Title: The table lists the commodities required to prepare two thalis and their retail prices in ₹.
    • Y-Axis Title: Item
    • X-Axis Title: Cost/Kg in Rupees
    • Legends:
      • Yellow: Increase from 5 years to 1 year
      • Green: Increase from 1 year to March 2024
  3. Line Graph: Average Monthly Salary/Wage vs. Cost of 2 Thali Every Day Per Month

    • Title: Average Monthly Salary/Wage vs Cost of 2 Thali Every Day Per Month
    • Y-Axis Title: Earnings/Cost in Rupees
    • X-Axis Title: Time Period (5 years ago, 1 year ago, As on March 2024)
    • Legends:
      • Blue Line: Average salary earnings of a person during the preceding calendar month from regular wage/salaried employment
      • Red Line: Average wage earnings of a person per month from casual labor
      • Purple Line: Cost of making two thalis every day for a month
    • Annotations: Highlighted points showing the percentage of income allocated for food costs.
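
The two quantities these charts rely on, the percentage increase between periods and the share of income spent on food, can be sketched as below. The function names and all amounts are hypothetical placeholders; they are not the Maharashtra figures from the article.

```python
def pct_increase(old: float, new: float) -> float:
    """Percentage change from an earlier value to a later one."""
    return 100.0 * (new - old) / old


def food_share_of_income(monthly_food_cost: float, monthly_income: float) -> float:
    """Percentage of monthly income needed to cover the monthly food cost."""
    return 100.0 * monthly_food_cost / monthly_income


# Placeholder amounts in rupees, NOT the figures from the article.
cost_5y_ago, cost_now = 2_000.0, 3_000.0   # two thalis a day for a month
wage_5y_ago, wage_now = 8_000.0, 10_000.0  # monthly casual-labour earnings

cost_rise = pct_increase(cost_5y_ago, cost_now)
wage_rise = pct_increase(wage_5y_ago, wage_now)
share_then = food_share_of_income(cost_5y_ago, wage_5y_ago)
share_now = food_share_of_income(cost_now, wage_now)

print(cost_rise, wage_rise, share_then, share_now)
```

When the cost rises faster than the wage, the food share of income goes up, which is the gap the annotations on the line graph are meant to highlight.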

Redesign Process:

To redesign the visualization, I first identified the key data points: the changes in the cost of a vegetarian thali and average wages over five years. I then created line graphs to depict these trends, making the changes more visually accessible. To highlight the percentage increases in commodity prices, I used bar graphs, which facilitated easier comparison of relative changes. Annotations and labels were added for clarity, providing immediate context. Consistent formatting, including units, legends, and labels, was maintained across all visual elements to ensure readability and enhance overall comprehension.

prashantjnvu commented 1 month ago

Name: Prashant Sharma Roll Number: 21F1004586

Title: Which topics are India’s researchers publishing papers on?

Data Source: https://www.thehindu.com/data/which-topics-are-indias-researchers-publishing-papers-on/article68410121.ece

1: Story the Author is Trying to Tell:

The author analyzes and compares the research focus of scientists from different countries, particularly India, the US and China, over the last 20 years and the last five years. The story highlights the dominant research focus areas, such as Health, AI, Clean/Green Energy, Astronomy, Networks and Communication, Social Wellbeing and Nanotechnology, and discusses how these focus areas reflect the scientific and technological priorities of the countries involved. The analysis aims to show trends and provide insights into how different nations allocate their research resources and how these decisions align with global scientific challenges and opportunities.

2: Data Used to Tell the Story: Type of Data:

  • Given the intent of the story, the data lacks geographical extent (a larger number of countries) and segmentation by years (multiple 5-year segments to understand long-term and short-term research commitments).
    Essential Data:

3: Encoding and Problems: Encoding:

Existing Graphs: Chart 1 | The chart ranks the five topics under which the highest number of papers were published (2019- 2023) in select nations.

Last_5

Chart 2 | The chart ranks the five topics under which the highest number of papers were published (2004- 2023) in select nations.

Last_10

Graph Additions:

To make the intended story more impactful and easier to understand from a comparison point of view:

(1) Visualize the comparison of the overall research output of countries in the last 5 years and the last 2 decades per category. This visualization will help to critically compare:

  • Which country has contributed more, historically and recently? (India contributes the least among the US, China and India)
  • How does the overall research paper contribution of the last 5 years compare with the contribution of the last 2 decades? (China is producing research papers much faster than its previous rate)

1

(2) Visualize country-wise contributions to research output per research category for the top 5 research areas in the last 5 years and the last 2 decades. This visualization will help to critically compare:

  • Is a particular country’s focus narrow or diversified? (In the last 5 years it is diversified for India)
  • Comparative (country-wise) research output per country and category.

2

(3) Visualize the comparison of overall research output per top 5 research area in the last 5 years and the last 2 decades. This visualization will help to critically compare:

  • Which research topics carried over from the years before 2019, and how do they compare with the last 5 years of publications, to gauge any change in focus? AI, Clean/Green Energy, Health (overall focus areas have diversified in the last 5 years) and Nanotechnology (overall focus on nanotechnology has narrowed in the last 5 years)
  • Which research topics lost focus? Astronomy, Networks and Communication, and Social wellbeing.

3

(4) Visualize the qualitative comparison of overall research publications per category per country in the last 5 years and the last 2 decades as a heatmap. This visualization will help to critically compare:

  • Which country is publishing at what scale per research category? (Clean/Green energy was a research focus of China over the last two decades but has recently received less focus; Health has always been a focus area for the US)
  • Which research areas have fallen out of favor recently or have come into focus recently? (Over the last 2 decades AI was not among the top 5 research focus areas, but in the last 5 years it is for the US and India.)
  • How do the countries stack up in terms of publishing count (qualitatively) for the common top 5 research areas? (India’s publishing count is the lowest among India, the US and China in Health, AI, Clean/Green energy and Nanotechnology)

4

(5) Visualize the publication contributions as % contribution per research category for the top 5 research areas in the last 2 decades and in the last 5 years. This visualization will help to critically compare:

  • Which research area is favored overall? (AI's share of publications in the last 5 years outshines its share in the 2-decade count; Health research papers have been published at a higher rate; Nanotechnology publications have been rather low.)
  • Which research areas have fallen out of favor recently? (In the last 5 years, Astronomy, Networks and Communication, and Social Wellbeing have not been favored in publication count)
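The % contribution comparison described in (5) can be sketched in a few lines of Python. This is only an illustration: the topic names and paper counts below are hypothetical placeholders, not figures from the article, and `percentage_shares` is a helper name introduced here.

```python
def percentage_shares(counts):
    """Convert raw paper counts per topic into percentage shares of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one topic must have a non-zero count")
    return {topic: 100 * n / total for topic, n in counts.items()}


# Hypothetical counts, for illustration only (NOT taken from the article).
last_20_years = {"Topic A": 500, "Topic B": 300, "Topic C": 200}
last_5_years = {"Topic A": 100, "Topic B": 150, "Topic C": 250}

shares_20 = percentage_shares(last_20_years)
shares_5 = percentage_shares(last_5_years)

# Shift in share, in percentage points: positive means the topic gained focus recently.
shift = {topic: shares_5[topic] - shares_20[topic] for topic in shares_20}

for topic, delta in shift.items():
    print(f"{topic}: {shares_20[topic]:.1f}% -> {shares_5[topic]:.1f}% ({delta:+.1f} pp)")
```

Comparing shares rather than raw counts keeps the 5-year and 20-year periods on the same scale, even though the longer period naturally contains more papers.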

5

irshad747 commented 1 month ago

Name : Irshad Sareshwala Roll Number : 21f1004835

Original Story: 2024 polls: How people in high and low income areas voted in Chennai’s Mylapore, T Nagar and other areas (https://www.thehindu.com/data/2024-polls-how-people-in-high-and-low-income-areas-voted-in-chennais-mylapore-t-nagar-and-other-areas/article68427083.ece)

  1. What is the story the author is trying to tell?

The original story aims to analyze the voting patterns in Chennai's Mylapore, T Nagar, and other areas based on income levels. It shows that the DMK has a stronghold among urban poor voters, while the BJP has better support among wealthier voters.

  2. What data is being used to tell the story?

The data used in the story includes:

  • Polling station data listing areas and polling stations.
  • "Form-20" data from the Election Commission showing party-wise votes polled in each polling station.
  • Guideline values of streets/areas as a proxy for wealth/income.

Details of the data:

  • Type of data: Quantitative (votes, guideline values)
  • Extent of the data: Data from the 2024 Lok Sabha elections for Chennai's three Lok Sabha seats.
  • Dimensions of the data: Vote shares by party, guideline values by street/area.
  • Gaps in the data: Potential inaccuracies in using guideline values as a sole indicator of income.
  • Essential data: Vote shares, polling station details, guideline values.
  • Irrelevant data: Additional demographic details not directly related to voting patterns.

  3. How is it encoded, what problems are with it, and how have you attempted to improve it?

Original Encoding:

The original visualization uses scatter plots with red and blue dots representing DMK and BJP vote shares, respectively, across different streets/areas. Streets/areas are arranged based on their guideline values from high to low.

Problems:

  • The scatter plot may not clearly show the relationship between income and voting patterns.
  • The use of only two colors may not be sufficient to differentiate between multiple data points in a small area.
  • Lack of interactivity to explore specific data points in detail.

Improvements:

  • Use a more intuitive visualization method, such as a bar chart or heatmap, to show the correlation between income levels and vote shares.
  • Add interactivity to the visualization to allow users to hover over data points for more details.
  • Incorporate additional datasets, such as demographic information, to provide more context to the voting patterns.

Redesigning the visualization: Screenshot (455)

Final Visualization: The redesigned visualization uses a dual-axis bar chart to show vote shares by area and guideline values. It includes interactive elements to provide detailed information about each data point.

Conclusion: The redesigned visualization effectively conveys the relationship between income levels and voting patterns in Chennai, enhancing clarity and interactivity compared to the original scatter plot.

21f1006304ds commented 1 month ago

Name : Rajesh Saha Roll No. 21F1006304

Subject : A green wealth tax in Budget 2024

Article Link :

A green wealth tax in Budget 2024

Author's story:

The author is proposing a green wealth tax for Indian Elite (top 10% of Indian Population in terms of CO2 emission). The author has also shown that this tax would decline over time, still the target would be achieved. To support this proposal, the author has shown 3 cases - 1) CO2 emission by Indian Elites comparing with developed countries and Indian average population, 2) How much money India needs to tackle IGD (Indian Green Deal), 3) How this can be achieved.

Data used by the author:

The author has used numerical data (in USD), categorical data (Elite Indians, Average Indians), Ratio (CO2 emission). The source of the data was not mentioned in the article.

Data Collection for redesign:

The data was collected from the article linked above and its visualizations, manually, by hovering the mouse over the individual data points.

Redesign attempt

For the redesign, each of the above visualizations is revisited and then modified using the same data.

1. CO2 emission of Indian Elites comparing with developed countries and Indian average population

The author has divided this into 2 parts - A) Comparison of CO2 emission among developed countries, Indian elites and average Indians, B) The sector-wise comparison of CO2 emission between Indian elites and the average Indian population.

1A. Comparison of CO2 emission among developed countries, Indian elites and average Indians The author has used a line chart, as below. image

In the above, the story comes through clearly: the Indian elite is catching up with the emissions of developed countries, and its emissions are higher than those of the average Indian. So I have redesigned it along the same lines. I have used a line chart with more grid lines and properly placed legends, and added a selectable search box so that anyone who wants to hide one or more lines and see only a few can do so. I have also chosen the colors deliberately: red for the Indian elite's CO2 emission, which is what we need to tackle; green for the average Indian's emission, which is not increasing at a high rate; and blue for the developed countries' emission, which is declining (e.g., blue is the standard color for Electric Vehicle logos). image

1B. The sector-wise comparison of CO2 emission between Indian elite and average Indian population Here, the author has shown the ratio of CO2 emission by average Indian and Indian elites. image

There are 2 problems here - 1) For the ratios, the author has chosen a different denominator for each sector, 2) the relative comparison is not clearly shown. As a result, the sector-wise comparison does not come out clearly (e.g., it looks like Housing has the maximum contribution), and neither does the comparison between the average Indian and the elite.

I have redesigned this as a column chart after normalizing the base to 1, i.e., every ratio is shown as 1:x, where 1 is the emission by the average Indian and x is the emission by the elite Indian.
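The normalization to a common base of 1 can be sketched as below. The sector names and ratio values are hypothetical placeholders chosen only to show ratios with different denominators; they are not the article's figures, and `normalize_to_base_one` is a helper name introduced for this illustration.

```python
def normalize_to_base_one(average, elite):
    """Rewrite a ratio average:elite as 1:x and return x."""
    if average <= 0:
        raise ValueError("the average-Indian side of the ratio must be positive")
    return elite / average


# Hypothetical ratios (average : elite) with different denominators per sector.
original_ratios = {
    "Sector A": (2.0, 9.0),
    "Sector B": (1.0, 7.0),
    "Sector C": (3.0, 30.0),
}

normalized = {
    sector: normalize_to_base_one(average, elite)
    for sector, (average, elite) in original_ratios.items()
}

# With a shared base of 1, the sector with the largest x is directly comparable.
highest = max(normalized, key=normalized.get)

for sector, x in normalized.items():
    print(f"{sector}: 1:{x:g}")
```

Once every ratio shares the same denominator, the heights of the columns can be compared across sectors without reading the labels.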

image

This redesigned chart shows that "Health and Education" is the sector where elite Indians' emission is highest relative to that of the average Indian.

2. How much money India needs for IGD

The author has presented this in 2 different donut charts showing the expected investment money and the employment created. The problems with this design are: 1) one has to refer to 2 different charts to correlate them, that is, no correlation is apparent from the charts themselves, 2) the colors could have been chosen better.

image

image

To redesign this, I have chosen a bubble chart, where the investment money is shown on the X-axis and the jobs/employment created on the Y-axis. Along with that, more relevant colors have been chosen, and the size of each circle gives an idea of the employment created.

image

3. How can this be achieved

The author has projected the expenditure by Indian elites between 2023 and 2032 and shown it in a line chart. In another line chart, the author has shown a declining rate of the proposed wealth tax for the same period. Here also, the author has used 2 different charts, and to correlate them one needs to compare the charts manually. Also, this does not show the relation with the money collected as tax.

image

image

To redesign this, I have created a bubble chart with varying size and color of dots. image

This single chart shows 1) the declining tax rates from 2023 to 2032, 2) the increasing expenditure by Indian elites, through the increasing bubble sizes, 3) the increasing money collected, through the color of the dots (red to green).

The popup at each data point will show the year, tax-rate, expenditure, money collection.

So, with this proposed concluding chart we can show that the wealth tax is justified and that it can be reduced over the years. Also, despite the reduced tax rate, the money collected will increase, as the expenditure by the Indian elite will also increase.

Note:

The visualizations were created using Flourish. They are available at the following links.

https://public.flourish.studio/visualisation/18879865/ https://public.flourish.studio/visualisation/18883275/ https://public.flourish.studio/visualisation/18880429/ https://public.flourish.studio/visualisation/18882520/

Arvind-Gunasekaran commented 1 month ago

Name: Arvind Gunasekaran Roll No: 21f1001014 Email: 21f1001014@ds.study.iitm.ac.in

Assignment 4 - REDESIGNING DATA STORY

ARTICLE: India no longer has more losses than wins in Test cricket

(https://www.thehindu.com/data/india-no-longer-has-more-losses-than-wins-in-test-cricket-data/article67945758.ece)

I. Story the Author is Trying to Tell The author illustrates the evolution and improvement of the Indian cricket team's performance in Test matches over time. The main points include:

  1. Overall Performance Improvement: India's Test match performance has significantly improved, especially in the last few decades, making them a competitive team both at home and away.

  2. Home vs. Away Performance: India has historically been stronger at home, dramatically improving their away performance in the 2000s.

  3. Key Milestones: Specific periods marked significant changes in India's performance, such as the dominance at home in the 1990s and the improved away performance in the 2000s and beyond.

  4. Current Standing: India’s current win-loss ratio is 1.00, marking a balance between wins and losses in Test cricket.
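As a small illustration of the win-loss ratio mentioned in the last point, the sketch below computes it from a list of match results. The result codes ("W", "L", "D") and the sample list are assumptions made for this example, not data from the article.

```python
def win_loss_ratio(results):
    """Return wins divided by losses; draws are ignored."""
    wins = results.count("W")
    losses = results.count("L")
    if losses == 0:
        raise ValueError("ratio is undefined when there are no losses")
    return wins / losses


# Hypothetical sequence of results, for illustration only.
sample_results = ["W", "L", "D", "W", "L"]
ratio = win_loss_ratio(sample_results)
print(f"Win-loss ratio: {ratio:.2f}")
```

A ratio of 1.00 means the number of wins equals the number of losses, which is the balance the story's headline refers to.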

II. Data Used to Tell the Story

  1. Win-Loss Ratios of Different Teams The data on win-loss ratios of different teams is fundamentally comparative, offering a statistical look at how various international teams have fared in Test match cricket. image
  2. Cumulative Wins, Losses, Draws for India This dataset tracks India’s cumulative wins, losses, and draws over a substantial timeline, from the nation's first Test match in 1932 up to 2024. image
  3. Rolling Averages of Win/Loss/Draw Percentages The rolling averages approach uses sets of 83 Tests to smooth out short-term fluctuations and reveal longer-term trends in win, loss, and draw percentages. image
  4. Home vs. Away Performance This comparative dataset presents a detailed look at India's Test cricket performance both at home and away over different decades, spanning from the pre-1990s era to the 2020s.
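The rolling-window idea behind the win/loss/draw percentages can be sketched as follows. The window of 83 Tests comes from the description above; the result codes, the function name `rolling_percentages` and the short sample sequence are assumptions made for illustration, and the article's exact computation may differ.

```python
from collections import deque


def rolling_percentages(results, window=83):
    """For each full window of consecutive matches, return the % of W, L and D."""
    counts = {"W": 0, "L": 0, "D": 0}
    recent = deque()
    output = []
    for result in results:
        recent.append(result)
        counts[result] += 1
        if len(recent) > window:
            # Drop the oldest match so the window keeps a fixed size.
            counts[recent.popleft()] -= 1
        if len(recent) == window:
            output.append({k: 100 * v / window for k, v in counts.items()})
    return output


# Hypothetical short sequence with a small window, just to show the mechanics.
sample = ["W", "W", "L", "D", "W", "L"]
trend = rolling_percentages(sample, window=3)
for i, row in enumerate(trend):
    print(i, {k: round(v, 1) for k, v in row.items()})
```

A larger window smooths out short streaks at the cost of reacting more slowly to real changes in form.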

Conclusion: These in-depth observations, analyses and improvements contribute to a comprehensive redesign of the data story on the performance of The Indian Cricket Team in Test Cricket through history.

Thanks, Arvind Gunasekaran 21f1001014

srinivesh commented 1 month ago

Name: S R Srinivasan Roll No: 21f1002966 Email: 21f1002966@ds.study.iitm.ac.in

By-polls: an indication of a new anti-incumbency

Original article: https://www.thehindu.com/data/by-polls-an-indication-of-a-new-anti-incumbency/article68413953.ece

What is the story the author is trying to tell?

  • Soon after the results of the Lok Sabha (LS) elections were announced, bypolls to 13 Assembly Seats (AS) were held across seven states
  • In 12 of the 13 seats, the contest was primarily between the NDA and I.N.D.I.A blocks; in the seat in Bihar, an independent candidate won
  • The author takes the position that the change in the vote share between the two elections is an indication of a new anti-incumbency
  • He uses a simple analysis to derive this data

It is said that in politics and economics, it is possible to write the conclusions first and then use the data to support them. My analysis provides a critique of the data story from this context.

What is the need for the analysis?

  • The diversity of India poses a challenge in conducting quantitative and qualitative analysis of people’s perception of the government's policies
  • There are frequent elections, almost one every 6 months. The results of the elections are often used as proxy data

Understanding the data that powers the story

What data he/she is using to tell the story? Describe its details -- type of data, extent of the data, dimensions of the data, gaps in the data, what data is essential and what is irrelevant.

Type of Data:

Constituency-wise data published by ECI

Extent of Data:

The data spans a few months in 2024, covering LS elections and AS bye-elections. The geographical extent is poor, since only 13 AS seats are considered.

Dimensions of the Data:

Each basic row represents the performance of an alliance in a constituency. Most of the features are categorical, with the vote share features being numeric.

Encoding of the Data:

Due to the limited number of features, the data is presented in the tabular form, without any visual encoding.

Essential Data:

The essential data includes the State (for easier reading), Name of AS constituency, Party (to distinguish between alliance constituents), Vote share in the two elections being considered. The derived data is the change in vote share percentage between the two elections. With the table itself being small, no data is irrelevant.
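The derived column described above, the change in vote share between the two elections, can be sketched as below. The record layout, the field names and the sample rows are hypothetical and are not taken from the article's tables.

```python
from dataclasses import dataclass


@dataclass
class SeatResult:
    state: str
    constituency: str
    party: str
    ls_share: float      # vote share (%) in the LS election
    bypoll_share: float  # vote share (%) in the AS bypoll

    @property
    def change(self):
        """Change in vote share, in percentage points (bypoll minus LS)."""
        return round(self.bypoll_share - self.ls_share, 2)


# Hypothetical rows, for illustration only.
rows = [
    SeatResult("State X", "Seat A", "Party P", 45.0, 38.5),
    SeatResult("State Y", "Seat B", "Party Q", 30.0, 41.25),
]

for row in rows:
    print(f"{row.state} / {row.constituency} / {row.party}: {row.change:+.2f} pp")
```

The change is expressed in percentage points (a simple difference of two shares) rather than as a relative percentage, which matches how vote share swings are usually reported.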

How is it encoded, what problems are with it, and how have you attempted to improve it?

Gaps in the Data:

This would be discussed more in later sections. The main gap is the low extent of the data. An extension to the previous LS and AS elections would provide a more representative analysis.

Table 1 | The table shows the NDA parties’ vote share in the 2024 Lok Sabha elections and the Assembly elections.

image

Source of image: https://www.thehindu.com/data/68414003-Chart-1-bypolls.svg

Table 2 | The table shows the INDIA parties’ vote share in the 2024 Lok Sabha elections and the Assembly elections.

image

Source of image: https://www.thehindu.com/data/68414006-Chart-2-bypolls.svg

Conclusions by the Author

Based on the tables, the author concludes that “the overall trend that emerges from the bypoll results in 13 Assembly seats indicates a sharp decline in the BJP/NDA’s vote shares from the Lok Sabha polls held less than two months ago, as well as improvements for the INDIA parties.”

He further provides the following subjective conclusions (as quoted).

“
  • Prime Minister Narendra Modi needs to introspect whether his government’s hubristic response to the 2024 Lok Sabha verdict, which resulted in a net loss of 63 seats for the BJP, is leading to the alienation of a significant section of the electorate.
  • Political arrogance reflected in the continued persecution and vilification of the parliamentary opposition and wanton violation of human rights
  • Pompous exaggerations of vacuous economic achievements and a deceptive denial of growing economic hardship and disparities; and
  • Manipulative social re-engineering which undermines social justice and solidarity have become the hallmarks of the Modi regime.

The electorate seems to be losing its patience.”

Potential improvements using external data source

It is well known that the Indian electorate makes different choices between LS and AS elections. As an example, the state of Odisha holds simultaneous elections and the vote share of the parties has been different. To put this in a graphic way, the same elector pushes one button with her left hand (for the LS election) and a different button with her right hand (for the AS election). It is beyond the scope of this assignment to list the reasons for this. It suffices to say that the author himself acknowledges this fact in a previous article analysing the AS elections in 2023.

For data analysis, it is trivial to see that extending the series to more LS and AS elections would provide a more complete picture. I did this by adding a feature on the vote share in the previous assembly elections – this would have been over different years. Due to lack of time, I could not add AS wise results from the 2019 LS elections – this data is often found in individual state Election Commission pages.

I encoded the data as a grouped bar chart, with AS constituencies on the Y-axis. It is easy to see that the trend is quite unclear over the 3 elections. The data is small enough to be encoded as a simple table; I chose a more visual encoding.

3-election vote share data for NDA Bloc

https://public.flourish.studio/visualisation/18884689/ Vote Share across election for NDA Bloc

3-election vote share data for I.N.D.I.A Bloc

https://public.flourish.studio/visualisation/18884884/

3-election vote share for I N D I A Bloc

Partial conclusions from extended data

No clear conclusion is possible even from the extended data. Even if the improvements in the next section are addressed, the limited geographical extent makes it impossible to extrapolate a country-wide, or even a state-wide, trend from the data. In particular, there is no data to support a 'new anti-incumbency' as claimed in the title of the Data Point analysis.

Further improvements in the analysis

Souvikx2 commented 1 month ago

Name : Souvik Bhattacharjee Roll number : 21f1003742 Link to Story: https://www.thehindu.com/data/neet-ug-2024-data-reveals-top-cities-for-high-scoring-candidates-crucial-for-government-medical-college-admissions/article68441411.ece

Main story: The story is about the distribution across Indian cities of candidates who scored 650 and above in NEET 2024, the general medical entrance examination. The charts give insight into how a few cities hold the majority of the candidates who excelled in the examination. Further, the data has also been used by students and parents alike to point to the possibility of unfair means being used in the exam.

Data used:

  • Type of data: Quantitative data.
  • Extent of data: Scores of all Indian NEET 2024 candidates.
  • Dimensions of data: Candidate cities, scores, test centers and states.
  • Gaps in data: No detail about retest takers, people who have not attended coaching, gender and income brackets.
  • Relevance: The data includes cities and scores, which in turn can be used to highlight the concentration of 650+ achievers.

Current Visual Representation:

image

image

Critique: The colour coding used for the graph fades into transparency, which makes it harder for the reader to see. It is also unclear, unless one hovers over it, that each dot in front of a state is a city. The choice of visual also fails to convey the importance of the story, and it contains much redundant information. If the author's purpose was to highlight the cities with the largest number of 650+ scoring candidates, the chart should have been limited to the top cities only.

Redesign: A donut chart would be used containing only the top cities. This will reduce the redundant information of the visual and highlight the number of cities which are directly contributing to the number of toppers. image

harshymehta14 commented 1 month ago

Name - Harsh Y Mehta Roll no - 21f1001295

Assignment-04

Title - Personal loans disbursed via digital apps have the highest share of overdue accounts Source Link Published on - 04 July 2024 Author - Vignesh Radhakrishnan

Overview Until the mid-2010s, banks lent massive loans to big industries. When these businesses failed, the bad loans went unnoticed until a 2015 RBI review brought them to light; by 2017, the share of bad loans had reached 10%. Various recovery methods, including the Insolvency and Bankruptcy Code, 2016, were used to recover these loans, and the issue was widely publicized.

As a result, banks reduced loans to industries and recovered more bad loans, reaching a healthy state in 2024 with a decadal-low Gross Non-Performing Assets (GNPA). However, they shifted focus to retail loans, such as personal loans and credit cards, which grew significantly. Despite regulatory measures, the GNPA ratio for personal loans fell to 1.2% in March 2024.

The RBI's Financial Stability Report highlights concerns about rising slippages and high delinquency levels among small borrowers with personal loans below Rs. 50,000, especially from NBFC-Fintech lenders. These issues indicate potential future problems, with the RBI now worried about individuals rather than industries.

Glossary

  1. NPA (Non Performing Assets) - is the share of total loans that are overdue for more than 90 days.
  2. GNPA (Gross Non-Performing Assets) - represents the total amount of loans that are classified as non-performing without any deductions.
  3. NNPA (Net Non-Performing Assets) - derived by subtracting provisions (reserves set aside for potential losses) and recoveries from the GNPA.
  4. Slippages - are fresh additions of bad loans in a year.
  5. Delinquencies - loan repayments that are overdue, i.e., not paid by their due date.
  6. NBFC (Non Banking Finance Company) - is a financial institution that offers banking services such as loans and credit facilities but does not hold a banking license and cannot accept demand deposits from the public.
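The relationship between the GNPA and NNPA entries in the glossary can be sketched as below. The NNPA formula follows the glossary wording above; expressing GNPA as a percentage of gross advances is the usual definition of the GNPA ratio and is an assumption added here, as are the function names and the sample amounts.

```python
def gnpa_ratio(gross_npa, gross_advances):
    """GNPA as a percentage of gross advances (assumed standard definition)."""
    if gross_advances <= 0:
        raise ValueError("gross advances must be positive")
    return 100.0 * gross_npa / gross_advances


def net_npa(gross_npa, provisions, recoveries=0.0):
    """NNPA as described in the glossary: GNPA minus provisions and recoveries."""
    return gross_npa - provisions - recoveries


# Hypothetical amounts in arbitrary currency units, for illustration only.
ratio = gnpa_ratio(gross_npa=12.0, gross_advances=1000.0)
nnpa = net_npa(gross_npa=12.0, provisions=5.0, recoveries=1.0)
print(f"GNPA ratio: {ratio:.1f}%  NNPA: {nnpa:.1f}")
```

Because NNPA nets out the reserves a bank has already set aside, it is always at or below GNPA for the same loan book.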

Full Forms

  1. PVBs (Private Sector Banks)- Example: HDFC Bank
  2. PSBs (Public Sector Banks) - Example: SBI
  3. SFBs (Small Finance Banks) - Example: AU Small Finance Bank

Key Takeaways

  1. In 2015, the Reserve Bank of India (RBI) carried out a review, following which skeletons tumbled out of the closet. The share of bad loans reached as high as 10% in 2017, which meant that nearly one in every 10 loans had turned bad.
  2. The latest Financial Stability Report (FSR) of the RBI shows that Gross Non-Performing Assets (GNPA) was at a decadal-low in March this year (2024).
  3. The GNPA ratio of personal loans has been reducing consistently reaching 1.2% in March 2024 — the lowest across sectors and within the segment (Agriculture, Industry, Services, Personal Loan).
  4. In FY24, slippages from retail loans (excluding home loans) formed 40% of fresh additions of NPAs.
  5. Delinquency levels among small borrowers with personal loans below Rs. 50,000 remain high.

Charts Chart 1 | The chart shows the Gross non-performing Assets (GNPA) and NPA across years. image

Chart 2 | The chart shows the GNPA (in %) across sectors. image

Chart 3 | The chart shows the bank-type wise split of the share of slippages from retail loans in the overall new additions of NPAs. The chart excludes slippages in home loans. Slippages are fresh additions of bad loans in a year. image

Chart 4 | The chart shows the bank type-wise delinquency levels for personal loans below Rs. 50,000. image

Redesign Chart

Slippage & Delinquency across Banks type-wise image Disclaimer - The data in the above chart is an approximation and may not be accurate.

The combined charts allow for easier comparison of related data in a single view. They save space and make the information clearer and more readable. This improves efficiency, making data analysis quicker and more straightforward.

trishulam commented 1 month ago

Name: N K Vamsi Krishna

Roll_No: 21f1003596

Story Overview:

The article discusses the significant shifts in voter support during the 2024 Assembly bypolls across 13 constituencies, highlighting a potential trend of anti-incumbency. The data indicates a notable decline in vote shares for the BJP-led NDA coalition and gains for the opposition INDIA bloc, suggesting a shift in voter sentiment that could influence future elections.

Source: The Hindu

Data Used:

Type of Data:

  1. Vote Share Data: Percentages of votes obtained by various parties (BJP, NDA, INDIA bloc) in both the 2024 Lok Sabha elections and the subsequent Assembly bypolls.
  2. Election Results: Specific seat outcomes for each party in the bypolls.
  3. Comparative Analysis: Changes in vote shares between the Lok Sabha and Assembly bypolls, focusing on the decline or increase in support for each party.

Extent and Dimensions:

  1. Temporal Extent: 2024 data from both the Lok Sabha elections and the Assembly bypolls.
  2. Geographical Extent: 13 Assembly constituencies across seven states in India.
  3. Metrics: Vote shares by party, changes in vote shares, election results by constituency.

Gaps in the Data:

  1. Regional Specificity: Data is limited to 13 constituencies, which may not represent broader regional or national trends.
  2. Detailed Local Context: Lack of detailed local issues or events that may have influenced voter behavior.
  3. Temporal Gaps: Only 2024 data is used, without historical comparison beyond the immediate election cycle.

Data Details:

  1. Essential Data: Vote shares by party, average gains/losses, seat outcomes.
  2. Irrelevant Data: Non-election related data or broad economic indicators not directly tied to the analysis.

Data Encoding:

  1. Tables and Charts: Original data presented in tables showing vote shares and seat outcomes.
  2. Narrative: Explains the methodology and findings.

Problems and Improvements:

Problems:

  1. Visualization: Original data is presented solely in tables, which does not effectively illustrate the trends in voter support changes.
  2. Engagement: Static tables lack the interactivity and visual impact needed to engage readers and highlight significant trends.

Improvements:

  1. Visualization: Incorporate charts and graphs to visually represent vote share changes and seat distribution, highlighting the percentage changes and relative differences over time.
  2. Interactivity: Use interactive visualizations to enhance reader engagement and understanding.

Original Visualizations:

Visualization 1:

image

Visualization 2:

image

Redesigned Visualizations:

Interactive Visualizations - Flourish

Visualization 1: Vote Share Gains and Losses in 2024 Assembly Bypolls

Visualization: chart3

Explanation: This bar chart compares the changes in vote shares for the NDA and INDIA alliances across 13 constituencies during the 2024 Assembly bypolls. Each bar represents the gain or loss in vote share percentage points for a constituency.

The chart clearly illustrates the trend of declining support for the NDA and rising support for the INDIA alliance across most constituencies, highlighting significant shifts in voter sentiment.

Visualization 2: Seat Distribution in the 2024 Assembly Bypolls

Visualization: Chart2

Explanation: This donut chart shows the distribution of seats won by each party in the 2024 Assembly bypolls. Each segment of the donut represents the number of seats won by a party, providing a clear and immediate visual summary of the election results.

The chart effectively highlights the competitive nature of the bypolls, with seats distributed across multiple parties, indicating a diverse political landscape.

Conclusion

The redesigned visualizations provide a clearer and more engaging way to understand the data story of the 2024 Assembly bypolls. The vote share comparison bar chart highlights the significant shifts in voter support, while the seat distribution donut chart gives a quick overview of the outcomes. These visualizations help to communicate the broader narrative of potential anti-incumbency sentiment and its implications for future elections.

abirChakrabortyIITM commented 1 month ago

Name: Abir Subroto Chakraborty Roll No: 21f2000280


Title: By-polls: an indication of a new anti-incumbency (Link)

Author: Prasenjit Bose


Based on the data provided in the images and the brief overview from the article, here’s the analysis:

1. What is the story the author is trying to tell?

The author is highlighting a significant shift in voter preferences, indicating a decline in the vote share for the BJP-led NDA in recent bypolls, while the opposition INDIA bloc has gained considerable ground. This shift is seen as a possible indication of growing anti-incumbency sentiment against the BJP.

2. What data is used to tell the story?

The data consists of vote share percentages for both the NDA and INDIA bloc parties in various constituencies during the 2024 Lok Sabha elections and the subsequent 2024 Assembly bypolls. The key elements include:

Table 1: NDA Performance

This table details the vote share of the NDA (BJP and JDU) in various Assembly Constituencies (ACs) across multiple states in the 2024 Lok Sabha (LS) elections and the subsequent 2024 Assembly (AS) bypolls.

image

Table 2: INDIA Performance

This table presents the vote share of the INDIA bloc (including AITC, INC, AAP, VCK/DMK, and RJD) in the same constituencies and elections as Table 1.

The tables illustrate a notable shift in voter preference from the BJP-led NDA to the opposition INDIA bloc in recent bypolls compared to the Lok Sabha elections. The data indicates significant gains for the INDIA bloc across multiple states and constituencies, suggesting a growing anti-incumbency sentiment against the BJP.

image
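The shift described by the two tables can be expressed as a percentage-point swing per constituency. The sketch below is only illustrative: the constituency labels and vote shares are hypothetical placeholders, not values from the article's tables, and `swing` is a helper name introduced here.

```python
# Hypothetical placeholder figures, NOT taken from the article's tables.
vote_share = {
    "AC-1": {"ls_2024": 48.0, "bypoll_2024": 41.5},
    "AC-2": {"ls_2024": 52.3, "bypoll_2024": 44.1},
    "AC-3": {"ls_2024": 39.8, "bypoll_2024": 42.0},
}


def swing(shares):
    """Return the percentage-point change from the Lok Sabha poll to the bypoll."""
    return {
        ac: round(v["bypoll_2024"] - v["ls_2024"], 1)
        for ac, v in shares.items()
    }


swings = swing(vote_share)

# Sort so the largest losses come first, the order a diverging bar chart might use.
for ac, delta in sorted(swings.items(), key=lambda kv: kv[1]):
    print(f"{ac}: {delta:+.1f} pp")
```

Plotting the swing directly, rather than two separate tables, puts the gain or loss itself on the axis.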


3. How is it encoded, what problems are with it, and how have you attempted to improve it?

In summary, the data underscores a notable shift in voter sentiment against the BJP, with the INDIA bloc gaining traction, pointing to possible national implications for future elections.


Conclusion by the Author

The recent bypoll outcomes signify a notable decline in support for the BJP-led NDA, with significant vote share losses across multiple states, contrasting the gains made by the opposition INDIA bloc. This trend suggests a potential shift in the national political landscape, possibly reflecting growing dissatisfaction with the BJP. The author emphasizes that while local factors may play a role, the overall decline in the BJP’s vote share across various constituencies points to a broader anti-incumbency sentiment. The opposition's gains indicate a possible change in voter mood, favoring the INDIA bloc in upcoming elections.

Ashrey30 commented 1 month ago

Name: Ashrey Roll No: 21f2000448

Article: A Green Wealth Tax in Budget 2024

Story the Author is trying to tell

The author is presenting the idea of a wealth tax-financed Indian Green Deal (IGD) that aims to address climate change, inequality, and unemployment. The story argues that the wealth tax on the Indian elite can fund a comprehensive green energy, infrastructure, and care economy program, ultimately generating millions of jobs and reducing carbon emissions.

Data Analysis

1. Per Capita Carbon Footprint (Chart 1):

Screenshot 2024-07-28 172722

2. Expenses and Carbon Intensity of Commodities (Chart 2):

Screenshot 2024-07-28 172733

3. Expenditure and Employment Generation (Chart 3a & 3b):

Screenshot 2024-07-28 172751 Screenshot 2024-07-28 172802

4. Projected Wealth and Tax Rate (Chart 4a & 4b):

Screenshot 2024-07-28 172815 Screenshot 2024-07-28 172824

Gaps in the Data

Essential vs. Irrelevant Data

Redesigned Visualizations

1. Per Capita Carbon Footprint

2. Expenses and Carbon Intensity

3. Projected Expenditure and Wealth Tax Rate

4. Expenditure and Employment Generation

Final Redesign

DVD_GA5_1

muskansindhu commented 1 month ago

Redesigning the NEET-UG 2024 Data Story from The Hindu Data Point

Name: Muskan Sindhu
Roll No: 21f1003710

Original Article: Select “coaching hubs” are host to many high scoring NEET-UG-2024 candidates

Original Story

The original article highlights the cities and centers with the highest share of students scoring 650 or more in the NEET-UG 2024 exam. The main focus is on the exceptional performance in specific cities, particularly Sikar in Rajasthan, and the implications of these high scores for securing admissions in government medical colleges.

Story Analysis

The author aims to highlight the top-performing cities and centers in the NEET-UG 2024 exam and discuss the implications of these scores for medical college admissions. The data used includes quantitative data on NEET-UG scores, segmented by city and center, with percentages of students scoring above specific thresholds (650 and 700 marks). However, the article lacks historical comparison data and does not delve deeply into the reasons behind the high scores in specific centers. Essential data include scores by city and center, percentages of high scorers, and absolute numbers of high scorers.

Screenshot 2024-07-28 at 7 42 56 PM
fig 1.1. Original scatter plot showing the percentage share of students who scored over 650 marks by state.

Screenshot 2024-07-28 at 7 34 19 PM
fig 1.2. Table with centers that have the highest share of students scoring above 650 marks.

Visual Encoding and Improvements

The current visual encoding includes a table listing the top centers with the highest share of students scoring above 650 and a scatter plot showing the percentage share of students who scored over 650 marks by state. The scatter plot may be overwhelming due to the large number of data points, and the table lacks visual appeal and could be enhanced with better design elements. Additionally, the context and significance of the data points could be explained better. To improve this, I propose simplifying the scatter plot to focus on the top states and adding annotations for clarity. The table should be enhanced with visual elements like bar graphs to show comparisons more clearly. Providing historical comparison data would add context to the current year's results.

Screenshot 2024-07-28 at 8 04 06 PM
fig 2.1. Simplified scatter plot focusing on the top-performing states with the highest share of students scoring above 650 marks.

Screenshot 2024-07-28 at 8 02 47 PM
fig 2.2. Simplified bar chart showing the top centers with the highest share of students scoring 650 and above in NEET-UG 2024.

Redesign Process

To enhance the visualizations, I have focused on improving the scatter plot and the table visualization. For the scatter plot, I focused only on the top-performing states, using color coding to differentiate them and adding annotations to highlight significant data points. A clear legend was used to explain the color coding. For the table visualization, I converted it into a bar chart where each bar represents the percentage of students scoring above 650, with a secondary axis showing the absolute number of students scoring above 650. Color coding was used to differentiate between cities and centers.
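A bar chart that pairs a percentage with an absolute count needs both numbers per centre. The following is a minimal sketch of that preparation step, assuming centre-level records of total candidates and 650+ scorers; the centre names, the counts, and the helper `top_centres` are hypothetical and do not come from the article.

```python
# Hypothetical centre-level records: (centre, candidates, scored_650_plus).
centres = [
    ("Centre A", 900, 120),
    ("Centre B", 400, 30),
    ("Centre C", 1000, 60),
]


def top_centres(rows, n=2):
    """Rank centres by share of candidates scoring 650+, keeping the raw count."""
    ranked = [
        {
            "centre": name,
            "share_pct": round(100 * high / total, 1),  # primary axis
            "count": high,                              # secondary axis
        }
        for name, total, high in rows
    ]
    ranked.sort(key=lambda r: r["share_pct"], reverse=True)
    return ranked[:n]


for row in top_centres(centres):
    print(row)
```

Keeping the count next to the share guards against a small centre with a handful of high scorers looking more significant than it is.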

Final Notes

The redesigned visualizations provide clearer insights into the distribution of high NEET-UG scores across various centers and states. By using annotations, clear legends, and contextual data, the visualizations aim to make the story more compelling and informative. This assignment helped me understand the importance of clear and effective data visualization, aiming to make the data more accessible and engaging for the audience, providing them with a better understanding of the NEET-UG 2024 results.

Indu16910 commented 1 month ago

Name: Indumathi Kalla Roll No: ce22b062

Design Process Documentation

What is the story the author is trying to tell?

The author of the original article aims to provide a comprehensive and visually engaging breakdown of the Indian Budget 2024-2025. The story seeks to highlight the allocation of funds across various sectors and track the growth and changes in these allocations compared to previous years.

What data is used to tell the story?

Type of Data:

• Quantitative data representing the budget allocations in monetary terms.
• Percentage data showing the share of each sector in the total budget.
• Comparative data from previous budgets to illustrate growth or decline.

Extent of the Data:

• The entire budget for the fiscal year 2024-2025.
• Historical budget data for trend analysis.

Dimensions of the Data:

• Sector names.
• Allocation amounts.
• Percentage shares.
• Year-on-year growth.

Gaps in the Data:

• Specific details about sub-sector allocations might be missing.
• Potential lack of granularity in showing the impact of allocations on outcomes.

Essential vs. Irrelevant Data:

• Essential: Sector names, allocation amounts, percentage shares, historical data for comparison.
• Irrelevant: Overly detailed sub-sector data that doesn’t contribute to the main narrative.

How is it encoded, what problems are with it, and how have you attempted to improve it?

Original Encoding:

The original article used heat maps to visually represent the data.

Problems with Heat Maps:

• Clarity: Heat maps use color intensity to represent values, which can make it difficult to precisely identify the exact allocation amount for each sector.
• Comparison: It's challenging to compare sectors directly using heat maps, as similar color shades can be hard to differentiate, especially for viewers with color vision deficiencies.
• Labeling: Heat maps often lack clear labeling, making it necessary for viewers to refer to legends, which disrupts the flow of understanding the data.

Your Improvements:

Pie Chart for Industry Percentage:

• Clarity: Redesigned the budget allocation visualization using a pie chart to show the percentage share of each industry in the total budget. This provides a clear and immediate understanding of the distribution.
• Direct Comparison: Each sector's share is represented as a slice of the pie, making it easy to compare the sizes of different sectors directly.
• Enhanced Labeling: Added data labels directly on the pie chart, ensuring that viewers can quickly grasp the percentages without needing to refer to a legend.

Stacked Growth Bars for Trend Visualization:

• Trend Analysis: Used stacked growth bars to represent the changes in budget allocations over time. This helps in understanding both the individual and cumulative growth of different sectors.
• Precise Information: Added data labels to each segment of the stacked bars, providing precise information on the amount and percentage of growth or decline.

• Scope Expansion: Included additional data from previous years to provide a more comprehensive view of trends and changes in budget allocations.
• Scope Curtailment: Focused on the most significant sectors to avoid overwhelming the viewer with too much information at once.

Visualizations

Original Visualization Samples: https://www.thehindu.com/data/analysis-of-union-budget-2024-sector-wise-impact/article68446110.ece

Your Redesign Iterations:

Budget expenditure in FY25 (in crores)

Change in sectors share in total expenditure from FY24RE (% points)

Social welfare, MGNREGA and Samagra Shiksha 2017-24 (% of budget)

POSHAN, Old age pension, Widow pension scheme and Ayushman Bharat 2017-24 (% of budget)

Swasthya suraksha, MORTH, Telecom department and Power 2017-24 (% of budget)

AMRUT, smart cities and NHAI 2017-24 (% of budget)

Railway Ministry, Signalling & Telecom and Aviation Ministry

UDAAN and shipping ministry

FAME, PMAY-U and PMAY-R

Agriculture, PMFBY, PMKISAN 2017-24 (% of budget)

Space, Science and Tech, Space Technology, Space applications 2017-24 (% of budget)

Health, Rural development, Higher education, School education 2017-24 (% of budget)

Defence 2017-24 (% of budget)

Pie Chart

Stacked Growth Bar

These visualizations should be embedded inline with your documentation to provide a coherent narrative and a clear comparison between the original and redesigned versions.
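The "change in sector share (% points)" view listed among the iterations can be derived from raw allocations in two steps: convert each year's allocations to shares of that year's total, then subtract. This is a rough sketch with hypothetical sector names and amounts; `shares` and `share_change_pp` are helper names introduced here, not anything from the article.

```python
def shares(allocations):
    """Convert absolute allocations into percentage shares of the total."""
    total = sum(allocations.values())
    return {sector: 100 * amount / total for sector, amount in allocations.items()}


def share_change_pp(previous, current):
    """Percentage-point change in each sector's share between two budgets."""
    prev_share, curr_share = shares(previous), shares(current)
    return {s: round(curr_share[s] - prev_share[s], 2) for s in current}


# Hypothetical allocations in arbitrary units, NOT actual budget figures.
fy24 = {"Sector A": 50, "Sector B": 30, "Sector C": 20}
fy25 = {"Sector A": 60, "Sector B": 30, "Sector C": 30}

change = share_change_pp(fy24, fy25)
print(change)
```

In this toy data Sector B's allocation is unchanged in absolute terms, yet its share falls because the total grew. A pie chart of one year alone cannot show that effect.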

Afringowhar commented 1 month ago

Name: Syed Afrin Gowhar Roll No. : 21f2001140

Assignment: 2024 polls: How people in high and low income areas voted in Chennai’s Mylapore, T Nagar and other areas Link: https://www.thehindu.com/data/2024-polls-how-people-in-high-and-low-income-areas-voted-in-chennais-mylapore-t-nagar-and-other-areas/article68427083.ece

1. Story and Data Understanding

Story Objective: Highlight how voting patterns in Chennai's Lok Sabha elections vary by income levels, with a specific focus on the DMK and BJP's vote shares.

Data Details: The dataset includes vote shares for the DMK and BJP across various areas in Chennai, categorized by income levels. This data is critical for analyzing the correlation between income levels and voting preferences.

2. Analysis and Visualization Plan

Original: From The Hindu data image

A. Overall Trends and State-Wise Breakdown

Objective: To illustrate the overall trends in vote shares of the DMK and BJP across different areas, with an emphasis on income levels.

Visualization 1: image This bar chart visualizes the vote shares of the DMK and BJP across various areas in Chennai, sorted by income levels (from high to low). The data illustrates the voting patterns, showing higher support for the BJP in wealthier areas and stronger support for the DMK in lower-income areas.

Visualization 2: image This line graph shows the overall vote shares of DMK and BJP across different areas, categorized by income levels. The graph illustrates that DMK's vote share tends to be higher in lower-income areas, while BJP's vote share is higher in wealthier areas.

B. Detailed Area-wise Analysis

Objective: To provide a detailed comparison of vote shares within specific areas, highlighting the socio-economic differences.

Visualization 1: image This heatmap illustrates the vote shares of the DMK and BJP across different areas in Chennai, sorted by income levels. The color gradient effectively highlights the areas of strong and weak support for each party, with darker shades indicating higher vote shares.

Visualization 2: image The scatter plot provided shows the distribution of vote shares for the DMK and BJP across various areas. The x-axis represents the DMK vote share percentage, while the y-axis represents the BJP vote share percentage. The data points are color-coded based on the income level of the areas, with blue dots representing high-income areas and orange crosses representing low-income areas. This scatter plot visually demonstrates the correlation between income levels and party support, highlighting socio-economic divisions in voting patterns. The data suggests that the DMK's support base is stronger in lower-income areas, whereas the BJP has a stronger presence in higher-income areas. This insight could be useful for understanding voter demographics and tailoring campaign strategies accordingly.
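The relationship the scatter plot is meant to show can also be summarised numerically with a Pearson correlation between an income proxy and each party's vote share. The sketch below uses a hand-written `pearson` helper and hypothetical values; none of the numbers come from the article's dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical areas: an income proxy and vote shares in percent.
income_proxy = [1100, 3000, 6000, 10000]
dmk_share = [60.0, 52.0, 45.0, 38.0]
bjp_share = [8.0, 14.0, 20.0, 27.0]

print(round(pearson(income_proxy, dmk_share), 2))
print(round(pearson(income_proxy, bjp_share), 2))
```

A coefficient near -1 or +1 on real data would support the claim in the text. With only a handful of areas it should be read as descriptive, not as evidence of causation.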

Summary of the Redesign Process:

Harsehraab commented 1 month ago

Name: Harsehraab Singh Sarao Roll Number: 21f1000507

Original Article : On unemployment in Indian States Link : https://www.thehindu.com/news/national/on-unemployment-in-indian-states/article68051708.ece

Story by the author:
1) The author analyses the extent of unemployment in major states.
2) Highlights the disparities in unemployment rates across different states.
3) Goa's unemployment rate is three times the national average.
4) Wealthy states like Kerala, Haryana, and Punjab seem to have higher unemployment rates.
5) Rich western states like Gujarat and Maharashtra seem to have lower rates of unemployment.
6) States with a higher proportion of self-employment have lower unemployment rates.
7) Urbanisation seems to increase the rate of unemployment due to fewer informal job opportunities.
8) Education also seems to play a role in increasing unemployment, because well-educated graduates want to work in adequately paying roles, and such job opportunities are scarce compared to the number of candidates.

What data he/she is using to tell the story? 1) The article is utilising data that considers individuals aged 15 and above. 2) The data seems to be sourced from the Periodic Labour Force Survey (PLFS) of 2022-23.

Screenshot 2024-07-28 at 5 07 22 PM Screenshot 2024-07-28 at 5 08 01 PM Screenshot 2024-07-28 at 5 08 38 PM

Type of data: 1) Quantitative data for unemployment rates across states of India. 2) Quantitative data for self employment across the states of India 3) Quantitative data for the well educated individuals across the states of India

Extent of the data: 1) Data for the duration 2022-23. 2) Data from all states across India.

Gaps in the data: 1) No data for Union Territories. 2) No data for industries present in various states, states with higher numbers of industries would be able to provide more job opportunities.

What data is essential: 1) Unemployment rates 2) Self-employment rates 3) Educational levels.

Key Findings
1) States with well-educated individuals seem to have a higher rate of unemployment.
2) Unemployment is lower in areas where self-employment is common practice.
3) The highest rate of unemployment in the country is 10%, for the state of Goa.
4) Northern states seem to have a higher rate of unemployment; even wealthy states like Haryana and Punjab have high unemployment rates.
5) Informal job opportunities seem to be diminishing as urbanisation increases.

Original Encoding: 1) Bar charts and line graphs to show unemployment rates and trends. 2) Scatter plots to depict the relationship between self-employment and unemployment.

Problems: 1) Lack of visual clarity in comparing states. 2) Limited use of colour to differentiate data points. 3) No interactive elements to explore data in depth.

Redesigning the Visualisation Improvement Goals: 1) Enhance visual clarity and comparison. 2) Use colour effectively to highlight key data points.

Screenshot 2024-07-28 at 8 45 42 PM


SOORYAKIRAN-B commented 1 month ago

Name: SOORYAKIRAN B Roll No.: 21f1003835

Unemployment remains a concern in India post-pandemic

Story the Author is Trying to Tell

The author aims to highlight the persistent issue of unemployment in India, especially in the aftermath of the COVID-19 pandemic, by illustrating how various individuals are struggling with joblessness and how the Labour Force Participation Rate (LFPR) and Unemployment Rate (UR) have changed over time.
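For readers unfamiliar with the two metrics, the sketch below shows how LFPR and UR are conventionally computed. It assumes the standard textbook definitions; the survey behind the article may use different age cut-offs or reference periods, and the head counts are hypothetical.

```python
def lfpr(employed, unemployed, working_age_population):
    """Labour Force Participation Rate: labour force as a % of the working-age population."""
    labour_force = employed + unemployed
    return 100 * labour_force / working_age_population


def unemployment_rate(employed, unemployed):
    """Unemployment Rate: unemployed as a % of the labour force (not of the population)."""
    return 100 * unemployed / (employed + unemployed)


# Hypothetical head counts, in thousands.
employed, unemployed, working_age = 460, 40, 1000

print(lfpr(employed, unemployed, working_age))   # share of people working or seeking work
print(unemployment_rate(employed, unemployed))   # share of the labour force without work
```

The denominators differ, so the two rates can move in the same direction: if discouraged job seekers leave the labour force, both LFPR and UR can fall together. That is why the story tracks both.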

Data utilised

Table 1: Labour Force Participation Rate (LFPR)

Table1

The LFPR in India has shown a significant decline post-pandemic, indicating that fewer people are either working or seeking employment.

Encoding and Problems:

Redesign of Table 1

Table 1_ Labour Force Participation Rate (LFPR)

Insights

Table 1: Labour Force Participation Rate (LFPR)

Insights:

Table 2: Unemployment Rate (UR)

Table2

The unemployment rate in India has increased post-pandemic, with a noticeable difference between urban and rural areas and between genders.

Encoding and Problems:

Redesign of Table 2

Unemployment Rate (UR)

Insights

Table 2: Unemployment Rate (UR)

Insights:

Table 3: LFPR and UR by Quarter

Table3

Both LFPR and UR show quarterly trends over the years, with noticeable fluctuations around the pandemic period.

Encoding and Problems:

Redesign of Table 3

Tab3

Insights

Table 3: LFPR and UR for Quarters Ending in September

Insights:

Table 4: LFPR and UR by Month

Table4

Monthly trends in LFPR and UR, showing how these metrics change within a year.

Encoding and Problems:

Redesign of Table 4

Tab4

Insights from the story

Table 4: LFPR and UR for November Months

Insights:

Insights from Story

The data reveals a concerning trend of decreasing labour force participation and persistently high unemployment rates, especially among females and in urban areas. The COVID-19 pandemic worsened these issues, and the recovery appears to be slow, indicating a need for targeted employment initiatives.

Arshi81099 commented 1 month ago

Maoist Setbacks in Chhattisgarh, 2024

Story the Author is Trying to Tell The author is detailing the severe setbacks faced by Maoists in Chhattisgarh in 2024, highlighting that the insurgency is most intense in districts with poor development indicators. The report emphasizes the correlation between Maoist activity and underdeveloped regions, suggesting that socio-economic factors play a significant role in the insurgency.

Data Analysis

  1. Year-wise Deaths of Maoists in Chhattisgarh (Chart 1) Type: Quantitative Extent: Yearly data on Maoist deaths in Chhattisgarh Dimensions: Number of deaths

    Screenshot 2024-07-28 at 8 07 32 PM
  2. Deaths of Civilians, Security Forces, and Maoists Over the Years (Chart 2) Type: Quantitative Extent: Yearly data on deaths of civilians, security forces, and Maoists Dimensions: Number of deaths

    Screenshot 2024-07-28 at 8 12 06 PM
  3. District-wise Average of Maoist Deaths (Table 3) Type: Quantitative Extent: District-wise average of Maoist deaths every four years from 2001 to 2024 Dimensions: Number of deaths

    Screenshot 2024-07-28 at 8 09 08 PM
  4. District-wise Development and Welfare Indicators in Chhattisgarh (Table 4) Type: Quantitative and Qualitative Extent: District-wise development and welfare indicators Dimensions: Sanitation, Literacy, and other socio-economic indicators

    Screenshot 2024-07-28 at 8 09 14 PM

Gaps in the Data

  1. Lack of detailed analysis on the effectiveness of security operations and strategies used.
  2. No data on the socio-economic impact on the local population due to the insurgency.
  3. Limited information on the long-term trends in insurgency and development indicators.

Essential vs. Irrelevant Data

Essential: Year-wise data on deaths of Maoists, civilians, and security forces; district-wise development indicators; district-wise average of Maoist deaths.

Irrelevant: Detailed operational tactics used by security forces without context to broader trends.

Redesigned Visualizations

1. Year-wise Deaths of Maoists in Chhattisgarh Original: Line chart showing yearly Maoist deaths. Redesign: This line chart displays the number of Maoist deaths in Chhattisgarh from 2001 to 2024. The visual highlights significant years such as 2009 and 2024, showing peaks in casualties.

Screenshot 2024-07-28 at 10 00 52 PM

2. Deaths of Civilians, Security Forces, and Maoists Over the Years Original: Bar chart comparing deaths of civilians, security forces, and Maoists over the years. Redesign: This stacked bar chart compares the deaths of civilians, security forces, and Maoists from 2001 to 2024. It shows the decreasing trend in security force casualties and the varying trends in Maoist and civilian deaths.

Screenshot 2024-07-28 at 10 01 18 PM

3. District-wise Average of Maoist Deaths Original: Table listing district-wise average of Maoist deaths. Redesign: This heatmap shows the average number of Maoist deaths in Chhattisgarh districts every four years from 2001 to 2024. Districts with higher averages are highlighted to indicate hotspots of insurgency activity.

Screenshot 2024-07-28 at 10 02 29 PM
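Each heatmap cell is an average over a four-year window. The sketch below shows one way to bin yearly counts for a single district, assuming fixed consecutive windows starting in 2001; whether the article bins the years exactly this way is an assumption, and the yearly counts are hypothetical.

```python
def four_year_averages(yearly, start=2001, end=2024, width=4):
    """Average yearly counts over consecutive fixed-width windows.

    Years missing from `yearly` are treated as zero deaths.
    """
    averages = {}
    for first in range(start, end + 1, width):
        last = min(first + width - 1, end)
        values = [yearly.get(year, 0) for year in range(first, last + 1)]
        averages[f"{first}-{last}"] = sum(values) / len(values)
    return averages


# Hypothetical yearly counts for one district, NOT figures from the article.
demo_district = {2001: 4, 2003: 8, 2024: 12}

row = four_year_averages(demo_district)
print(row)  # one heatmap row: window label -> average
```

Stacking one such row per district gives the matrix the heatmap colours.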

4. District-wise Development and Welfare Indicators Original: Table showing district-wise development indicators. Redesign: This combined bar and line chart visualizes district-wise development and welfare indicators such as sanitation and literacy rates. The chart compares these indicators with the average number of Maoist deaths to highlight the correlation between poor development and higher insurgency activity.

Screenshot 2024-07-28 at 10 03 21 PM

Summary of the Article: In 2024, Maoist insurgents in Chhattisgarh faced severe setbacks, with 141 of the 162 Maoist deaths in India occurring in the state. This marks the highest casualties since 2009's 'Operation Green Hunt.' The return of the Bharatiya Janata Party (BJP) to power in December 2023 coincides with this spike. Bijapur district saw the most clashes, resulting in 74 Maoist deaths. Despite these setbacks, the insurgency persists, particularly in poorly developed, forested areas. The data shows a correlation between intense insurgency and poor development indicators such as sanitation and literacy.
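The counts in the summary are easier to compare as percentages. The arithmetic below uses only the three figures quoted above. It assumes that Bijapur's 74 deaths are part of Chhattisgarh's 141 for 2024, which the summary implies but does not state outright.

```python
def share_pct(part, whole):
    """Percentage that `part` makes up of `whole`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Figures quoted in the summary above.
india_total = 162
chhattisgarh = 141
bijapur = 74  # assumed to be counted within the Chhattisgarh total

chhattisgarh_share = share_pct(chhattisgarh, india_total)
bijapur_share = share_pct(bijapur, chhattisgarh)

print(f"Chhattisgarh share of India's total: {chhattisgarh_share}%")
print(f"Bijapur share of Chhattisgarh's total: {bijapur_share}%")
```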

Final Redesign The redesigned visualizations aim to provide a clearer and more comprehensive understanding of the data, emphasizing the correlation between poor development indicators and higher Maoist activity. The use of line charts, stacked bar charts, heatmaps, and combined bar-line charts enhances the clarity and impact of the data, making it more accessible and insightful for the audience. By following these steps and documenting the design process, the redesign effectively communicates the key insights and trends related to the Maoist insurgency in Chhattisgarh, emphasizing the importance of development in mitigating insurgency.

Regards, Name: Arshi Khan Roll No: 21f3002806

SriNandhiniThiyagarajan commented 1 month ago

Analysis of Union Budget 2024: Sector-Wise Impact

Original Story Analysis:

The article aims to provide a detailed analysis of the impact of the Union Budget 2024 on various sectors of the economy. The story is centered on how different sectors are affected by the budgetary allocations and reforms proposed by the government. The author seeks to highlight which sectors have received increased funding, which have seen cuts, and the overall implications of these changes for the economic landscape.

Key Points of the Story:

image

Data Details:

Type of Data:

Extent of the Data:

Dimensions of the Data:

Gaps in the Data:

Essential Data:

Original Encoding Analysis:

Visualization: The original story likely uses bar charts or stacked bar charts to show the allocation across sectors, and perhaps line graphs to show changes over time.

Problems with Original Encoding:

image image image

Improvements:

Enhanced Visualizations:

Trend Analysis:

Impact Visualization:

Grouped Bar Chart The grouped bar chart will show the budget allocations for each sector side-by-side for the years 2022, 2023, and 2024. image

Stacked Bar Chart The stacked bar chart will show the budget allocations for each sector stacked on top of each other for the years 2022, 2023, and 2024. image

Heat Map: To show the intensity of funding changes across sectors. image

Pie Chart: To show the proportion of the total budget allocated to each sector for 2024. image

Conclusion:

The redesign of the Union Budget 2024 analysis enhances clarity and interpretability through improved visualizations such as grouped bar charts, stacked bar charts, heatmaps, and pie charts. These new visualizations provide clearer comparisons and trends across sectors, making the data more accessible and informative.

Regards, Name: SriNandhini T Roll No: 21f2001390

NikitaSharma1 commented 1 month ago

Name: Nikita Sharma Roll Number: 21f1000637

2024 polls: How people in high and low income areas voted in Chennai’s Mylapore, T Nagar and other areas

Objective:

The article explores the voting patterns in Chennai, focusing on how wealth/income levels affect party preferences, specifically for DMK and BJP. It analyzes voting data across different streets/areas with varying income levels.

Data Used

  1. Polling Stations and Areas:

    • Lists all polling stations and the corresponding areas that voted in those stations for the 2024 LS polls.
    • Example: Voters from Choolaimedu’s Rajeevgandhi Nagar voted in the primary school in Namachivayapuram.
  2. Form-20 Data:

    • Party-wise votes polled in each polling station.
    • Example: In a particular station, DMK secured 57% of 260 valid votes, BJP secured 12.7%.
  3. Guideline Value Data:

    • Reflects the minimum value at which a property can be registered, serving as a proxy for wealth/income.
    • Example: Guideline values range from ₹1,100/sq ft (low-income) to over ₹10,000/sq ft (high-income).
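The Form-20 example above can be checked with a little arithmetic. The vote counts below are assumptions, reverse-engineered so that they round to the quoted percentages of 260 valid votes; the article's actual counts are not given here.

```python
def vote_share(party_votes, valid_votes):
    """Party vote share as a percentage of valid votes, to one decimal."""
    return round(100 * party_votes / valid_votes, 1)


valid_votes = 260
dmk_votes = 148  # assumed count: 148 / 260 is roughly 57%
bjp_votes = 33   # assumed count: 33 / 260 is roughly 12.7%

print(vote_share(dmk_votes, valid_votes))
print(vote_share(bjp_votes, valid_votes))
```

With only a few hundred valid votes per polling station, a single vote moves the share by about 0.4 percentage points, so station-level shares are noisy. That may be worth keeping in mind when plotting them against guideline values.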

Visual Encoding in Original Visualization

image

Design Process and Improvement

  1. Understanding the Story:

    • The story highlights the correlation between income levels and voting patterns for DMK and BJP in Chennai.
    • The key message is the differing vote shares in areas of varying wealth.
  2. Data Description:

    • Type of Data: Quantitative (vote shares) and categorical (streets/areas with income levels).
    • Extent of Data: Covers multiple streets/areas within Chennai.
    • Dimensions: Income level (high to low), vote share for DMK and BJP.
    • Gaps: No explicit mention of change in the income of the areas within the visualisation.
    • Essential Data: Vote shares, guideline values of streets/areas.
    • Irrelevant Data: The range of the vote share for an area, since it might confuse the person looking at the visualisation.
  3. Encoding Issues and Improvements:

    • Current Encoding:
      • Effective in showing vote shares but lacks clarity on income levels.
      • Legend is missing in visualization.
    • Improvements:
      • Use a line chart to incorporate the pattern properly.
      • Adding guideline value as the axis value instead of street names.
      • Include hover functionality for precise vote share values.

Redesign Proposal

  1. Visualization Type:
    • Dual-Axis Line Chart:
      • X-axis: Guideline values (income levels).
      • Y-axis: Vote share percentage.
      • Lines with different patterns (e.g., solid for DMK, dashed for BJP).
  2. Additional Features:
    • Hover Tooltips: Show exact vote shares and guideline values with the area name.
    • Voter Turnout: Include turnout percentage for more insight.

Iterations:

Step 1: Creating the base line chart with the vote share percentage.

image

Step 2: Adding gridlines and helpful insights

image

Step 3: Including hover tooltips and improving the layout

image

Conclusion

The redesign aims to provide a clearer and more accessible visualization of voting patterns based on income levels, enhancing the reader’s ability to understand the correlation and key insights.

21f1005544 commented 1 month ago

Name: John Joshi Alapatt Roll No: 21f1005544

Story Objective

The author aims to demonstrate how Lionel Messi's performance in the 2022 FIFA World Cup was exceptional, both compared to other players and his own previous performances in the 2014 and 2018 World Cups. The data shows his superiority in several key metrics, particularly highlighting his dual role as a playmaker and a striker.

Data Analysis

Chart 1 Shots Attempted vs. Shots on Target Percentage Data: Number of shots attempted and percentage of shots on target. Key Insight: Messi and Mbappé attempted the most shots, with Messi being more accurate.

chart1

Chart 2 Key Passes vs. Passes Completed into the 18-yard Box Data: Number of key passes and passes completed into the 18-yard box. Key Insight: Messi leads in both metrics, showcasing his playmaking abilities.

chart2

Chart 3 Touches in Attacking Third and Penalty Area vs. Successful Dribbles Data: Number of touches in the attacking third/penalty area and successful dribbles. Key Insight: Messi and Mbappé are significantly ahead in both metrics.

chart3

Chart 4 Radar Chart Comparison of Messi and Mbappé Data: Goals and assists, shots on target, key passes, passes into the 18-yard box. Key Insight: Messi’s balanced performance as both a midfielder and a forward compared to Mbappé.

chart4

Chart 5 Messi’s Performance Over Three World Cups Data: Goals per game, assists per game, shots on target per game, key passes per game. Key Insight: Messi's performance in 2022 is superior compared to 2014 and 2018.

chart5

Improvements

Chart 1 Goal: Highlight Messi and Mbappé, showing their shots attempted vs. shots on target percentage. Changes: Use distinct colors or markers for Messi and Mbappé, add data labels. Impr_chart1

Chart 2 Goal: Clearly show Messi's superiority in key passes and passes completed into the 18-yard box. Changes: Add data labels for key players, use color to differentiate players. Impr_chart2

Chart 4 Goal: Compare Messi and Mbappé's performances in multiple dimensions. Changes: The bar chart will allow a straightforward comparison of the individual metrics for each player side by side. alternative_chart4

Chart 5 Goal: Show Messi’s performance improvement over the three World Cups. Changes: Using a grouped bar chart and highlighting each metric in different colors allows easier comparison Impr_chart5

Conclusion

The redesigned charts help convey the story clearly. The data underscores Messi's exceptional precision and scoring ability, reinforcing his status as one of the most proficient and reliable goal scorers in the sport.

vpleaides8 commented 1 month ago

Name: Kruttika Milind Soni Roll no.: 21f1001029

In 2024, Maoists suffer severe setbacks in Chhattisgarh

The aim of this article (https://www.thehindu.com/data/in-2024-maoists-suffer-severe-setbacks-in-chhattisgarh/article68395649.ece) was to show how the Maoist movement in Chhattisgarh was stalled in 2024 compared with previous years. Additionally, some impacts of the Maoist insurgency were discussed by comparing more affected districts with less affected ones.

Chart 1

Type of data: Number of Maoist deaths
Extent of data: Yearly data from 2000 to 2024
Dimension of data: Years on the x-axis and number of deaths on the y-axis
Gaps in data: Doesn’t show which operations led to the deaths.

Encoding

The number of deaths is represented clearly with a line, with points marking the values. Continuity is met.
Problems: There are too few gridlines, which makes it hard to read off the values.
Improvements: Maoist deaths can be represented in red as a form of semantic encoding.

Screenshot 2024-07-28 at 3 55 18 PM

Chart 2

- Type of data: Number of deaths of civilians, security forces and insurgents
- Extent of data: Yearly data from 2000 to 2024
- Dimension of data: Years on the x-axis and number of deaths on the y-axis
- Gaps in data: Doesn’t show data on operations and attacks
- Essential and irrelevant data: The number of deaths is essential data. Civilian deaths are not referenced again, so they might be irrelevant overall, as the story only shows Maoist activity.

Encoding

The red graph for Maoist deaths is good semantic encoding. The 3 separate graphs show different magnitudes.

Problems: Multiple bar charts make it hard to compare the 3 categories.

Improvements: A grouped line graph can show the trend of increasing Maoist deaths and decreasing security force casualties. The data for Maoist deaths is represented here for the second time, after Chart 1.

Screenshot 2024-07-28 at 3 55 29 PM

Table 3

- Type of data: Average number of Maoist deaths
- Extent of data: Geographical extent of Chhattisgarh, with four-yearly data from 2001 to 2024
- Dimension of data: District-wise data with four-yearly measurements of average deaths
- Gaps in data: Some districts had their names changed.
- Essential and irrelevant data: Districts with 0 Maoist deaths overall are irrelevant.

Encoding

Red for important districts drew attention to the high deaths in some districts. The year-wise data showed the change in numbers over the years.

Problems: It was hard to pick out the differences between districts, and the deadliest years, in the table format.

Improvements: A heatmap to represent the worst districts and years. It uses red semantic encoding for Maoist deaths.

Screenshot 2024-07-28 at 4 35 36 PM
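The heatmap idea for Table 3 can be sketched roughly as follows. This is only an illustration, not the notebook used for this submission: the district names, period labels and values are hypothetical placeholders, and `to_matrix` and `plot_heatmap` are names introduced here. matplotlib is imported only inside the plotting helper, so the pivot step can be run on its own.

```python
# Hypothetical placeholder records: (district, period, average deaths).
# These are NOT the values from the article's Table 3.
records = [
    ("District A", "2001-04", 2.0),
    ("District A", "2005-08", 9.5),
    ("District B", "2001-04", 0.5),
    ("District B", "2005-08", 4.0),
]


def to_matrix(rows):
    """Pivot long-format rows into (districts, periods, 2-D grid)."""
    districts = sorted({d for d, _, _ in rows})
    periods = sorted({p for _, p, _ in rows})
    lookup = {(d, p): v for d, p, v in rows}
    # Missing district/period pairs become 0.0 so the grid stays rectangular.
    grid = [[lookup.get((d, p), 0.0) for p in periods] for d in districts]
    return districts, periods, grid


def plot_heatmap(rows):
    """Draw the grid with a sequential red colormap (darker = more deaths)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    districts, periods, grid = to_matrix(rows)
    fig, ax = plt.subplots()
    image = ax.imshow(grid, cmap="Reds", aspect="auto")
    ax.set_xticks(range(len(periods)))
    ax.set_xticklabels(periods)
    ax.set_yticks(range(len(districts)))
    ax.set_yticklabels(districts)
    fig.colorbar(image, ax=ax, label="Average Maoist deaths")
    return fig


districts, periods, grid = to_matrix(records)
```

A sequential colormap such as `"Reds"` keeps the semantic link to deaths while letting the darkest cells mark the worst district and period combinations.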

Table 4

- Type of data: Percentage of people in a given group who meet particular developmental parameters
- Extent of data: Geographical extent of Chhattisgarh and its population
- Dimension of data: District-wise data for the following parameters: population using an improved sanitation facility, women with over 9 years of schooling, stunted children under 5 years of age, and whether the district is impacted by Maoists
- Gaps in data: It is not clear which year this data was collected for. There are also gaps arising from changes in district names and boundaries.
- Essential and irrelevant data: Essential developmental parameters such as sanitation, education and nutrition are covered.

Encoding

Not much encoding is done here except for red cells showing the worst affected districts. Red calls attention quickly on light backgrounds.

Problems: The worst affected districts are at the top, but the difference between Maoist-affected and limited-impact districts is not visible. It is also not possible to tell for which developmental parameters a higher percentage signifies more development.

Improvements: Separation of Maoist-affected and limited-impact districts. Visualisation of the developmental parameters for comparison between districts.

Screenshot 2024-07-28 at 4 35 44 PM

Redesign

Iteration 1

Decided Chart 1 was a good introduction to the article. Planned to convert Table 4 into a choropleth map visualisation with different maps for each parameter.

Iteration 2

Developed the heatmap for Table 3. Annotated and changed colour of chart 1.

Iteration 3

Scatter plot for Table 4, with districts represented by dots and the sanitation and women's education parameters used as axes. The size of each point is determined by child nutrition (stunting).

Untitled29_20240728230150

The final redesigned visualisations for the tables are:

Table 3

Districtwise avg Maoist Deaths

Table 4

DW development in Chhattisgarh

Interactive visualisation: https://public.flourish.studio/visualisation/18885572/

This scatter plot compares affected and limited-impact districts. It shows the trend of affected districts performing worse than those with a limited impact.

Other data that can be used in this article includes:

  1. Data about Maoist attacks and periods of high conflict
  2. Government interventions and operations to combat Maoists in the area that led to Maoist killings
  3. Maoist surrenders and other related politics which may lead to a drop in Maoist influence
  4. Forested districts where Maoist influence is higher

Software used: Flourish, Excel, matplotlib, folium

Jigyasa2408 commented 1 month ago

Name - Jigyasa Roll No - 21f1001644

The Story - https://www.thehindu.com/data/diseases-with-higher-burden-in-asia-and-africa-lack-research-funding-data/article68319946.ece

  1. What is the story the author is trying to tell? The author wants to draw attention to the stark difference in research funding between neglected tropical diseases (NTDs) and illnesses like HIV/AIDS, TB, and malaria. The poorest populations are particularly affected by NTDs, which are highly prevalent in tropical and subtropical countries and receive relatively less money for research and development.

Key points :

  1. Funding Disparities: There is a significant funding gap between high-profile diseases (e.g., HIV/AIDS, tuberculosis, malaria) and neglected tropical diseases (NTDs), with NTDs receiving much less funding.
  2. Global Burden of NTDs: NTDs affect millions of people worldwide, primarily in poorer regions such as Asia and Africa, with India having the highest number of affected individuals.
  3. Underfunded Research: Despite the large number of people affected by NTDs, global research and development funding for these diseases is minimal compared to other diseases.
  4. Historical Funding Trends: Research funding has fluctuated over the years, with significant increases during events like the COVID-19 pandemic, but overall, NTDs remain underfunded.
  5. Call to Action: There is a need for increased funding and research to address the significant health burden imposed by NTDs and improve the lives of those affected.

The Data :

Type of data:

- Funding Data: Annual research & development funding for neglected tropical diseases, 2022. This data is expressed in US dollars, adjusted for inflation.
- Burden Data: Estimated number of people requiring treatment against neglected tropical diseases, 2021.
- Technological Focus: Distribution of global research funding across different technologies (vaccines, drugs, diagnostics, basic research) over time.

Extent of the data: Funding data for multiple diseases. Population data for countries requiring treatment. Historical funding trends from 2007 to 2022.

Dimensions of the data: Disease type. Amount of funding. Geographic distribution. Time series for technological funding.

Gaps in the data: No specific information on individual NTDs' funding distribution. Lack of detailed burden metrics beyond population numbers (e.g., mortality rates, disability-adjusted life years). Funding sources and allocations by different countries or organizations are not specified.

What data is essential? Disease type. Amount of funding.

Analyzing and Improving the Visual Encodings :

Original Visualizations Bar Chart: Showing annual research funding for various diseases in 2022. Map: Displaying the number of people requiring treatment for NTDs by country in 2021. Line Chart: Illustrating funding trends for different technological focuses from 2007 to 2022.

Problems and Improvements

Bar Chart:

Problem: The wide range of funding amounts makes the bars for lesser-funded diseases too small to read.

image

Improvement: Use a logarithmic scale or annotations to make the smaller funding amounts easier to compare, like the chart shown below.

Improved chart image
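To illustrate why a logarithmic axis helps here, the sketch below compares bar lengths on a linear and a log10 axis. The funding figures are hypothetical placeholders rather than the article's data, and `bar_lengths` and `plot_log_bars` are names introduced only for this example. The matplotlib helper is defined but not called, so the comparison runs with the standard library alone.

```python
import math

# Hypothetical placeholder funding figures (million US$), NOT from the article.
funding = {"Disease A": 1500.0, "Disease B": 600.0, "Disease C": 12.0, "Disease D": 1.5}


def bar_lengths(values, width=40, log=False, floor=1.0):
    """Bar lengths in character cells on a linear axis or a log10 axis.

    `floor` is the left edge of the log axis; values should be at least `floor`.
    """
    top = max(values)
    if log:
        span = math.log10(top) - math.log10(floor)
        return [round(width * (math.log10(v) - math.log10(floor)) / span) for v in values]
    return [round(width * v / top) for v in values]


def plot_log_bars(data):
    """Horizontal bar chart with a logarithmic x-axis (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    fig, ax = plt.subplots()
    ax.barh(list(data), list(data.values()))
    ax.set_xscale("log")
    ax.set_xlabel("R&D funding (million US$, log scale)")
    return fig


linear = bar_lengths(list(funding.values()))
logged = bar_lengths(list(funding.values()), log=True)
for name, lin, lg in zip(funding, linear, logged):
    print(f"{name:10s} linear {'#' * lin:40s} | log {'#' * lg}")
```

On the linear axis the two smallest placeholder values round to zero-length bars, while on the log axis they stay visible. The trade-off is that bar length on a log axis is no longer proportional to the amount, so value labels or annotations are still worth adding.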

Map:

Problem: The map has no markers or labels for the countries with the highest burdens, which makes it harder for viewers to engage with it.

Improvement: The interactive map at https://ourworldindata.org/grapher/interventions-ntds-sdgs highlights the name of each country on hover.

Improved map image

pranaydeep139 commented 1 month ago

Name: Sakiley Pranay Deep Roll Number: 21F1005603

Article: Which topics are India's researchers publishing papers on?

Story the author is trying to tell:

The author describes the prevailing research trends in scientific and technological fields, based on publications in the Web of Science database, and compares India's most researched topics with those of other nations such as the USA and China through visualizations. The article showcases the global scientific community's focus on topics such as coronavirus, artificial intelligence, clean energy, and nanotechnology, and how different countries prioritize and contribute to research in these areas. The author also seems to aim to show how research trends can guide policy decisions and resource allocation, highlighting the importance of this research in addressing global challenges and advancing technological progress.

Screenshot 2024-07-28 190123

Screenshot 2024-07-28 190216

Key insights from the data:

Research focus shift in India: There has been some shift in research focus in India in recent years (2019-2023) compared to the long-term (2004-2023). In recent years, there has been a focus on coronavirus research and nanotechnology (nano fluids and silver nanoparticles), whereas the long-term focus has been on nanotechnology again and wireless sensor networks.

Focus on coronavirus research: Coronavirus research has been a global focus in recent years, and it is the most published topic in the USA and India according to the data for 2019-2023. It is worth noting that coronavirus is not among China's five most researched topics.

Deep Learning prominence: Deep learning is a rising area of research prominence in all three countries, with China having the highest number of publications in this field.

China's strength in material science: China has a consistent focus on material science, as evidenced by their high publication rates in photocatalysis and supercapacitors in both recent and long-term datasets.

USA's focus shift in recent years: The USA has also shown a shift in research focus in recent years, moving away from long-term focus areas like HIV, parenting, and galaxies to focus on coronavirus research and deep learning.

Data used to tell the story:

Type of data: This is quantitative data (ratio data), focusing on the volume of research paper outputs. It consists of the number of published research papers categorized by topics in India, the USA, and China.

Extent of the data: Temporal coverage: The data covers two time periods: 2019-2023 and 2004-2023. Geographic coverage: The data is specific to three countries: India, the USA, and China.

Dimensions of the data: Country: India, USA, China. Time Periods: 2019-2023, 2004-2023. Topics: Specific research topics under which the highest number of papers were published.

Gaps in the data: Limited scope of topics: Only the top five topics are listed for each country, which may not fully represent the diversity of research areas.

Essential data: Number of papers: The exact count of research papers published on each topic. Topic names: Specific areas of research focus.

Irrelevant data: The data used in this story is concise and specific. Hence there is no irrelevant data.

Analysis of the original encoding:

In each visualization, three bar charts have been placed next to each other for comparison (one representing each country) that show the count of research papers published in the top five research fields.

Problems with this encoding: The major problem with this visualization is inconsistent scaling across the bar charts of different countries. For example, India's coronavirus bar (12629 publications) is shorter than the USA's gut microbiota bar (12435 publications), which can be misleading.

Improvement proposal for the redesign: This encoding can be redesigned in a better way by using a single bar graph with precise scaling for all three countries for each timeline, with different countries represented in different colours.
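The shared-scale idea can be sketched as follows. Only the two publication counts quoted above are used, and `shared_scale_lengths` is a name introduced for this illustration rather than anything from the original analysis. The point is simply that one global maximum, instead of one per country panel, keeps bar lengths proportional to the counts.

```python
# Only the two counts quoted above are used; the remaining topics would need
# to be transcribed from the article's charts.
panels = {
    "India": {"Coronavirus": 12629},
    "USA": {"Gut microbiota": 12435},
}


def shared_scale_lengths(data, width=50):
    """Scale every bar against one global maximum so lengths are comparable."""
    global_max = max(count for topics in data.values() for count in topics.values())
    return {
        (country, topic): round(width * count / global_max)
        for country, topics in data.items()
        for topic, count in topics.items()
    }


lengths = shared_scale_lengths(panels)
for (country, topic), n in lengths.items():
    print(f"{country:6s} {topic:15s} {'#' * n} {panels[country][topic]}")
```

With a common scale, India's 12629 can no longer be drawn shorter than the USA's 12435. If the three country panels are kept as separate matplotlib subplots, `plt.subplots(1, 3, sharex=True)` is one way to get the same effect.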

Screenshot 2024-07-28 232152

Screenshot 2024-07-28 232217

Thank you!

Sa-N98 commented 1 month ago

Name: Saranya Nayak Roll Number: 21f1005767

Which topics are India’s researchers publishing papers on?

Source: https://www.thehindu.com/data/which-topics-are-indias-researchers-publishing-papers-on/article68410121.ece

image image

What is the story the author is trying to tell?

The author highlights the research focus trends in India and globally over the last two decades, with a specific emphasis on the last five years. The story reveals that while coronavirus remains a predominant research topic worldwide, India's researchers are also significantly contributing to deep learning, photocatalysis, and nanotechnology. The article contrasts India's concentrated efforts in nanotechnology, partly driven by the Nano Mission, with China's focus on high-impact technological fields and the U.S.'s diverse research interests, particularly in health and social well-being.

What data he/she is using to tell the story?

The author uses data from the Web of Science, a scholarly publication database, to analyze research trends over the last 20 years and the last five years. This data includes the number of published papers on various topics by researchers from different countries, allowing for a comparative study of the most researched topics globally and within specific nations such as India, the U.S., and China. The article also references specific research outputs and projects, like India's Nano Mission, to illustrate the focus areas and the volume of research in these fields.

What data he/she is using to tell the story? Describe its details -- type of data, extent of the data, dimensions of the data, gaps in the data, what data is essential and what is irrelevant.

Type of Data: The data used in the article primarily consists of bibliometric information from the Web of Science database. This includes:

  1. Publication Counts: Number of research papers published on specific topics.
  2. Research Topics: Specific subjects that are the focus of these papers, such as coronavirus, deep learning, photocatalysis, nanotechnology, etc.
  3. Geographical Information: Country-specific data indicating the research output from India, the U.S., China, and other selected nations.

Extent of the Data: The extent of the data covers:

  1. Time Span: Research trends over the last 20 years and a focused look at the last five years.
  2. Countries: Comparative analysis of research output from multiple countries.
  3. Research Areas: Different scientific and technological fields, from health and AI to energy and materials science.

Dimensions of the Data:

  1. Temporal Dimension: Distribution of research publications over time.
  2. Geographical Dimension: Distribution across different countries.
  3. Topical Dimension: Focus areas and specific research topics.

Gaps in the Data:

  1. Detail on Methods: Lack of detailed methodological explanation on how the data was collected and analyzed.
  2. Incomplete Charts: Mention of potentially incomplete charts, affecting the clarity of the visual data representation.

Essential vs. Irrelevant Data:

Essential Data:

    • Number of publications on key research topics.
    • Country-specific research focus and output.
    • Trends over different time periods (last 20 years vs. last five years).

Irrelevant Data:

    • Specific names of researchers and their affiliations (unless discussing the impact of individual contributions).
    • Photos or unrelated visuals that do not add to the understanding of research trends.

How is it encoded, what problems are with it, and how have you attempted to improve it?

Encoding: The data is primarily encoded in textual descriptions and charts. The textual data includes numerical counts and comparisons between different countries and topics. Dividing the data into two time frames was a good decision by the author, as the last five years of data are biased towards coronavirus.

Problems with the Data: While the original graph contains all the information, it fails to give a clear comparison of the amount of research being done in India and other countries, or of how popular each top topic is in the other countries.

For example, in the case of deep learning, one has to look at different parts of the graph to reach a conclusion, because deep learning does not appear in the same position in each panel. The total amount of research being done in each country also cannot be determined.

Solution: Group the data for similar topics and countries and visualize it using a suitable chart.

Design Iterations :

image

Iteration 1: Used a treemap to group the data first by country and then by topic. Benefits: Clearly indicates the proportion of research each country devotes to different areas. Drawbacks: Hard to compare the research topics of different countries.

image

Iteration 2: A stacked bar chart helps in visualizing and comparing the total number of research papers and the total amount of research done by topic. Drawbacks: The topic colour coding is hard to navigate.

image

Iteration 3: A clustered bar chart helps in grouping the data by country and topic. One can easily compare the amount of effort given to different research areas in different countries. Drawbacks: This graph loses the total amount of research being done in a country.

Final:

image

image

varunbalaji1303 commented 1 month ago

Name: Varun Balaji Roll No: 21f1005027

Title: "2024 polls: How people in high and low income areas voted in Chennai’s Mylapore, T Nagar and other areas"

Link to story: https://www.thehindu.com/data/2024-polls-how-people-in-high-and-low-income-areas-voted-in-chennais-mylapore-t-nagar-and-other-areas/article68427083.ece

Objective:

The article examines voting patterns in Chennai's 2024 Lok Sabha elections, comparing high and low-income areas to understand if income levels influenced voting behaviour.

Main Points:

Analyzing the Data:

Types of Data:

Extent of Data:

Data Dimensions:

Gaps and Relevance:

Visual Encoding:

Current Encoding: Scatter plots showing vote shares (DMK in red, BJP in blue) across streets with varying guideline values.

Problems Identified:

Original Graph 1:

Screenshot 2024-07-28 at 10 48 56 PM

Original Graph 2:

Screenshot 2024-07-28 at 10 48 45 PM

Proposed Improvements:

Redesign Strategy:

Design Process Documentation:

- Step 1: Initial Analysis

  1. Review the original scatter plots.
  2. Note the distribution and spread of vote shares.

- Step 2: Simplifying Visualization

  1. Create a cleaner layout with concise labeling.
  2. Ensure the plot is easy to interpret at a glance.

- Step 3: Enhancing Accessibility

  1. Select a color palette accessible to color-blind individuals.
  2. Test different color schemes for effectiveness.

- Step 4: Adding Interactivity

  1. Utilize tools like Plotly or D3.js to add hover effects.
  2. Allow users to click on data points to reveal more details.

- Step 5: Including Additional Data

  1. Collect relevant demographic data.
  2. Use it to create multi-layered visualizations that provide deeper insights.

By following these steps, the redesigned visualization will maintain the original story's integrity while enhancing clarity and accessibility.
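Step 3 above could be approached as in the sketch below. It is not the code behind the redesigned graphs: it assumes the Okabe-Ito palette as the colour-blind-safe choice and keeps the original red/blue party association by picking its vermillion and blue entries. `PARTY_COLOURS`, `relative_luminance` and `contrast_ratio` are names introduced here. The contrast check is only a simple legibility test against a white background and does not simulate colour-vision deficiency.

```python
# Okabe-Ito palette, a widely used colour-blind-safe set of eight colours.
OKABE_ITO = {
    "black": "#000000",
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermillion": "#D55E00",
    "reddish purple": "#CC79A7",
}

# Assumption: keep the original red/blue association for the two parties,
# but swap in the vermillion and blue entries of the palette.
PARTY_COLOURS = {"DMK": OKABE_ITO["vermillion"], "BJP": OKABE_ITO["blue"]}


def relative_luminance(hex_colour):
    """Relative luminance of an sRGB colour '#RRGGBB' (WCAG 2 definition)."""
    channels = [int(hex_colour[i:i + 2], 16) / 255 for i in (1, 3, 5)]
    r, g, b = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(colour_a, colour_b):
    """WCAG contrast ratio between two colours, from 1 (none) to 21 (maximum)."""
    lighter, darker = sorted(
        (relative_luminance(colour_a), relative_luminance(colour_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


for party, colour in PARTY_COLOURS.items():
    print(party, colour, round(contrast_ratio(colour, "#FFFFFF"), 2))
```

Pairing the colour with a second channel, such as a different marker shape per party, would make the scatter plots readable even when printed in greyscale.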

Redesigned Graph 1:

Screenshot 2024-07-28 at 11 38 13 PM

Redesigned Graph 2:

Screenshot 2024-07-28 at 11 38 42 PM

Here are the redesigned scatter plots for the voting patterns in Nungambakkam and Kodambakkam, and Adyar and Guindy:

  1. Nungambakkam and Kodambakkam Voting Patterns
  2. Adyar and Guindy Voting Patterns

Key Takeaways:

Insights:

Future Improvements:

To further enhance the data story, integrating interactive elements and additional datasets, such as demographic information, can provide deeper insights. Interactive visualizations using tools like Plotly or D3.js can offer users the ability to explore data points in more detail, fostering a more engaging and informative experience.

Overall, the redesign maintains the original story's intent while significantly improving accessibility and readability, making the data more meaningful and actionable for a wider audience.

miqbal07 commented 1 month ago

Name - Iqbal Hossain Roll no - 21f2000965

Title - Redesigning the Budget Allocation Story for Andhra Pradesh

Source link - https://www.thehindu.com/news/national/andhra-pradesh/centre-allocated-50475-crore-which-is-about-4-of-the-national-budget-to-andhra-pradesh-murugan/article68453074.ece

Key Points of the Original Story

- Significant Budget Allocation: The Central Government has allocated approximately ₹50,475 crore to Andhra Pradesh for the fiscal year 2024-25, which is about 4% of the total national budget.

- Focus on Development Projects: Major projects funded include the construction of the capital city Amaravati (₹15,000 crore) and the Polavaram project, indicating strategic priorities for the state's development.

- Support for Andhra Pradesh Post-Bifurcation: The allocation is portrayed as essential support for Andhra Pradesh, which requires financial assistance due to challenges following the bifurcation from Telangana.

- Comparison with Other States: Implicitly, the story emphasizes that Andhra Pradesh's allocation is significant compared to other states, highlighting the Central Government's focus on the state’s development needs.

Analysis of Original Data and Encoding

Details of the Data:

Type of Data:

- Quantitative Data: Budget allocations in crores for Andhra Pradesh and other states. Specific allocations for key projects within Andhra Pradesh.
- Categorical Data: Names of states and projects.
- Percentage Data: Share of Andhra Pradesh’s allocation as a percentage of the national budget.

Extent of the Data:

The data includes budget figures for Andhra Pradesh and selected states, along with project-specific allocations within Andhra Pradesh.

Dimensions of the Data:

- State Allocations: Budget amounts allocated to Andhra Pradesh, Karnataka, Tamil Nadu, Kerala, and Telangana.
- Project Allocations: Breakdown of Andhra Pradesh’s allocation into key projects.

Gaps in the Data:

The story lacks a detailed breakdown of other states' allocations, limiting broader comparison. Historical data or trends over previous years are not included, which would provide additional context. There is limited information on how these funds fit into the overall budgets of Andhra Pradesh or the other states.

Original Encoding Analysis

Original Encoding:

- Textual Presentation: The data is primarily presented through descriptive text, with numerical values interspersed within paragraphs.
- Numeric Values: Figures for allocations and percentages are provided in text form, requiring readers to interpret and compare them manually.

Identifying Problems with Original Encoding

Problems with Original Encoding:

- Lack of Visual Clarity: The story does not use any visual aids or charts, which makes it difficult for readers to quickly understand and compare budget allocations.
- Inefficient Comparison: Text-based comparisons require readers to process multiple numbers mentally, making it hard to assess the relative importance of Andhra Pradesh’s allocation.
- Information Overload: The text-heavy format may overwhelm readers, making it challenging to extract key insights and priorities from the data.
- No Visual Storytelling: The absence of visual elements leads to a lack of narrative flow, which could guide readers through the data more effectively.

Proposed Improvements

Improvements:

- Incorporation of Visual Aids: Introduce visualizations such as bar charts and pie charts to represent the data, enhancing readability and comprehension.
- Emphasis on Key Insights: Use color coding and visual emphasis to highlight Andhra Pradesh’s budget allocation, making it stand out for easier comparison.
- Comparative Analysis: Include comparative data for other states, allowing readers to see Andhra Pradesh’s allocation in a broader context.
- Focus on Project Allocation: Visualize the distribution of funds among key projects within Andhra Pradesh, providing a clearer understanding of the state’s priorities.

Bar Chart: State Budget Allocations Purpose: Visualize the allocation amounts to different states, emphasizing Andhra Pradesh's significant share. image

Pie Chart: Project-wise Allocation in Andhra Pradesh Purpose: Display the breakdown of budget allocation among major projects within Andhra Pradesh.

image

Bar Chart: Percentage of National Budget Purpose: Highlight Andhra Pradesh's share of the national budget compared to other states.

image

Conclusion: By incorporating these visualizations, the redesigned story enhances comprehension and engagement, making the budget allocations more accessible and meaningful to the audience. The visual elements address the original story's shortcomings by providing clarity, facilitating comparisons, and highlighting key insights, thereby improving the overall narrative.

sujashaaa commented 1 month ago

Name: Sujasha S Roll: 21f3001115

Hindu IGD Datapoint for review

Title: Wealth Tax-Financed Green Deal in Indian Budget 2024

Publisher: SHOUVIK CHAKRABORTY, ROHIT AZAD

Publisher’s Data Summary

India's new NDA government is preparing to present its Budget 2024 (now done), focusing on critical issues like unemployment, climate change and inequality. A key proposal is an Indian Green Deal (IGD), financed entirely by the introduction of a wealth tax, designed to address climate change, inequality, and joblessness. The wealthiest 10% of Indians, through their consumption of carbon-intensive goods, have significantly contributed to rising emissions and inequality. The IGD aims to prioritize green energy, infrastructure, and the care economy (education and health), inspired by the Atmanirbhar package from 2020. It proposes an investment of 10% of GDP over ten years: 5% on infrastructure, 3% on the care economy, and 2% on green energy. This initiative could generate 38.6 million jobs, accounting for 8.2% of the labor force. Funding the IGD would necessitate a wealth tax of around 1.7%, potentially decreasing to 1.3% by 2032 due to the anticipated increase in wealth among the Indian elite. India’s aim to be a bold leader in climate action will be visible with this strategy.

Data & Charts

Carbon Emission Data (Chart 1):

- Type: Per capita carbon emissions.
- Extent: 3 decades (1990 to 2020). Comparative analysis of the top 10% of the Indian population versus an average Indian and a first-world citizen.
- Dimensions: Demography (Indian elite (top 10%) vs. normal Indian citizen vs. developed-country citizen)
- Essential: Yes, to connect how wealth inequality boosts per capita emissions and how it destroys the environment, to justify the IGD green tax.

image

Expenditure Data (Charts 2 and 3 above):

- Type: Expense ratio of the Indian elite vs. other citizens across commodities and overall budget categories
- Extent: Unclear, but believed to be 3 decades (1990 to 2020)
- Dimensions: Commodities, budget categories
- Essential: Yes, to illustrate how higher expenses lead to higher consumption, thus driving more carbon emissions, to justify the wealth tax.
- Irrelevant: It is not clear how elites consume compared with normal citizens in these categories.

Employment Creation Data (Chart 4 below, changed to Chart 5):

The author presents expenditure and employment created (Charts 3 and 4) in two separate charts, which makes them harder to compare. The charts also lack units of measurement, which makes it difficult to follow what each chart is about. To fix this, I propose a bubble chart that combines Charts 3 and 4 into one, with the size of each bubble indicating the size of the category: the larger the bubble, the larger the expenditure and employment opportunity.
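A rough sketch of the proposed bubble chart follows. It uses only the GDP shares quoted in the summary above. The per-category employment figures are not reproduced in this write-up, so `plot_bubbles` takes them as an argument and is not called here. `bubble_areas` and `plot_bubbles` are hypothetical helper names, and the plotting part assumes matplotlib is available.

```python
# GDP shares (percent) quoted in the summary above.
gdp_share = {"Infrastructure": 5.0, "Care economy": 3.0, "Green energy": 2.0}


def bubble_areas(values, max_area=2000.0):
    """Scale values so marker AREA (not radius) is proportional to the value."""
    top = max(values)
    return [max_area * v / top for v in values]


def plot_bubbles(shares, jobs):
    """x = expenditure share, y = jobs created, bubble area = jobs created.

    `jobs` maps category -> jobs figure and must be supplied by the caller.
    """
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    names = list(shares)
    x = [shares[n] for n in names]
    y = [jobs[n] for n in names]
    fig, ax = plt.subplots()
    # `s` in scatter() is the marker area in points squared.
    ax.scatter(x, y, s=bubble_areas(y), alpha=0.6)
    for name, xi, yi in zip(names, x, y):
        ax.annotate(name, (xi, yi))
    ax.set_xlabel("Expenditure (% of GDP)")
    ax.set_ylabel("Jobs created")
    return fig


areas = bubble_areas(list(gdp_share.values()))
print(dict(zip(gdp_share, areas)))
```

Scaling the area rather than the radius matters because readers judge bubbles by area. Scaling the radius would exaggerate the larger categories, and explicit axis labels with units address the missing-units problem noted above.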

image

Projected Wealth and Tax Rate Data (Charts 6 and 7 below):

- Type: Projected wealth increase forecast for the Indian elite, which becomes the target category for introducing a declining wealth tax rate to fund the IGD.
- Extent: One decade (2023 to 2032)
- Dimensions: The wealth and tax rate metrics are not broken out by any dimension.
- Essential: Yes, to introduce a wealth tax on increasing wealth to finance the green deal, and to project how much could be raised.

image

mnatasha1402 commented 1 month ago

Name: Natasha Mittal Roll no. 21f1005823 Analysis and Redesign of the Story: "Which topics are India’s researchers publishing papers on?"

Data source: https://www.thehindu.com/data/which-topics-are-indias-researchers-publishing-papers-on/article68410121.ece

Story Intent The author aims to inform readers about the research topics that Indian researchers are focusing on, comparing these trends with those of other countries like the U.S. and China. The story highlights the predominant research areas, especially in the context of global trends over the last 20 years and specifically in the last five years.

Data Description

  1. Type of Data:

    • Publication counts by research topics
    • Comparative data between different countries (India, U.S., China)
    • Time periods (last 20 years, last 5 years)
  2. Extent of the Data:

    • Covers two time periods: last 20 years and last 5 years
    • Includes multiple research topics
    • Compares three countries
  3. Dimensions of the Data:

    • Time (years)
    • Topics (e.g., Coronavirus, deep learning, photocatalysis)
    • Publication counts (number of papers published)
    • Geographic (countries: India, U.S., China)
  4. Gaps in the Data:

    • Detailed sources of the data (e.g., specific databases or institutions)
    • Breakdown of publication counts by specific sub-topics within broader categories
    • Qualitative insights on research impact or citations

Visual Encoding

  1. Current Encoding:

    • Bar charts and line graphs are used to display publication counts by topics and comparisons between countries.

image

  2. Problems with Current Encoding:

    • Limited interactivity
    • Potential clutter with multiple topics and countries in a single chart
    • Lack of detailed explanations or annotations for the visual elements
  3. Areas for Improvement:

    • Enhance interactivity to allow users to explore specific data points.
    • Simplify visualizations to reduce clutter and improve clarity.
    • Add detailed explanations and annotations to guide the reader through the visualizations.

2. Improved Visual Encoding

Simplified Charts:

Final Thoughts

The redesign aims to improve user engagement, clarity, and depth of the original story by leveraging interactive visualizations and additional contextual data. This approach ensures that the main intent of showcasing research publication trends in India is maintained while providing a richer and more insightful user experience.

pranam-pagi commented 1 month ago

Name: Pranam Premanand Pagi Roll No: 21f3002964

Original Article: MPs 27 times wealthier than an average urban household

Authors: Vignesh Radhakrishnan, Sambavi Parthasarathy

Story of article in view of authors

The article highlights the wealth disparity between Members of Parliament (MPs) in India and the average urban household. It points out that MPs are significantly wealthier, with the majority possessing assets far above the typical urban or rural household. This wealth concentration suggests that election candidacies are often limited to affluent individuals.

Data Used:

Gaps in the Data:

The data focuses primarily on wealth, without contextual information about income sources, liabilities, or the potential impact of these wealth disparities on electoral outcomes.

Chart 1 | The chart shows the median assets of winners and runners-up in 2019 and 2024.

Chart 1

Chart 2 | The chart shows the median assets of candidates of the major political parties in 2024.

Chart 2

Chart 3 | The chart shows the average value of household assets for different decile classes for rural and urban areas in 2019 (in ₹ 1000s).

Chart 3

Analysis of the Original Visualization and Design Considerations

Original Visualization:

The original story uses multiple charts to illustrate the wealth of MPs compared to urban households. These include:

Problems Identified:

Improvement Suggestions:

Redesigning the Story

Objectives:

  1. Clarify the wealth disparity between MPs and average households.
  2. Highlight the distribution of wealth among MPs and across different political parties.
  3. Provide context by comparing with household wealth in different deciles.

Steps in Redesign:

  1. Data Collection and Cleaning: Gather detailed data on MP assets, including minimum, maximum, and median values, and the distribution of wealth within parties.

  2. Visualization Selection:

    • Use a box plot to display the range and distribution of assets among MPs.
    • Utilize a bar chart to compare median assets across political parties.
    • Implement a line or area chart to show the trend in wealth disparity over time.
  3. Design and Layout:

    • Ensure charts are clearly labeled and include explanations for median vs. average values.
    • Use consistent and accessible color schemes.
    • Include annotations to highlight significant data points or exceptions.
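
As a rough illustration of the visualization steps above, the sketch below computes the values a box plot of MP assets would draw (minimum, quartiles, median, maximum) and the per-party medians a bar chart would compare. The party names and asset figures are made up for illustration and are not taken from the article; the helper `five_number_summary` is also hypothetical.

```python
from statistics import median, quantiles

# Hypothetical declared assets in crore rupees; NOT data from the article.
assets_by_party = {
    "Party A": [1.2, 3.5, 7.0, 12.4, 55.0],
    "Party B": [0.8, 2.1, 4.6, 9.9],
}


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


# Box plot input: the distribution of assets across all MPs.
all_assets = [v for values in assets_by_party.values() for v in values]
overall = five_number_summary(all_assets)

# Bar chart input: one median per party.
party_medians = {party: median(values) for party, values in assets_by_party.items()}

print("min, Q1, median, Q3, max:", overall)
print("median by party:", party_medians)
```

The median is used rather than the mean because a few very wealthy MPs would pull an average upward, which is the distinction the "median vs. average" labels are meant to explain.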
Fashmina123 commented 1 month ago

NAME: FASHMINA MOHAMED ROLL NO.: 21f3003099

Link: https://www.thehindu.com/data/2024-polls-how-people-in-high-and-low-income-areas-voted-in-chennais-mylapore-t-nagar-and-other-areas/article68427083.ece

The story: The article highlights voting patterns in Chennai's Lok Sabha elections, focusing on the differences between wealthier and lower-income areas. It suggests that the DMK has stronger support in lower-income areas, while the BJP tends to perform better in wealthier neighbourhoods.

Data Used:

Type: Voting percentages from polling stations across different areas in Chennai, categorised by the guideline value of properties in those areas. The guideline value acts as a proxy for the wealth level of the residents.

Data Extent: It covers 3 Lok Sabha constituencies in Chennai: North, Central and South.

Dimensions: The geographical areas and the vote shares of the DMK and the BJP in each area.

Data Gaps: The data covers only these 2 parties and leaves out all other political parties.

Original Encoding: The visualisation uses a dot plot to represent the vote shares of the parties across different areas.

    • Vertical axis: streets ordered by wealth (descending)
    • Horizontal axis: vote share percentage
    • Red dots: DMK
    • Blue dots: BJP

image: image

Problems in the original encoding:

    • The colour choices are standard but could have been more distinct, especially for colour-blind viewers.
    • The plot could have separated the income groups more clearly.
    • The annotations could have been more comprehensive to aid understanding.

Redesign Proposal:

    • Use distinct shapes or the party logos to mark the data points more clearly.
    • Introduce background shading to visually distinguish between high, medium and low income areas.
    • Add more annotations and a legend to explain the guideline values and income levels.
    • Create an interactive version to show detailed vote share percentages and other relevant data.
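
One possible way to derive the high, medium and low income bands for the background shading is sketched below. It ranks areas by guideline value and splits them into three equal-sized groups; this tercile rule is an assumption, not something the article specifies, and the area names, guideline values and vote shares are invented for illustration.

```python
from statistics import mean

# Hypothetical rows: (area, guideline value in rupees per sq. ft., DMK %, BJP %).
# None of these numbers come from the article.
areas = [
    ("Area A", 15000, 35.0, 42.0),
    ("Area B", 9000, 44.0, 30.0),
    ("Area C", 4000, 58.0, 15.0),
    ("Area D", 12000, 39.0, 38.0),
    ("Area E", 6000, 50.0, 22.0),
    ("Area F", 3000, 61.0, 12.0),
]

BAND_NAMES = ("low", "medium", "high")


def assign_bands(rows):
    """Rank areas by guideline value and split them into three equal-sized bands."""
    ranked = sorted(rows, key=lambda row: row[1])
    n = len(ranked)
    bands = {}
    for i, row in enumerate(ranked):
        # Integer arithmetic maps rank i to band index 0, 1 or 2.
        bands[row[0]] = BAND_NAMES[min(i * 3 // n, 2)]
    return bands


bands = assign_bands(areas)

# Average vote share per band, usable as an annotation on each shaded region.
band_means = {}
for name in BAND_NAMES:
    members = [row for row in areas if bands[row[0]] == name]
    band_means[name] = {
        "dmk": mean(row[2] for row in members),
        "bjp": mean(row[3] for row in members),
    }

print(bands)
print(band_means)
```

Fixed cut-offs on the guideline value would work equally well; equal-sized groups simply keep each shaded region visually comparable.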

The purpose of the redesign is to make the data more accessible and easier to understand.

DHIBIN-VIKASH commented 1 month ago

Name: Dhibin Vikash K P Roll No: 21f3001664

Article Title: Sikar, Namakkal, Kota: Select “coaching hubs” are host to many high scoring NEET-UG-2024 candidates

Main Story: The article highlights the cities with the highest share of candidates scoring 650+ and 700+ in NEET-UG 2024. Key cities include Sikar, Namakkal, Kottayam, Tanuku, Jhunjhunu, and Kurukshetra. Sikar leads with the highest percentages, while Namakkal stands out in Tamil Nadu due to its coaching institutes. Specific centers in these cities have high averages, with some centers showing anomalies.

Data Used:

Type of Data: Quantitative data on student scores from the NEET UG 2024 exams.

Extent of Data: The dataset includes scores from all candidates who appeared for NEET UG 2024, focusing on those scoring 650 and above.

Dimensions of the Data: The data includes candidate scores, cities, states, and specific educational centers.

Gaps in the Data: The article does not provide detailed demographic information or historical comparison data.

Relevance: Essential data includes candidate scores and their respective cities and centers. Irrelevant data might include unrelated demographic details not covered in the story.

Current Visual Encoding:

Chart 1: A scatter chart displaying the percentage of students scoring above 650 marks across different cities. image

Table 2: A table listing the top centers with the highest share of candidates scoring above 650 marks. image

Problems with Current Encoding:

Scatter Chart:

    • Cluttered Data Points: The scatter chart is disorganized, making it difficult to interpret what it is meant to convey. Hovering over a point shows only the percentage value, which seems redundant.
    • Color Gradient: The color gradient from 0 to 7.48% might not be intuitive for quick interpretation.

Table:

    • Limited Information: The table lists the top ten centers but does not provide additional context or comparisons.
    • Lack of Visual Appeal: The table could be clearer and would benefit from visual enhancements for better readability.

Redesigning the Visualization

Improvement Plan:

Create clearer, more intuitive charts that highlight key insights without overwhelming the viewer.

Use Effective Visual Elements: Use bar charts, heat maps, and annotated visualizations to emphasize the differences in marks scored across states and to show the district-wise breakdown.

Improve on the existing table with interactive charts that show the exam centers and the percentage of candidates scoring above 650 at each.

Redesigned Visualizations:

    • Bar Chart: Displaying the top cities with the highest share of students scoring above 650 marks.
    • Heat Map: Showing the concentration of high scores across different states.
    • Annotated Visuals: Highlighting the top-performing centers and cities.

Data on candidates' NEET scores was taken from the official websites, and the charts below were prepared from it.

Redesigned Heat Map for state-wise distribution of scores >650:

image

Redesigned Bar chart showing the centers along with the percentage of students who scored above 650:

image

Documentation

Original Story:

Link to the original story: https://www.thehindu.com/data/neet-ug-2024-data-reveals-top-cities-for-high-scoring-candidates-crucial-for-government-medical-college-admissions/article68441411.ece

Redesign Documentation:

Map Visualization: Encoding: Geographical distribution of high-scoring candidates. Color Gradient: Represents the percentage of candidates scoring above 650.

Bar chart: Each bar represents the percentage of students who scored above 650 at one of the top 5 centers in India. The length of each bar encodes that percentage, and the exact numbers are given for further reference.
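
The percentages behind the bar chart can be derived along the following lines. This is a minimal sketch with invented center names and scores, not the actual NEET data, and it assumes "above 650" means strictly greater than 650.

```python
THRESHOLD = 650

# Hypothetical scores per exam center; NOT real NEET-UG 2024 data.
scores_by_center = {
    "Center 1": [702, 655, 610, 480],
    "Center 2": [720, 701, 660, 640],
    "Center 3": [500, 450, 430, 620],
    "Center 4": [651, 300, 410, 520],
    "Center 5": [690, 680, 670, 665],
    "Center 6": [650, 649, 600, 580],
}


def share_above(scores, threshold=THRESHOLD):
    """Percentage of candidates scoring strictly above the threshold."""
    # Use >= instead of > if "650 and above" is the intended cut-off.
    return 100.0 * sum(score > threshold for score in scores) / len(scores)


shares = {center: share_above(scores) for center, scores in scores_by_center.items()}

# Keep the five centers with the largest share; these become the bars.
top5 = sorted(shares.items(), key=lambda item: item[1], reverse=True)[:5]

for center, pct in top5:
    print(f"{center}: {pct:.1f}%")
```

The same `shares` mapping, aggregated by state instead of by center, would supply the values for the map's color gradient.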