owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License

stacked bar negative values - data viz work 💄 #2860

Closed: veronikasamborska1994 closed this 1 week ago

owidbot commented 1 week ago
Quick links (staging server): Site Admin Wizard

Login: ssh owid@staging-site-stacked-bar-negative-values

chart-diff: ❌
  • 0/5 reviewed charts
    • Modified: 0/0
    • New: 0/5
data-diff: ❌ Found differences
```diff
= Dataset garden/artificial_intelligence/2024-06-19/epoch_compute_intensive_domain
  = Table epoch_compute_intensive_domain
    ~ Column cumulative_count (changed metadata)
+ +       Recommendation systems offer suggestions based on user preferences, prominently seen in online shopping and media streaming. For instance, Netflix's movie suggestions or Amazon's product recommendations are powered by algorithms that analyze users' preferences and past behaviors.
+ +     - |-
+ +       3D Modeling systems specialize in creating and manipulating 3D representations of objects, used in fields like architecture, engineering, and entertainment.
+ +     - |-
+ +       Audio systems process and generate sound, with applications in music composition, signal processing, and sound recognition.
+ +     - |-
+ +       Driving systems focus on autonomous vehicle technology, enabling cars to navigate and operate without human intervention.
+ +     - |-
+ +       Earth science systems utilize AI to study and model natural phenomena, assisting in weather forecasting and climate change studies.
+ +     - |-
+ +     - |-
+ +       Materials science systems apply AI to discover and design new materials with specific properties, speeding up material discovery.
+ +     - |-
+ +       Mathematics systems solve complex mathematical problems and perform symbolic calculations, aiding in theorem proving and optimization.
+ +     - Medicine systems enhance healthcare by improving diagnostics and treatment planning.
+ +     - Robotics systems combine AI with mechanical engineering to create autonomous robots for various industries.
+ +     - Search systems enhance search accuracy and relevance on the internet or within databases.
+ +     - Video systems analyze and generate video content, aiding in editing, surveillance, and content creation.
    ~ Column yearly_count (changed metadata)
+ +       Recommendation systems offer suggestions based on user preferences, prominently seen in online shopping and media streaming. For instance, Netflix's movie suggestions or Amazon's product recommendations are powered by algorithms that analyze users' preferences and past behaviors.
+ +     - |-
+ +       3D Modeling systems specialize in creating and manipulating 3D representations of objects, used in fields like architecture, engineering, and entertainment.
+ +     - |-
+ +       Audio systems process and generate sound, with applications in music composition, signal processing, and sound recognition.
+ +     - |-
+ +       Driving systems focus on autonomous vehicle technology, enabling cars to navigate and operate without human intervention.
+ +     - |-
+ +       Earth science systems utilize AI to study and model natural phenomena, assisting in weather forecasting and climate change studies.
+ +     - |-
+ +     - |-
+ +       Materials science systems apply AI to discover and design new materials with specific properties, speeding up material discovery.
+ +     - |-
+ +       Mathematics systems solve complex mathematical problems and perform symbolic calculations, aiding in theorem proving and optimization.
+ +     - Medicine systems enhance healthcare by improving diagnostics and treatment planning.
+ +     - Robotics systems combine AI with mechanical engineering to create autonomous robots for various industries.
+ +     - Search systems enhance search accuracy and relevance on the internet or within databases.
+ +     - Video systems analyze and generate video content, aiding in editing, surveillance, and content creation.
= Dataset garden/happiness/2023-03-20/happiness
  = Table happiness
= Dataset garden/news/2024-05-08/guardian_mentions
  = Table guardian_mentions
    ~ Dim country
+ +     New values: 32 / 2739 (1.17%)
          year  country
          2018  Saint Martin
          2018  Sint Maarten
          2019  Sint Maarten
          2016  Timor-Lest
          2022  Timor-Lest
- -     Removed values: 14 / 2739 (0.51%)
          year  country
          2018  Saint Martin (French part)
          2020  Saint Martin (French part)
          2022  Saint Martin (French part)
          2017  Sint Maarten (Dutch part)
          2020  Sint Maarten (Dutch part)
    ~ Dim year
+ +     New values: 32 / 2739 (1.17%)
          country       year
          Saint Martin  2018
          Sint Maarten  2018
          Sint Maarten  2019
          Timor-Lest    2016
          Timor-Lest    2022
- -     Removed values: 14 / 2739 (0.51%)
          country                     year
          Saint Martin (French part)  2018
          Saint Martin (French part)  2020
          Saint Martin (French part)  2022
          Sint Maarten (Dutch part)   2017
          Sint Maarten (Dutch part)   2020
    ~ Column num_pages_mentions (new data, changed data)
+ +     New values: 32 / 2739 (1.17%)
          country       year  num_pages_mentions
          Saint Martin  2018  20
          Sint Maarten  2018  6
          Sint Maarten  2019  1
          Timor-Lest    2016  49
          Timor-Lest    2022  114
- -     Removed values: 14 / 2739 (0.51%)
          country                     year  num_pages_mentions
          Saint Martin (French part)  2018  20
          Saint Martin (French part)  2020  43
          Saint Martin (French part)  2022  29
          Sint Maarten (Dutch part)   2017  21
          Sint Maarten (Dutch part)   2020  5
      ~ Changed values: 18 / 2739 (0.66%)
          country                       year  num_pages_mentions -  num_pages_mentions +
          East Timor                    2017  43
          East Timor                    2022  114
          Saint Martin (French part)    2023  35
          United States Virgin Islands  2016  27
          United States Virgin Islands  2020  21
    ~ Column num_pages_mentions_per_million (changed metadata, new data, changed data)
+ +     {}
- -     title: Number of pages in the Guardian that mention a country (per million people)
- -     description_short: Number of pages in the Guardian that mention a particular country, normalised by the population of the
- -       country.
- - origins: - - - producer: The Guardian - - title: Attention to each country in The Guardian's articles (raw mentions) - - description: |- - - Aggregate estimates on the number of entries that talk about each country and year. - - - - The data was obtained by querying The Guardian's Open Platform. - - - - An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed. - - citation_full: The Guardian, Open Platform - - url_main: https://open-platform.theguardian.com/access/ - - date_accessed: '2024-05-07' - - date_published: '2024' - - license: - - name: The Guardian terms of service - - url: https://www.theguardian.com/help/terms-of-service - - - producer: Various sources - - title: Population - - description: |- - - Our World in Data builds and maintains a long-run dataset on population by country, region, and for the world, based on various sources. 
- - - - You can find more information on these sources and how our time series is constructed on this page: https://ourworldindata.org/population-sources - - citation_full: |- - - The long-run data on population is based on various sources, described on this page: https://ourworldindata.org/population-sources - - attribution: Population based on various sources (2023) - - attribution_short: Population - - url_main: https://ourworldindata.org/population-sources - - date_accessed: '2023-03-31' - - date_published: '2023-03-31' - - license: - - name: CC BY 4.0 - - licenses: - - - name: Creative Commons BY 4.0 - - url: https://docs.google.com/document/d/1-RmthhS2EPMK_HIpnPctcXpB0n7ADSWnXa5Hb3PxNq4/edit?usp=sharing - - - name: CC BY 3.0 - - url: https://dataportaal.pbl.nl/downloads/HYDE/HYDE3.2/readme_release_HYDE3.2.1.txt - - - name: CC BY 3.0 IGO - - url: http://creativecommons.org/licenses/by/3.0/igo/ - - unit: pages per million people - - processing_level: major - - presentation: - - topic_tags: - - - Uncategorized - - description_processing: |- - - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names. - - - - - - 1. Get all country name variations: - - - Obtain all the country name variations using our standard name list. - - - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list. - - - - 2. For each country, obtain the number of pages using each set of name variations. Steps: - - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. 
The count of pages is in the property `response.total`. - - - - For mor details, please refer to the snapshot script. + + New values: 32 / 2739 (1.17%) country year num_pages_mentions_per_million Saint Martin 2018 NaN Sint Maarten 2018 NaN Sint Maarten 2019 NaN Timor-Lest 2016 NaN Timor-Lest 2022 NaN - - Removed values: 14 / 2739 (0.51%) country year num_pages_mentions_per_million Saint Martin (French part) 2018 590.336182 Saint Martin (French part) 2020 1319.949707 Saint Martin (French part) 2022 911.491089 Sint Maarten (Dutch part) 2017 501.181366 Sint Maarten (Dutch part) 2020 114.579033 ~ Changed values: 2551 / 2739 (93.14%) country year num_pages_mentions_per_million - num_pages_mentions_per_million + Cameroon 2022 10.173908 NaN Curacao 2014 17.853529 NaN Ecuador 2023 11.379571 NaN Latvia 2017 56.269985 NaN Tokelau 2016 1382.170044 NaN ~ Column num_pages_mentions_relative (changed metadata, new data, changed data) - - {} + + title: Share of pages in The Guardian that mention a country + + description_short: Share of pages in The Guardian that that mention a particular country. + + origins: + + - producer: The Guardian + + title: Attention to each country in The Guardian's articles (raw mentions) + + description: |- + + Aggregate estimates on the number of entries that talk about each country and year. + + + + The data was obtained by querying The Guardian's Open Platform. + + + + An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed. 
+ + citation_full: The Guardian, Open Platform + + url_main: https://open-platform.theguardian.com/access/ + + date_accessed: '2024-05-07' + + date_published: '2024' + + license: + + name: The Guardian terms of service + + url: https://www.theguardian.com/help/terms-of-service + + unit: pages per 100,000 pages + + presentation: + + topic_tags: + + - Uncategorized + + description_processing: |- + + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names. + + + + + + 1. Get all country name variations: + + - Obtain all the country name variations using our standard name list. + + - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list. + + + + 2. For each country, obtain the number of pages using each set of name variations. Steps: + + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`. + + + + For mor details, please refer to the snapshot script. 
+ + New values: 32 / 2739 (1.17%) country year num_pages_mentions_relative Saint Martin 2018 11.948002 Sint Maarten 2018 3.584401 Sint Maarten 2019 0.607371 Timor-Lest 2016 22.733494 Timor-Lest 2022 58.200977 - - Removed values: 14 / 2739 (0.51%) country year num_pages_mentions_relative Saint Martin (French part) 2018 NaN Saint Martin (French part) 2020 NaN Saint Martin (French part) 2022 NaN Sint Maarten (Dutch part) 2017 NaN Sint Maarten (Dutch part) 2020 NaN ~ Changed values: 2643 / 2739 (96.50%) country year num_pages_mentions_relative - num_pages_mentions_relative + Eswatini 2020 NaN 11.738843 Libya 2017 NaN 382.171539 Mayotte 2014 NaN 3.108569 Saudi Arabia 2018 NaN 632.049316 Yemen 2022 NaN 118.444092 ~ Column num_pages_tags (new data) + + New values: 32 / 2739 (1.17%) country year num_pages_tags Saint Martin 2018 Sint Maarten 2018 Sint Maarten 2019 Timor-Lest 2016 Timor-Lest 2022 - - Removed values: 14 / 2739 (0.51%) country year num_pages_tags Saint Martin (French part) 2018 Saint Martin (French part) 2020 Saint Martin (French part) 2022 Sint Maarten (Dutch part) 2017 Sint Maarten (Dutch part) 2020 ~ Column num_pages_tags_per_million (changed metadata, new data, changed data) + + {} - - title: Number of pages in the Guardian with a country tag (per million people) - - description_short: |- - - Number of pages in the Guardian that are tagged with a country-related label, normalised by the population of the country. - - origins: - - - producer: The Guardian - - title: Attention to each country in The Guardian's articles (tags) - - description: |- - - Aggregate estimates on the number of entries that talk about each country and year. - - - - The data was obtained by querying The Guardian's Open Platform. - - - - An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. 
Nonetheless, this is not a perfect method and some entries might be missed. - - citation_full: The Guardian, Open Platform - - url_main: https://open-platform.theguardian.com/access/ - - date_accessed: '2024-05-07' - - date_published: '2024' - - license: - - name: The Guardian terms of service - - url: https://www.theguardian.com/help/terms-of-service - - - producer: Various sources - - title: Population - - description: |- - - Our World in Data builds and maintains a long-run dataset on population by country, region, and for the world, based on various sources. - - - - You can find more information on these sources and how our time series is constructed on this page: https://ourworldindata.org/population-sources - - citation_full: |- - - The long-run data on population is based on various sources, described on this page: https://ourworldindata.org/population-sources - - attribution: Population based on various sources (2023) - - attribution_short: Population - - url_main: https://ourworldindata.org/population-sources - - date_accessed: '2023-03-31' - - date_published: '2023-03-31' - - license: - - name: CC BY 4.0 - - licenses: - - - name: Creative Commons BY 4.0 - - url: https://docs.google.com/document/d/1-RmthhS2EPMK_HIpnPctcXpB0n7ADSWnXa5Hb3PxNq4/edit?usp=sharing - - - name: CC BY 3.0 - - url: https://dataportaal.pbl.nl/downloads/HYDE/HYDE3.2/readme_release_HYDE3.2.1.txt - - - name: CC BY 3.0 IGO - - url: http://creativecommons.org/licenses/by/3.0/igo/ - - unit: pages per million people - - processing_level: major - - presentation: - - topic_tags: - - - Uncategorized - - description_processing: |- - - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags. - - - - - - 1. 
Obtain all tags that concern a country: - - - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use. - - - We work with a list of ~240 countries. - - - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's. - - - - 2. For each country, obtain the number of pages using each set of tags. Steps: - - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`. - - - - For mor details, please refer to the snapshot script. + + New values: 32 / 2739 (1.17%) country year num_pages_tags_per_million Saint Martin 2018 NaN Sint Maarten 2018 NaN Sint Maarten 2019 NaN Timor-Lest 2016 NaN Timor-Lest 2022 NaN - - Removed values: 14 / 2739 (0.51%) country year num_pages_tags_per_million Saint Martin (French part) 2018 NaN Saint Martin (French part) 2020 NaN Saint Martin (French part) 2022 NaN Sint Maarten (Dutch part) 2017 NaN Sint Maarten (Dutch part) 2020 NaN ~ Changed values: 2547 / 2739 (92.99%) country year num_pages_tags_per_million - num_pages_tags_per_million + Canada 2014 8.840655 NaN Gibraltar 2018 826.269226 NaN Japan 2016 3.078889 NaN Latvia 2016 6.080638 NaN Tokelau 2013 0.000000 NaN ~ Column num_pages_tags_relative (changed metadata, new data, changed data) - - {} + + title: Share of pages in the Guardian with a country tag + + description_short: Share of pages in The Guardian that are tagged with a country-related label. 
+ + origins: + + - producer: The Guardian + + title: Attention to each country in The Guardian's articles (tags) + + description: |- + + Aggregate estimates on the number of entries that talk about each country and year. + + + + The data was obtained by querying The Guardian's Open Platform. + + + + An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. Nonetheless, this is not a perfect method and some entries might be missed. + + citation_full: The Guardian, Open Platform + + url_main: https://open-platform.theguardian.com/access/ + + date_accessed: '2024-05-07' + + date_published: '2024' + + license: + + name: The Guardian terms of service + + url: https://www.theguardian.com/help/terms-of-service + + unit: pages per 100,000 pages + + presentation: + + topic_tags: + + - Uncategorized + + description_processing: |- + + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags. + + + + + + 1. Obtain all tags that concern a country: + + - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use. + + - We work with a list of ~240 countries. + + - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's. + + + + 2. For each country, obtain the number of pages using each set of tags. 
Steps: + + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`. + + + + For mor details, please refer to the snapshot script. + + New values: 32 / 2739 (1.17%) country year num_pages_tags_relative Saint Martin 2018 NaN Sint Maarten 2018 NaN Sint Maarten 2019 NaN Timor-Lest 2016 NaN Timor-Lest 2022 NaN - - Removed values: 14 / 2739 (0.51%) country year num_pages_tags_relative Saint Martin (French part) 2018 NaN Saint Martin (French part) 2020 NaN Saint Martin (French part) 2022 NaN Sint Maarten (Dutch part) 2017 NaN Sint Maarten (Dutch part) 2020 NaN ~ Changed values: 2634 / 2739 (96.17%) country year num_pages_tags_relative - num_pages_tags_relative + Argentina 2013 NaN 226.496490 Aruba 2016 NaN 0.619191 Central African Republic 2017 NaN 10.622900 Congo 2018 NaN 57.331131 Taiwan 2020 NaN 50.219105 ~ Column relative_pages_mentions (changed metadata, new data, changed data) + + {} - - title: Share of pages in The Guardian that mention a country - - description_short: Share of pages in The Guardian that that mention a particular country. - - origins: - - - producer: The Guardian - - title: Attention to each country in The Guardian's articles (raw mentions) - - description: |- - - Aggregate estimates on the number of entries that talk about each country and year. - - - - The data was obtained by querying The Guardian's Open Platform. - - - - An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed. 
- - citation_full: The Guardian, Open Platform - - url_main: https://open-platform.theguardian.com/access/ - - date_accessed: '2024-05-07' - - date_published: '2024' - - license: - - name: The Guardian terms of service - - url: https://www.theguardian.com/help/terms-of-service - - unit: pages per 100,000 pages - - presentation: - - topic_tags: - - - Uncategorized - - description_processing: |- - - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names. - - - - - - 1. Get all country name variations: - - - Obtain all the country name variations using our standard name list. - - - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list. - - - - 2. For each country, obtain the number of pages using each set of name variations. Steps: - - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`. - - - - For mor details, please refer to the snapshot script. 
+ + New values: 32 / 2739 (1.17%) country year relative_pages_mentions Saint Martin 2018 NaN Sint Maarten 2018 NaN Sint Maarten 2019 NaN Timor-Lest 2016 NaN Timor-Lest 2022 NaN - - Removed values: 14 / 2739 (0.51%) country year relative_pages_mentions Saint Martin (French part) 2018 11.948002 Saint Martin (French part) 2020 22.944101 Saint Martin (French part) 2022 14.805511 Sint Maarten (Dutch part) 2017 12.196963 Sint Maarten (Dutch part) 2020 2.667919 ~ Changed values: 2661 / 2739 (97.15%) country year relative_pages_mentions - relative_pages_mentions + Bahamas 2015 45.872166 NaN El Salvador 2017 56.919163 NaN Guernsey 2022 28.079418 NaN Liechtenstein 2021 40.499123 NaN Saudi Arabia 2023 686.867859 NaN ~ Column relative_pages_mentions_excluded (changed metadata, new data, changed data) + + {} - - title: Share of pages in The Guardian that mention a country (excludes UK, US, Australia) - - description_short: Share of pages in The Guardian that are tagged with a country-related label. Excludes US, UK and Australia. - - origins: - - - producer: The Guardian - - title: Attention to each country in The Guardian's articles (raw mentions) - - description: |- - - Aggregate estimates on the number of entries that talk about each country and year. - - - - The data was obtained by querying The Guardian's Open Platform. - - - - An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed. 
- - citation_full: The Guardian, Open Platform - - url_main: https://open-platform.theguardian.com/access/ - - date_accessed: '2024-05-07' - - date_published: '2024' - - license: - - name: The Guardian terms of service - - url: https://www.theguardian.com/help/terms-of-service - - unit: pages per 100,000 pages - - presentation: - - topic_tags: - - - Uncategorized - - description_processing: |- - - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names. - - - - - - 1. Get all country name variations: - - - Obtain all the country name variations using our standard name list. - - - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list. - - - - 2. For each country, obtain the number of pages using each set of name variations. Steps: - - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`. - - - - For mor details, please refer to the snapshot script. - - - - This estimates exclude the UK, US, and Australia from the total number of pages. The reason for this is because the Guardian is a UK-based newspaper, and it is expected to have a higher number of articles about the UK, US, and Australia. 
+ + New values: 32 / 2739 (1.17%) country year relative_pages_mentions_excluded Saint Martin 2018 NaN Sint Maarten 2018 NaN Sint Maarten 2019 NaN Timor-Lest 2016 NaN Timor-Lest 2022 NaN - - Removed values: 14 / 2739 (0.51%) country year relative_pages_mentions_excluded Saint Martin (French part) 2018 20.384451 Saint Martin (French part) 2020 39.068180 Saint Martin (French part) 2022 24.335392 Sint Maarten (Dutch part) 2017 21.586283 Sint Maarten (Dutch part) 2020 4.542811 ~ Changed values: 2628 / 2739 (95.95%) country year relative_pages_mentions_excluded - relative_pages_mentions_excluded + Bouvet Island 2019 0.000000 NaN Jamaica 2017 328.933838 NaN Rwanda 2019 151.658157 NaN Saudi Arabia 2023 1138.211426 NaN Vatican 2013 450.137909 NaN ~ Column relative_pages_tags (changed metadata, new data, changed data) + + {} - - title: Share of pages in the Guardian with a country tag - - description_short: Share of pages in The Guardian that are tagged with a country-related label. - - origins: - - - producer: The Guardian - - title: Attention to each country in The Guardian's articles (tags) - - description: |- - - Aggregate estimates on the number of entries that talk about each country and year. - - - - The data was obtained by querying The Guardian's Open Platform. - - - - An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. Nonetheless, this is not a perfect method and some entries might be missed. 
- - citation_full: The Guardian, Open Platform - - url_main: https://open-platform.theguardian.com/access/ - - date_accessed: '2024-05-07' - - date_published: '2024' - - license: - - name: The Guardian terms of service - - url: https://www.theguardian.com/help/terms-of-service - - unit: pages per 100,000 pages - - presentation: - - topic_tags: - - - Uncategorized - - description_processing: |- - - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags. - - - - - - 1. Obtain all tags that concern a country: - - - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use. - - - We work with a list of ~240 countries. - - - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's. - - - - 2. For each country, obtain the number of pages using each set of tags. Steps: - - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`. - - - - For mor details, please refer to the snapshot script. 
+ + New values: 32 / 2739 (1.17%) country year relative_pages_tags Saint Martin 2018 NaN Sint Maarten 2018 NaN Sint Maarten 2019 NaN Timor-Lest 2016 NaN Timor-Lest 2022 NaN - - Removed values: 14 / 2739 (0.51%) country year relative_pages_tags Saint Martin (French part) 2018 NaN Saint Martin (French part) 2020 NaN Saint Martin (French part) 2022 NaN Sint Maarten (Dutch part) 2017 NaN Sint Maarten (Dutch part) 2020 NaN ~ Changed values: 2634 / 2739 (96.17%) country year relative_pages_tags - relative_pages_tags + Argentina 2013 226.496490 NaN Aruba 2016 0.619191 NaN Central African Republic 2017 10.622900 NaN Congo 2018 57.331131 NaN Taiwan 2020 50.219105 NaN ~ Column relative_pages_tags_excluded (changed metadata, new data, changed data) + + {} - - title: Share of pages in the Guardian with a country tag (excludes UK, US, Australia) - - description_short: Share of pages in The Guardian that are tagged with a country-related label. Excludes US, UK and Australia. - - origins: - - - producer: The Guardian - - title: Attention to each country in The Guardian's articles (tags) - - description: |- - - Aggregate estimates on the number of entries that talk about each country and year. - - - - The data was obtained by querying The Guardian's Open Platform. - - - - An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. Nonetheless, this is not a perfect method and some entries might be missed. 
- - citation_full: The Guardian, Open Platform - - url_main: https://open-platform.theguardian.com/access/ - - date_accessed: '2024-05-07' - - date_published: '2024' - - license: - - name: The Guardian terms of service - - url: https://www.theguardian.com/help/terms-of-service - - unit: pages per 100,000 pages - - presentation: - - topic_tags: - - - Uncategorized - - description_processing: |- - - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags. - - - - - - 1. Obtain all tags that concern a country: - - - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use. - - - We work with a list of ~240 countries. - - - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's. - - - - 2. For each country, obtain the number of pages using each set of tags. Steps: - - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`. - - - - For mor details, please refer to the snapshot script. - - - - This estimates exclude the UK, US, and Australia from the total number of pages. The reason for this is because the Guardian is a UK-based newspaper, and it is expected to have a higher number of articles about the UK, US, and Australia. 
+ +     New values: 32 / 2739 (1.17%)
          country       year  relative_pages_tags_excluded
          Saint Martin  2018  NaN
          Sint Maarten  2018  NaN
          Sint Maarten  2019  NaN
          Timor-Lest    2016  NaN
          Timor-Lest    2022  NaN
- -     Removed values: 14 / 2739 (0.51%)
          country                     year  relative_pages_tags_excluded
          Saint Martin (French part)  2018  NaN
          Saint Martin (French part)  2020  NaN
          Saint Martin (French part)  2022  NaN
          Sint Maarten (Dutch part)   2017  NaN
          Sint Maarten (Dutch part)   2020  NaN
      ~ Changed values: 2601 / 2739 (94.96%)
          country              year  relative_pages_tags_excluded -  relative_pages_tags_excluded +
          India                2023  2470.757324                     NaN
          Moldova              2014  94.517960                       NaN
          Singapore            2020  164.049683                      NaN
          Trinidad and Tobago  2023  82.084961                       NaN
          Turkey               2019  1597.967285                     NaN

Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
```
Automatically updated datasets matching _weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk_ are not included
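
The "New values" and "Removed values" counts in the log above can be read as set differences over the table's (country, year) index, which is why a country rename shows up as both new and removed rows. The following is a minimal sketch of that reading, not the `etl diff` implementation: the function names are hypothetical, and the denominator (2739 in the log) is passed in explicitly because the log does not say how it is derived.

```python
from typing import Iterable, Set, Tuple

Key = Tuple[str, int]  # (country, year); assumed index shape for this illustration


def diff_index(old_keys: Iterable[Key], new_keys: Iterable[Key]) -> dict:
    """Split index keys into those only in the new table and those only in the old one."""
    old: Set[Key] = set(old_keys)
    new: Set[Key] = set(new_keys)
    return {"new": new - old, "removed": old - new}


def format_share(count: int, total: int) -> str:
    """Render a count the way the log prints it, e.g. '32 / 2739 (1.17%)'."""
    return f"{count} / {total} ({count / total * 100:.2f}%)"


if __name__ == "__main__":
    # A renamed country appears once under "new" and once under "removed".
    old = {("Saint Martin (French part)", 2018), ("Spain", 2018)}
    new = {("Saint Martin", 2018), ("Spain", 2018)}
    print(diff_index(old, new))
    print(format_share(32, 2739))
```

Under this reading, the 32 new and 14 removed rows are consistent with entities such as "Saint Martin (French part)" being replaced by "Saint Martin" in the new version of the table.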

Edited: 2024-06-21 11:35:22 UTC Execution time: 3.61 seconds
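
The `description_processing` metadata in the data-diff describes one Guardian search query per country and year, with the page count read from `response.total`, and the relative indicators use the unit "pages per 100,000 pages". The sketch below illustrates that arithmetic and is not the snapshot script: the helper names are hypothetical, the `api-key` parameter is the Open Platform's key parameter rather than something quoted in the log, and the yearly total of 167,392 pages is an assumption back-computed from the Saint Martin 2018 row (20 mentions, 11.948002).

```python
from urllib.parse import urlencode

# Endpoint as quoted in the metadata shown in the diff.
SEARCH_ENDPOINT = "https://content.guardianapis.com/search"


def build_year_query(query: str, year: int, api_key: str) -> str:
    """Build a search URL restricted to one calendar year (hypothetical helper)."""
    params = {
        "q": query,
        "from-date": f"{year}-01-01",
        "to-date": f"{year}-12-31",
        "api-key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def page_count(payload: dict) -> int:
    """Read the number of matching pages from a decoded JSON response."""
    return int(payload["response"]["total"])


def per_100k_pages(pages: int, total_pages: int) -> float:
    """Express a page count as pages per 100,000 pages published that year."""
    return pages / total_pages * 100_000


if __name__ == "__main__":
    print(build_year_query("spain", 2020, api_key="YOUR_KEY"))
    # Assumed 2018 total, inferred from the diff values rather than stated in them.
    print(round(per_100k_pages(20, 167_392), 6))
```

With the assumed total, 20 and 6 mentions reproduce the 2018 `num_pages_mentions_relative` values for Saint Martin and Sint Maarten to within rounding.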