data-diff: ❌ Found differences
```diff
= Dataset garden/news/2024-05-08/guardian_mentions
= Table guardian_mentions
~ Dim country
+ + New values: 14 / 2721 (0.51%)
year country
2018 Saint Martin (French part)
2020 Saint Martin (French part)
2022 Saint Martin (French part)
2017 Sint Maarten (Dutch part)
2020 Sint Maarten (Dutch part)
- - Removed values: 32 / 2721 (1.18%)
year country
2018 Saint Martin
2018 Sint Maarten
2019 Sint Maarten
2016 Timor-Lest
2022 Timor-Lest
~ Dim year
+ + New values: 14 / 2721 (0.51%)
country year
Saint Martin (French part) 2018
Saint Martin (French part) 2020
Saint Martin (French part) 2022
Sint Maarten (Dutch part) 2017
Sint Maarten (Dutch part) 2020
- - Removed values: 32 / 2721 (1.18%)
country year
Saint Martin 2018
Sint Maarten 2018
Sint Maarten 2019
Timor-Lest 2016
Timor-Lest 2022
~ Column num_pages_mentions (new data, changed data)
+ + New values: 14 / 2721 (0.51%)
country year num_pages_mentions
Saint Martin (French part) 2018 20
Saint Martin (French part) 2020 43
Saint Martin (French part) 2022 29
Sint Maarten (Dutch part) 2017 21
Sint Maarten (Dutch part) 2020 5
- - Removed values: 32 / 2721 (1.18%)
country year num_pages_mentions
Saint Martin 2018 20
Sint Maarten 2018 6
Sint Maarten 2019 1
Timor-Lest 2016 49
Timor-Lest 2022 114
~ Changed values: 18 / 2721 (0.66%)
country year num_pages_mentions - num_pages_mentions +
East Timor 2017 43
East Timor 2022 114
Saint Martin (French part) 2023 35
United States Virgin Islands 2016 27
United States Virgin Islands 2020 21
~ Column num_pages_mentions_per_million (changed metadata, new data, changed data)
- - {}
+ + title: Number of pages in the Guardian that mention a country (per million people)
+ + description_short: Number of pages in the Guardian that mention a particular country, normalised by the population of the
+ + country.
+ + origins:
+ + - producer: The Guardian
+ + title: Attention to each country in The Guardian's articles (raw mentions)
+ + description: |-
+ + Aggregate estimates on the number of entries that talk about each country and year.
+ +
+ + The data was obtained by querying The Guardian's Open Platform.
+ +
+ + An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed.
+ + citation_full: The Guardian, Open Platform
+ + url_main: https://open-platform.theguardian.com/access/
+ + date_accessed: '2024-05-07'
+ + date_published: '2024'
+ + license:
+ + name: The Guardian terms of service
+ + url: https://www.theguardian.com/help/terms-of-service
+ + - producer: Various sources
+ + title: Population
+ + description: |-
+ + Our World in Data builds and maintains a long-run dataset on population by country, region, and for the world, based on various sources.
+ +
+ + You can find more information on these sources and how our time series is constructed on this page: https://ourworldindata.org/population-sources
+ + citation_full: |-
+ + The long-run data on population is based on various sources, described on this page: https://ourworldindata.org/population-sources
+ + attribution: Population based on various sources (2023)
+ + attribution_short: Population
+ + url_main: https://ourworldindata.org/population-sources
+ + date_accessed: '2023-03-31'
+ + date_published: '2023-03-31'
+ + license:
+ + name: CC BY 4.0
+ + licenses:
+ + - name: Creative Commons BY 4.0
+ + url: https://docs.google.com/document/d/1-RmthhS2EPMK_HIpnPctcXpB0n7ADSWnXa5Hb3PxNq4/edit?usp=sharing
+ + - name: CC BY 3.0
+ + url: https://dataportaal.pbl.nl/downloads/HYDE/HYDE3.2/readme_release_HYDE3.2.1.txt
+ + - name: CC BY 3.0 IGO
+ + url: http://creativecommons.org/licenses/by/3.0/igo/
+ + unit: pages per million people
+ + processing_level: major
+ + presentation:
+ + topic_tags:
+ + - Uncategorized
+ + description_processing: |-
+ + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names.
+ +
+ +
+ + 1. Get all country name variations:
+ + - Obtain all the country name variations using our standard name list.
+ + - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list.
+ +
+ + 2. For each country, obtain the number of pages using each set of name variations. Steps:
+ + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
+ +
+ + For mor details, please refer to the snapshot script.
+ + New values: 14 / 2721 (0.51%)
country year num_pages_mentions_per_million
Saint Martin (French part) 2018 590.336182
Saint Martin (French part) 2020 1319.949707
Saint Martin (French part) 2022 911.491089
Sint Maarten (Dutch part) 2017 501.181366
Sint Maarten (Dutch part) 2020 114.579033
- - Removed values: 32 / 2721 (1.18%)
country year num_pages_mentions_per_million
Saint Martin 2018 NaN
Sint Maarten 2018 NaN
Sint Maarten 2019 NaN
Timor-Lest 2016 NaN
Timor-Lest 2022 NaN
~ Changed values: 2551 / 2721 (93.75%)
country year num_pages_mentions_per_million - num_pages_mentions_per_million +
Cameroon 2022 NaN 10.173908
Curacao 2014 NaN 17.853529
Ecuador 2023 NaN 11.379571
Latvia 2017 NaN 56.269985
Tokelau 2016 NaN 1382.170044
~ Column num_pages_mentions_relative (changed metadata, new data, changed data)
+ + {}
- - title: Share of pages in The Guardian that mention a country
- - description_short: Share of pages in The Guardian that that mention a particular country.
- - origins:
- - - producer: The Guardian
- - title: Attention to each country in The Guardian's articles (raw mentions)
- - description: |-
- - Aggregate estimates on the number of entries that talk about each country and year.
- -
- - The data was obtained by querying The Guardian's Open Platform.
- -
- - An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed.
- - citation_full: The Guardian, Open Platform
- - url_main: https://open-platform.theguardian.com/access/
- - date_accessed: '2024-05-07'
- - date_published: '2024'
- - license:
- - name: The Guardian terms of service
- - url: https://www.theguardian.com/help/terms-of-service
- - unit: pages per 100,000 pages
- - presentation:
- - topic_tags:
- - - Uncategorized
- - description_processing: |-
- - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names.
- -
- -
- - 1. Get all country name variations:
- - - Obtain all the country name variations using our standard name list.
- - - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list.
- -
- - 2. For each country, obtain the number of pages using each set of name variations. Steps:
- - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
- -
- - For mor details, please refer to the snapshot script.
+ + New values: 14 / 2721 (0.51%)
country year num_pages_mentions_relative
Saint Martin (French part) 2018 NaN
Saint Martin (French part) 2020 NaN
Saint Martin (French part) 2022 NaN
Sint Maarten (Dutch part) 2017 NaN
Sint Maarten (Dutch part) 2020 NaN
- - Removed values: 32 / 2721 (1.18%)
country year num_pages_mentions_relative
Saint Martin 2018 11.948002
Sint Maarten 2018 3.584401
Sint Maarten 2019 0.607371
Timor-Lest 2016 22.733494
Timor-Lest 2022 58.200977
~ Changed values: 2643 / 2721 (97.13%)
country year num_pages_mentions_relative - num_pages_mentions_relative +
Eswatini 2020 11.738843 NaN
Libya 2017 382.171539 NaN
Mayotte 2014 3.108569 NaN
Saudi Arabia 2018 632.049316 NaN
Yemen 2022 118.444092 NaN
~ Column num_pages_tags (new data)
+ + New values: 14 / 2721 (0.51%)
country year num_pages_tags
Saint Martin (French part) 2018
Saint Martin (French part) 2020
Saint Martin (French part) 2022
Sint Maarten (Dutch part) 2017
Sint Maarten (Dutch part) 2020
- - Removed values: 32 / 2721 (1.18%)
country year num_pages_tags
Saint Martin 2018
Sint Maarten 2018
Sint Maarten 2019
Timor-Lest 2016
Timor-Lest 2022
~ Column num_pages_tags_per_million (changed metadata, new data, changed data)
- - {}
+ + title: Number of pages in the Guardian with a country tag (per million people)
+ + description_short: |-
+ + Number of pages in the Guardian that are tagged with a country-related label, normalised by the population of the country.
+ + origins:
+ + - producer: The Guardian
+ + title: Attention to each country in The Guardian's articles (tags)
+ + description: |-
+ + Aggregate estimates on the number of entries that talk about each country and year.
+ +
+ + The data was obtained by querying The Guardian's Open Platform.
+ +
+ + An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. Nonetheless, this is not a perfect method and some entries might be missed.
+ + citation_full: The Guardian, Open Platform
+ + url_main: https://open-platform.theguardian.com/access/
+ + date_accessed: '2024-05-07'
+ + date_published: '2024'
+ + license:
+ + name: The Guardian terms of service
+ + url: https://www.theguardian.com/help/terms-of-service
+ + - producer: Various sources
+ + title: Population
+ + description: |-
+ + Our World in Data builds and maintains a long-run dataset on population by country, region, and for the world, based on various sources.
+ +
+ + You can find more information on these sources and how our time series is constructed on this page: https://ourworldindata.org/population-sources
+ + citation_full: |-
+ + The long-run data on population is based on various sources, described on this page: https://ourworldindata.org/population-sources
+ + attribution: Population based on various sources (2023)
+ + attribution_short: Population
+ + url_main: https://ourworldindata.org/population-sources
+ + date_accessed: '2023-03-31'
+ + date_published: '2023-03-31'
+ + license:
+ + name: CC BY 4.0
+ + licenses:
+ + - name: Creative Commons BY 4.0
+ + url: https://docs.google.com/document/d/1-RmthhS2EPMK_HIpnPctcXpB0n7ADSWnXa5Hb3PxNq4/edit?usp=sharing
+ + - name: CC BY 3.0
+ + url: https://dataportaal.pbl.nl/downloads/HYDE/HYDE3.2/readme_release_HYDE3.2.1.txt
+ + - name: CC BY 3.0 IGO
+ + url: http://creativecommons.org/licenses/by/3.0/igo/
+ + unit: pages per million people
+ + processing_level: major
+ + presentation:
+ + topic_tags:
+ + - Uncategorized
+ + description_processing: |-
+ + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags.
+ +
+ +
+ + 1. Obtain all tags that concern a country:
+ + - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use.
+ + - We work with a list of ~240 countries.
+ + - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's.
+ +
+ + 2. For each country, obtain the number of pages using each set of tags. Steps:
+ + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
+ +
+ + For mor details, please refer to the snapshot script.
+ + New values: 14 / 2721 (0.51%)
country year num_pages_tags_per_million
Saint Martin (French part) 2018 NaN
Saint Martin (French part) 2020 NaN
Saint Martin (French part) 2022 NaN
Sint Maarten (Dutch part) 2017 NaN
Sint Maarten (Dutch part) 2020 NaN
- - Removed values: 32 / 2721 (1.18%)
country year num_pages_tags_per_million
Saint Martin 2018 NaN
Sint Maarten 2018 NaN
Sint Maarten 2019 NaN
Timor-Lest 2016 NaN
Timor-Lest 2022 NaN
~ Changed values: 2547 / 2721 (93.61%)
country year num_pages_tags_per_million - num_pages_tags_per_million +
Canada 2014 NaN 8.840655
Gibraltar 2018 NaN 826.269226
Japan 2016 NaN 3.078889
Latvia 2016 NaN 6.080638
Tokelau 2013 NaN 0.000000
~ Column num_pages_tags_relative (changed metadata, new data, changed data)
+ + {}
- - title: Share of pages in the Guardian with a country tag
- - description_short: Share of pages in The Guardian that are tagged with a country-related label.
- - origins:
- - - producer: The Guardian
- - title: Attention to each country in The Guardian's articles (tags)
- - description: |-
- - Aggregate estimates on the number of entries that talk about each country and year.
- -
- - The data was obtained by querying The Guardian's Open Platform.
- -
- - An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. Nonetheless, this is not a perfect method and some entries might be missed.
- - citation_full: The Guardian, Open Platform
- - url_main: https://open-platform.theguardian.com/access/
- - date_accessed: '2024-05-07'
- - date_published: '2024'
- - license:
- - name: The Guardian terms of service
- - url: https://www.theguardian.com/help/terms-of-service
- - unit: pages per 100,000 pages
- - presentation:
- - topic_tags:
- - - Uncategorized
- - description_processing: |-
- - Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags.
- -
- -
- - 1. Obtain all tags that concern a country:
- - - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use.
- - - We work with a list of ~240 countries.
- - - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's.
- -
- - 2. For each country, obtain the number of pages using each set of tags. Steps:
- - - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
- -
- - For mor details, please refer to the snapshot script.
+ + New values: 14 / 2721 (0.51%)
country year num_pages_tags_relative
Saint Martin (French part) 2018 NaN
Saint Martin (French part) 2020 NaN
Saint Martin (French part) 2022 NaN
Sint Maarten (Dutch part) 2017 NaN
Sint Maarten (Dutch part) 2020 NaN
- - Removed values: 32 / 2721 (1.18%)
country year num_pages_tags_relative
Saint Martin 2018 NaN
Sint Maarten 2018 NaN
Sint Maarten 2019 NaN
Timor-Lest 2016 NaN
Timor-Lest 2022 NaN
~ Changed values: 2634 / 2721 (96.80%)
country year num_pages_tags_relative - num_pages_tags_relative +
Argentina 2013 226.496490 NaN
Aruba 2016 0.619191 NaN
Central African Republic 2017 10.622900 NaN
Congo 2018 57.331131 NaN
Taiwan 2020 50.219105 NaN
~ Column relative_pages_mentions (changed metadata, new data, changed data)
- - {}
+ + title: Share of pages in The Guardian that mention a country
+ + description_short: Share of pages in The Guardian that that mention a particular country.
+ + origins:
+ + - producer: The Guardian
+ + title: Attention to each country in The Guardian's articles (raw mentions)
+ + description: |-
+ + Aggregate estimates on the number of entries that talk about each country and year.
+ +
+ + The data was obtained by querying The Guardian's Open Platform.
+ +
+ + An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed.
+ + citation_full: The Guardian, Open Platform
+ + url_main: https://open-platform.theguardian.com/access/
+ + date_accessed: '2024-05-07'
+ + date_published: '2024'
+ + license:
+ + name: The Guardian terms of service
+ + url: https://www.theguardian.com/help/terms-of-service
+ + unit: pages per 100,000 pages
+ + presentation:
+ + topic_tags:
+ + - Uncategorized
+ + description_processing: |-
+ + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names.
+ +
+ +
+ + 1. Get all country name variations:
+ + - Obtain all the country name variations using our standard name list.
+ + - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list.
+ +
+ + 2. For each country, obtain the number of pages using each set of name variations. Steps:
+ + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
+ +
+ + For mor details, please refer to the snapshot script.
+ + New values: 14 / 2721 (0.51%)
country year relative_pages_mentions
Saint Martin (French part) 2018 11.948002
Saint Martin (French part) 2020 22.944101
Saint Martin (French part) 2022 14.805511
Sint Maarten (Dutch part) 2017 12.196963
Sint Maarten (Dutch part) 2020 2.667919
- - Removed values: 32 / 2721 (1.18%)
country year relative_pages_mentions
Saint Martin 2018 NaN
Sint Maarten 2018 NaN
Sint Maarten 2019 NaN
Timor-Lest 2016 NaN
Timor-Lest 2022 NaN
~ Changed values: 2661 / 2721 (97.79%)
country year relative_pages_mentions - relative_pages_mentions +
Bahamas 2015 NaN 45.872166
El Salvador 2017 NaN 56.919163
Guernsey 2022 NaN 28.079418
Liechtenstein 2021 NaN 40.499123
Saudi Arabia 2023 NaN 686.867859
~ Column relative_pages_mentions_excluded (changed metadata, new data, changed data)
- - {}
+ + title: Share of pages in The Guardian that mention a country (excludes UK, US, Australia)
+ + description_short: Share of pages in The Guardian that are tagged with a country-related label. Excludes US, UK and Australia.
+ + origins:
+ + - producer: The Guardian
+ + title: Attention to each country in The Guardian's articles (raw mentions)
+ + description: |-
+ + Aggregate estimates on the number of entries that talk about each country and year.
+ +
+ + The data was obtained by querying The Guardian's Open Platform.
+ +
+ + An entry or page in The Guardian is considered to be about a certain country if that particular country is mentioned in the text. To this end, we have used a set of country name variations to ensure that we capture all the entries. Nonetheless, this is not a perfect method and some entries might be missed.
+ + citation_full: The Guardian, Open Platform
+ + url_main: https://open-platform.theguardian.com/access/
+ + date_accessed: '2024-05-07'
+ + date_published: '2024'
+ + license:
+ + name: The Guardian terms of service
+ + url: https://www.theguardian.com/help/terms-of-service
+ + unit: pages per 100,000 pages
+ + presentation:
+ + topic_tags:
+ + - Uncategorized
+ + description_processing: |-
+ + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first defining a set of country name variations for each country, and then look for content on The Guardian with an explicit mention to these names.
+ +
+ +
+ + 1. Get all country name variations:
+ + - Obtain all the country name variations using our standard name list.
+ + - Our list may not cover all cases, and may contain some names that are not valid on The Guardian API (e.g. names with symbols like ';' are not supported). Therefore, we clean this list.
+ +
+ + 2. For each country, obtain the number of pages using each set of name variations. Steps:
+ + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?q=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
+ +
+ + For mor details, please refer to the snapshot script.
+ +
+ + This estimates exclude the UK, US, and Australia from the total number of pages. The reason for this is because the Guardian is a UK-based newspaper, and it is expected to have a higher number of articles about the UK, US, and Australia.
+ + New values: 14 / 2721 (0.51%)
country year relative_pages_mentions_excluded
Saint Martin (French part) 2018 20.384451
Saint Martin (French part) 2020 39.068180
Saint Martin (French part) 2022 24.335392
Sint Maarten (Dutch part) 2017 21.586283
Sint Maarten (Dutch part) 2020 4.542811
- - Removed values: 32 / 2721 (1.18%)
country year relative_pages_mentions_excluded
Saint Martin 2018 NaN
Sint Maarten 2018 NaN
Sint Maarten 2019 NaN
Timor-Lest 2016 NaN
Timor-Lest 2022 NaN
~ Changed values: 2628 / 2721 (96.58%)
country year relative_pages_mentions_excluded - relative_pages_mentions_excluded +
Bouvet Island 2019 NaN 0.000000
Jamaica 2017 NaN 328.933838
Rwanda 2019 NaN 151.658157
Saudi Arabia 2023 NaN 1138.211426
Vatican 2013 NaN 450.137909
~ Column relative_pages_tags (changed metadata, new data, changed data)
- - {}
+ + title: Share of pages in the Guardian with a country tag
+ + description_short: Share of pages in The Guardian that are tagged with a country-related label.
+ + origins:
+ + - producer: The Guardian
+ + title: Attention to each country in The Guardian's articles (tags)
+ + description: |-
+ + Aggregate estimates on the number of entries that talk about each country and year.
+ +
+ + The data was obtained by querying The Guardian's Open Platform.
+ +
+ + An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. Nonetheless, this is not a perfect method and some entries might be missed.
+ + citation_full: The Guardian, Open Platform
+ + url_main: https://open-platform.theguardian.com/access/
+ + date_accessed: '2024-05-07'
+ + date_published: '2024'
+ + license:
+ + name: The Guardian terms of service
+ + url: https://www.theguardian.com/help/terms-of-service
+ + unit: pages per 100,000 pages
+ + presentation:
+ + topic_tags:
+ + - Uncategorized
+ + description_processing: |-
+ + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags.
+ +
+ +
+ + 1. Obtain all tags that concern a country:
+ + - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use.
+ + - We work with a list of ~240 countries.
+ + - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's.
+ +
+ + 2. For each country, obtain the number of pages using each set of tags. Steps:
+ + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
+ +
+ + For mor details, please refer to the snapshot script.
+ + New values: 14 / 2721 (0.51%)
country year relative_pages_tags
Saint Martin (French part) 2018 NaN
Saint Martin (French part) 2020 NaN
Saint Martin (French part) 2022 NaN
Sint Maarten (Dutch part) 2017 NaN
Sint Maarten (Dutch part) 2020 NaN
- - Removed values: 32 / 2721 (1.18%)
country year relative_pages_tags
Saint Martin 2018 NaN
Sint Maarten 2018 NaN
Sint Maarten 2019 NaN
Timor-Lest 2016 NaN
Timor-Lest 2022 NaN
~ Changed values: 2634 / 2721 (96.80%)
country year relative_pages_tags - relative_pages_tags +
Argentina 2013 NaN 226.496490
Aruba 2016 NaN 0.619191
Central African Republic 2017 NaN 10.622900
Congo 2018 NaN 57.331131
Taiwan 2020 NaN 50.219105
~ Column relative_pages_tags_excluded (changed metadata, new data, changed data)
- - {}
+ + title: Share of pages in the Guardian with a country tag (excludes UK, US, Australia)
+ + description_short: Share of pages in The Guardian that are tagged with a country-related label. Excludes US, UK and Australia.
+ + origins:
+ + - producer: The Guardian
+ + title: Attention to each country in The Guardian's articles (tags)
+ + description: |-
+ + Aggregate estimates on the number of entries that talk about each country and year.
+ +
+ + The data was obtained by querying The Guardian's Open Platform.
+ +
+ + An entry or page in The Guardian is considered to be about a certain country if that particular country if it is tagged with a country-related label. To this end, we have used a set of tags for each country. Nonetheless, this is not a perfect method and some entries might be missed.
+ + citation_full: The Guardian, Open Platform
+ + url_main: https://open-platform.theguardian.com/access/
+ + date_accessed: '2024-05-07'
+ + date_published: '2024'
+ + license:
+ + name: The Guardian terms of service
+ + url: https://www.theguardian.com/help/terms-of-service
+ + unit: pages per 100,000 pages
+ + presentation:
+ + topic_tags:
+ + - Uncategorized
+ + description_processing: |-
+ + Getting the number of articles/entries talking about a certain country has no straightforward answer, since there can be different strategies. The strategy for this indicator is based on first getting all the tags for a country, and then getting the number of articles that have those tags.
+ +
+ +
+ + 1. Obtain all tags that concern a country:
+ + - Obtain all the tag pages that have a title starting with a country name: a query like "https://content.guardianapis.com/tags?web-title=spain", for Spain. As a result we obtain a mapping that tells us for each country the list of tags (e.g. "Spain: ") in use.
+ + - We work with a list of ~240 countries.
+ + - Getting the right country names has been an iterative process, trying to align our standard country names with the Guardian's.
+ +
+ + 2. For each country, obtain the number of pages using each set of tags. Steps:
+ + - For each country and year we get all content metadata: a query like "https://content.guardianapis.com/search?tags=...&from-date=2020-01-01&to-date=2020-12-31" for year 2020. The count of pages is in the property `response.total`.
+ +
+ + For mor details, please refer to the snapshot script.
+ +
+ + This estimates exclude the UK, US, and Australia from the total number of pages. The reason for this is because the Guardian is a UK-based newspaper, and it is expected to have a higher number of articles about the UK, US, and Australia.
+ + New values: 14 / 2721 (0.51%)
country year relative_pages_tags_excluded
Saint Martin (French part) 2018 NaN
Saint Martin (French part) 2020 NaN
Saint Martin (French part) 2022 NaN
Sint Maarten (Dutch part) 2017 NaN
Sint Maarten (Dutch part) 2020 NaN
- - Removed values: 32 / 2721 (1.18%)
country year relative_pages_tags_excluded
Saint Martin 2018 NaN
Sint Maarten 2018 NaN
Sint Maarten 2019 NaN
Timor-Lest 2016 NaN
Timor-Lest 2022 NaN
~ Changed values: 2601 / 2721 (95.59%)
country year relative_pages_tags_excluded - relative_pages_tags_excluded +
India 2023 NaN 2470.757324
Moldova 2014 NaN 94.517960
Singapore 2020 NaN 164.049683
Trinidad and Tobago 2023 NaN 82.084961
Turkey 2019 NaN 1597.967285
+ + Table avg_10y
+ + Column num_pages_tags_10y_avg
+ + Column num_pages_mentions_10y_avg
+ + Column relative_pages_tags_10y_avg
+ + Column relative_pages_tags_excluded_10y_avg
+ + Column relative_pages_mentions_10y_avg
+ + Column relative_pages_mentions_excluded_10y_avg
+ + Column num_pages_tags_per_million_10y_avg
+ + Column num_pages_mentions_per_million_10y_avg
Legend: +New ~Modified -Removed =Identical Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
```
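The `description_processing` text in the diff above describes the mentions query only in prose. A minimal sketch of how such a request URL could be built and its count read is shown below. The function names, the `OR`-joining and quoting of name variations, and the placeholder API key are assumptions for illustration. Only the endpoint, the `from-date`/`to-date` parameters, and the `response.total` property come from the metadata itself. The snapshot script remains the authoritative reference.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://content.guardianapis.com/search"


def build_mentions_query(name_variations, year, api_key):
    """Build a search URL covering one calendar year for one country.

    Assumption: name variations are quoted as exact phrases and joined with OR;
    the actual snapshot script may combine them differently.
    """
    q = " OR ".join(f'"{name}"' for name in name_variations)
    params = {
        "q": q,
        "from-date": f"{year}-01-01",
        "to-date": f"{year}-12-31",
        "api-key": api_key,  # placeholder; a real key is needed to call the API
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def extract_total(payload):
    """Read the page count from a decoded JSON response (`response.total`)."""
    return int(payload["response"]["total"])


# Hypothetical name variations for one country, year 2020.
url = build_mentions_query(["East Timor", "Timor-Leste"], 2020, api_key="YOUR_API_KEY")
```

No request is sent here; the sketch only shows how the date window and the count property fit together.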
Automatically updated datasets matching _weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk_ are not included
Tracking issue
Add decadal averages.
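
For reviewers, a rough sketch of what a 10-year average table such as `avg_10y` might look like in pandas follows. The diff only shows the new column names (`*_10y_avg`), not how they are computed, so the window definition (the 10 years ending at `end_year`), the function name, and the sample rows are assumptions, not the actual garden step.

```python
import pandas as pd


def ten_year_average(df, columns, end_year, window=10):
    """Average each column per country over the `window` years ending at `end_year`.

    Assumption: a single fixed window per country; the real step may define
    the averaging period differently.
    """
    start_year = end_year - window + 1
    in_window = df[(df["year"] >= start_year) & (df["year"] <= end_year)]
    # mean() skips NaN values by default, so missing years do not break the average.
    out = in_window.groupby("country")[columns].mean()
    out.columns = [f"{col}_{window}y_avg" for col in columns]
    return out.reset_index()


# Illustrative rows copied from the diff snippet above (not the full dataset).
sample = pd.DataFrame(
    {
        "country": [
            "Saint Martin (French part)",
            "Saint Martin (French part)",
            "Saint Martin (French part)",
            "Sint Maarten (Dutch part)",
            "Sint Maarten (Dutch part)",
        ],
        "year": [2018, 2020, 2022, 2017, 2020],
        "num_pages_mentions": [20, 43, 29, 21, 5],
    }
)

result = ten_year_average(sample, ["num_pages_mentions"], end_year=2023)
```

Because the sample contains only the snippet rows, the resulting averages are illustrative and will not match the values in the real table.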