Migrate existing code for data sources

davidpomerenke commented 7 months ago

We already have code for:

ACLED
MediaCloud
DeReKo (this needs a major overhaul though)
Parliamentary speeches in EU, UK

kleinlennart commented 6 months ago

@davidpomerenke I looked through your code and saw that you only query the assoc_actor_1 field but not assoc_actor_2, which can also yield matches for events co-organized by multiple protest groups.

https://github.com/SocialChangeLab/media-impact-monitor/blob/5c2d7c11177178a674d1424c3f482126defc4572/backend-python/media_impact_monitor/data_loaders/protest/acled.py#L67

Good reminder that we need to document sampling criteria like this well, and should probably add a documentation page to the final website for scholarly rigor 📚

davidpomerenke commented 6 months ago

Here's a sample from ACLED from protests in the UK, only with entries where assoc_actor_2 is present:

Examples

> [{'date': Timestamp('2022-07-19 00:00:00'),
>   'assoc_actor_1': 'XR: Extinction Rebellion',
>   'assoc_actor_2': 'Government of the United Kingdom (2010-)',
>   'notes': "On 19 July 2022, around 20 activists from Extinction Rebellion occupied the offices of the East Sussex County Council in Lewes to demand immediate action on the climate crisis. One protester was 'shoved onto her back' by a councilor and landed 'violently' on her back in the corridor. Police escorted the protesters out for having disrupted a meeting."},
>  {'date': Timestamp('2023-04-16 00:00:00'),
>   'assoc_actor_1': 'Women (Northern Ireland)',
>   'assoc_actor_2': 'LGBTQ+ (Northern Ireland)',
>   'notes': "On 16 April 2023, hundreds of people gathered at the Donegall Quay area in Belfast - South on the occasion of the 'Let Women Speak' demonstration. Anti-transgender rights activist Kellie Jay-Keen, known as Posie Parker, addressed the crowd. LGBTQ+ and trans rights activists held a counter-demonstration."},
>  {'date': Timestamp('2022-08-17 00:00:00'),
>   'assoc_actor_1': 'PA: Patriotic Alternative',
>   'assoc_actor_2': 'LGBTQ+ (United Kingdom)',
>   'notes': 'On 17 August 2022, hundreds of people associated with the far-right group Patriotic Alternative protested outside the Forum in Norwich ahead of the Drag Queen Story Hour event for children and called for canceling the event. Hundreds of people gathered for a counter-protest in support of the LGBTQ+ community and the event.'},
>  {'date': Timestamp('2023-03-04 00:00:00'),
>   'assoc_actor_1': '',
>   'assoc_actor_2': 'SUTR: Stand Up To Racism',
>   'notes': "On 4 March 2023, more than 100 people from far-right groups marched and gathered for a rally in Market Square in Dover against the accommodation of refugees. They were chanting, 'stop the invasion' and 'Lefty scum off our streets'. Protestors from SUTR also gathered for a counter-demonstration."},
>  {'date': Timestamp('2022-10-16 00:00:00'),
>   'assoc_actor_1': 'Chinese Group (United Kingdom)',
>   'assoc_actor_2': 'Government of China (2012-); Labor Group (China)',
>   'notes': 'On 16 October 2022, Hong Kong pro-democracy protesters gathered outside the Chinese Consulate in Manchester in response to the 20th National Congress of the Chinese Communist Party being held in China. A diplomat and staff from the Consulate clashed with protestors, ripped down pro-democracy posters, dragged a protestor inside the consulate, and beat him. Police forces intervened and rescued the protestor.'},
>  {'date': Timestamp('2022-08-04 00:00:00'),
>   'assoc_actor_1': 'LGBTQ+ (United Kingdom)',
>   'assoc_actor_2': 'Protesters (United Kingdom)',
>   'notes': "On 4 August 2022, people protested outside of a library in Hove ahead of the Drag Queen Story Hour event for children and called for canceling the event. An unknown number of people also gathered in support of the LGBTQ+ community and the event. Police forces arrested two citizens who were protesting in favor of the event after they got 'into an altercation' with anti-drag protesters."},
>  {'date': Timestamp('2022-08-05 00:00:00'),
>   'assoc_actor_1': '',
>   'assoc_actor_2': 'LGBTQ+ (United Kingdom)',
>   'notes': 'On 5 August 2022, more than a dozen people protested outside of a library in Portsmouth ahead of the Drag Queen Story Hour event for children and called for canceling the event. An unknown number of people also gathered in support of the LGBTQ+ community and the event.'},
>  {'date': Timestamp('2023-04-16 00:00:00'),
>   'assoc_actor_1': 'Jewish Group (United Kingdom); Palestinian Group (United Kingdom)',
>   'assoc_actor_2': 'Israeli Group (United Kingdom)',
>   'notes': "On 16 April 2023, Palestine supporters, members of the Palestinian community, and anti-zionist Jewish people held a rally in Whitehall in London - Westminster in solidarity with the Palestinian people. Protesters were chanting 'Free Palestine' and burned the flag of Israel. Israel supporters, including members of the Israeli community, gathered for a counter-demonstration in Downing Street. Police forces kept the two groups separate."},
>  {'date': Timestamp('2020-08-22 00:00:00'),
>   'assoc_actor_1': '',
>   'assoc_actor_2': 'SUTR: Stand Up To Racism; Labor Group (United Kingdom); BLM: Black Lives Matter',
>   'notes': "On 22 August 2020, around 850 people took part in two protests in Nottingham's Old Market Square. Around 450 far-right campaigners rallied for children and the plight of veterans. Around 400 counter-protesters from Stand Up To Racism and various trades unionists protested against racism and fascism and supported the Black Lives Matters movement. There was a large police presence at the scene."},
>  {'date': Timestamp('2023-06-18 00:00:00'),
>   'assoc_actor_1': '',
>   'assoc_actor_2': 'SUTR: Stand Up To Racism; Labor Group (United Kingdom)',
>   'notes': 'On 18 June 2023, eight far-right protesters gathered in Moray (Scotland) at the call of the Highland Division, a local far-right group against the accommodation of refugees in the city. Alek Yerbury, co-founder of the National Support Detachment (NSD) also addressed the crowd. Hundreds of SUTR affiliates and trade union activists also gathered for the counter-demonstration, denouncing racism.'},
>  {'date': Timestamp('2023-09-02 00:00:00'),
>   'assoc_actor_1': '',
>   'assoc_actor_2': 'SUTR: Stand Up To Racism; UCU: University and College Union; Teachers (United Kingdom); Labor Group (United Kingdom)',
>   'notes': 'On 2 September 2023, far-rights activists gathered in Birmingham (England) for a rally against the refugees. Anti-racism activists, trade unionists, and SUTR and UCU affiliates held a counter-demonstration in Birmingham (England) opposing recent activities of the far-rights group in town.'},
>  {'date': Timestamp('2023-02-05 00:00:00'),
>   'assoc_actor_1': 'Women (United Kingdom)',
>   'assoc_actor_2': 'SNP: Scottish National Party; Government of the United Kingdom (2010-); LGBTQ+ (United Kingdom)',
>   'notes': "On 5 February 2023, hundreds of people staged a rally called Standing for Women in Glasgow against the Scottish Government's gender recognition reform bill. Protestors also called for Scottish ministers to keep transgender criminals out of female prisons. The event was organized by a public figure, Kellie-Jay Keen. Hundreds of LGBT activists and anti-hatred campaigners also attended a counter-demonstration against the event to support the right to self-identification. A number of SNP MSPs joined the counter-protesters."},
>  {'date': Timestamp('2022-07-02 00:00:00'),
>   'assoc_actor_1': 'PA: Patriotic Alternative; For Britain Movement',
>   'assoc_actor_2': 'Labour Party (United Kingdom); Government of the United Kingdom (2010-); SUTR: Stand Up To Racism',
>   'notes': 'On 2 July 2022, members of the far-right, white nationalist groups For Britain and Patriotic Alternative, gathered for a rally in York city center, denouncing the reception of refugees. In response, around 100 anti-racism campaigners belonging to Stand Up To Racism gathered to stage a counter-protest along with Labour MP Rachael Maskell.'},
>  {'date': Timestamp('2022-07-30 00:00:00'),
>   'assoc_actor_1': 'Catholic Christian Group (Northern Ireland); Former Government of the United Kingdom (2010-); Protestant Christian Group (Northern Ireland); TUV: Traditional Unionist Voice',
>   'assoc_actor_2': 'Government of the United Kingdom (2010-); LGBTQ+ (Northern Ireland); SDLP: The Social Democratic and Labour Party',
>   'notes': "On 30 July 2022, a handful of people, including a former TUV councilor as well as Catholics and Protestants, gathered for the Parents Against Grooming protest outside the Mac in the city center in Belfast - North, against a family storytime event hosted by a drag queen as a part of the city's Pride Weekend celebrations, 'in opposition to child grooming.' Two dozen people, including SDLP councilor Seamas de Faoite, also gathered for a counter-protest in support of the event."},
>  {'date': Timestamp('2023-05-27 00:00:00'),
>   'assoc_actor_1': '',
>   'assoc_actor_2': 'Antifa; SUTR: Stand Up To Racism; LGBTQ+ (United Kingdom)',
>   'notes': 'On 27 May 2023, at the call of Turning Point UK, people staged a protest outside the Honor Oak pub in London - Lewisham against a drag queen storytelling event for children. Hundreds of anti-fascists and anti-racism activists also gathered outside the venue for the counter-protest to support the event, the LGBTQ+ community, and denounce racism and fascism. SUTR organized the protest.'},
>  {'date': Timestamp('2023-07-01 00:00:00'),
>   'assoc_actor_1': 'LGBTQ+ (United Kingdom)',
>   'assoc_actor_2': 'Just Stop Oil',
>   'notes': "On 1 July 2023, around 30,000 LGBTQ+ community members and their supporters marched from Hyde Park and Whitehall in London - Westminster (England) as part of the annual Pride March to celebrate diversity and the queer community and denounce discrimination against LGBTQ+ people. At around 13.30, a small group of protesters from Just Stop Oil disrupted the pride march by sitting down on the road, stopping the parade, over 'high-polluting' corporations that are sponsors the pride. Police Forces intervened and arrested seven Just Stop Oil protesters and the parade continued."},
>  {'date': Timestamp('2023-04-22 00:00:00'),
>   'assoc_actor_1': '',
>   'assoc_actor_2': 'Labor Group (United Kingdom); Teachers (United Kingdom); Unison The Public Service Union; NEU: National Education Union',
>   'notes': "On 22 April 2023, people gathered in Market Place in Long Eaton against the housing of asylum seekers in two of the city's hotels. More than 100 people, including teachers from NEU and members of Unison, gathered for a counter-demonstration in support of refugees."},
>  {'date': Timestamp('2023-04-15 00:00:00'),
>   'assoc_actor_1': 'ECP: English Constitution Party; English Group (United Kingdom)',
>   'assoc_actor_2': 'SUTR: Stand Up To Racism; Unison The Public Service Union; Labor Group (United Kingdom)',
>   'notes': 'On 15 April 2023, members of a group called the English Constitution Party staged a rally in a car park next to the mosque at Earlsheaton in Dewsbury as prayers attended the mosque for Ramadan prayers. Far-right activist Alek Yerbury also participated in the rally, which denounced the presence of Islam. Anti-racist activists, including SUTR members and trade unionists from Unison, held a counter-demonstration to celebrate diversity in the town.'},
>  {'date': Timestamp('2022-04-15 00:00:00'),
>   'assoc_actor_1': 'Just Stop Oil',
>   'assoc_actor_2': 'Labor Group (United Kingdom); Rioters (United Kingdom)',
>   'notes': 'On 15 April 2022, in the early hours, Just Stop Oil activists blocked access to the Navigator and Grays Oil Terminals in Grays to demonstrate against the fossil fuels industry. A lorry driver affected by the protest yanked one protester to express his frustration with the delay. Police arrested 28 people.'}]

The assoc_actor_2 field typically denotes the party opposing the protest (e.g. antiracist groups when it is a far-right protest). So these events are actually counter-protests to the assoc_actor_2 group, and I think we don't want to include them when we query for that group.

However, the field has a second meaning: there is a demo by assoc_actor_1, and assoc_actor_2 disrupts it. These events we would again want. E.g.:

{'date': Timestamp('2023-07-01 00:00:00'), 'assoc_actor_1': 'LGBTQ+ (United Kingdom)', 'assoc_actor_2': 'Just Stop Oil', 'notes': "On 1 July 2023, around 30,000 LGBTQ+ community members and their supporters marched from Hyde Park and Whitehall in London - Westminster (England) as part of the annual Pride March to celebrate diversity and the queer community and denounce discrimination against LGBTQ+ people. At around 13.30, a small group of protesters from Just Stop Oil disrupted the pride march by sitting down on the road, stopping the parade, over 'high-polluting' corporations that are sponsors the pride. Police Forces intervened and arrested seven Just Stop Oil protesters and the parade continued."},

But I would rather avoid false positives, and so not query by the assoc_actor_2 field.
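A minimal sketch of that choice, assuming events are plain dicts shaped like the sample above and that multiple actors in one field are separated by `; ` (as they appear in the sample). The helper names `split_actors` and `events_organized_by` are hypothetical and not taken from `acled.py`; the real loader may match differently.

```python
def split_actors(field):
    """Split an actor field such as 'A; B' into a list of stripped names."""
    return [name.strip() for name in field.split(";") if name.strip()]


def events_organized_by(events, group):
    """Keep events where `group` is listed in assoc_actor_1.

    assoc_actor_2 is deliberately ignored, to avoid returning events
    where the group is only the opposing or disrupting party.
    """
    return [
        event
        for event in events
        if group in split_actors(event.get("assoc_actor_1", ""))
    ]


# Three shortened records based on the sample in this thread.
events = [
    {"assoc_actor_1": "Just Stop Oil",
     "assoc_actor_2": "Labor Group (United Kingdom); Rioters (United Kingdom)"},
    {"assoc_actor_1": "LGBTQ+ (United Kingdom)",
     "assoc_actor_2": "Just Stop Oil"},
    {"assoc_actor_1": "",
     "assoc_actor_2": "SUTR: Stand Up To Racism"},
]

matches = events_organized_by(events, "Just Stop Oil")
```

With this filter the Pride March disruption (Just Stop Oil only in assoc_actor_2) is dropped, which is the accepted false negative.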

davidpomerenke commented 6 months ago

Documentation is a good idea 📚👍 These notes are specific to the data loaders, so maybe we can keep the documentation text in the same file or in the same folder as the respective data loaders, and then eventually also display it in the frontend. (Rather than storing it somewhere in the frontend, detached from the code.)

davidpomerenke commented 6 months ago

Some stats about assoc_actor_2:

out of 6628 protests in the UK from 2020-2023, there's only 52 with assoc_actor_2
in Germany, out of 17229 protests, 413 have an assoc_actor_2
in the US, out of 58454 protests, 1312 have an assoc_actor_2
in Somalia, out of 346 protests, 5 have an assoc_actor_2
for the query df["notes"].str.contains(r"climate|oil|extinction|future", regex=True, case=False) there are 1343 results in the UK, and only 1 (the one above) where a climate group occurs as assoc_actor_2 but not as assoc_actor_1. There are no cases of a protest against a climate group.

This might look different for other countries though, since the codebook does not give explicit instructions, and there are different regional teams.
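For comparison across countries, the counts above can be turned into shares. This is only a sketch: the numbers are the ones quoted in this comment, the pattern is the one from the query above, and the names `counts`, `shares` and `matches_climate` are made up for illustration.

```python
import re

# (total protests, protests with an assoc_actor_2), as quoted above.
counts = {
    "UK": (6628, 52),
    "Germany": (17229, 413),
    "US": (58454, 1312),
    "Somalia": (346, 5),
}

# Share of protests that have an assoc_actor_2, per country.
shares = {
    country: with_actor_2 / total
    for country, (total, with_actor_2) in counts.items()
}

# Same pattern as the pandas query above, case-insensitive.
CLIMATE_PATTERN = re.compile(r"climate|oil|extinction|future", re.IGNORECASE)


def matches_climate(notes):
    """Return True if the notes text contains any of the keywords."""
    return CLIMATE_PATTERN.search(notes) is not None


for country, share in shares.items():
    print(f"{country}: {share:.1%} of protests have an assoc_actor_2")
```

Note that the pattern matches substrings, so a word like "soil" also counts as a hit for "oil".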

kleinlennart commented 6 months ago

Great, thanks for looking into this and the clarification!

kleinlennart commented 6 months ago

We can add some of your notes to the docs #35

> maybe we can keep the documentation text in the same file or in the same folder as the respective data loaders, and then eventually also display them in the frontend.

I'm thinking about which tool might be able to facilitate that 🤔

kleinlennart commented 6 months ago

Closed with #28

davidpomerenke commented 6 months ago

Regarding the tool: We could store the docs as a Python string, or read them with Python from a README file, and serve them via an API to the frontend.

Though maybe this is overkill and we can just have it in the frontend 😄
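A rough sketch of the README variant, in case it helps the discussion. It assumes one `README.md` per data loader folder; the function names, the folder layout and the payload keys are all hypothetical, and the API layer itself is left out (any endpoint could simply return the dict).

```python
import tempfile
from pathlib import Path


def load_docs(loader_dir, fallback=""):
    """Return the README.md text from a data loader folder, or `fallback`."""
    readme = Path(loader_dir) / "README.md"
    if readme.is_file():
        return readme.read_text(encoding="utf-8")
    return fallback


def docs_payload(name, loader_dir, fallback=""):
    """Build the dict an API endpoint could return to the frontend."""
    return {"data_source": name, "documentation": load_docs(loader_dir, fallback)}


# Demo with a temporary folder standing in for a data loader directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text(
        "We query assoc_actor_1 only.", encoding="utf-8"
    )
    payload = docs_payload("acled", tmp)

# A folder without a README falls back to the given text.
missing = docs_payload(
    "mediacloud", "/nonexistent-loader-dir", fallback="No documentation yet."
)
```

The fallback keeps the frontend from breaking for loaders that have no documentation file yet.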

SocialChangeLab / media-impact-monitor

Migrate existing code for data sources #11