Open BeccaPeake1998 opened 2 years ago
Dear Filipe,
Thank you for making this data easily available. It's sad the website for this was never finished; it is such an exciting and important project, and the preview of the site looked really good.
I was wondering why there were fewer than the reported 1000 incidents in your filtered data, but your explanation makes sense. I can imagine it's quite a task dealing with those ambiguous locations.
I can send you a link to the blog once it's done if you would like.
Kind regards, Rebecca Peake
On Fri, Nov 12, 2021 at 4:24 PM Filipe Serro wrote:
Hi Becca,
Thank you for your interest in the project! I started it at the end of May as my final project for a Data Science bootcamp at Spiced Academy here in Berlin and I never finished it. I'd love it if all the work I've done could finally be put to good use.
As you might have read in the repo description:
- I extracted the data from the Berliner Register website ( https://berliner-register.de/ );
- I filtered the incidents concerning LGBT violence using regular expressions;
- The data only covers incidents from the beginning of 2014 to the end of May 2021, when I extracted it; more recent incidents could be scraped from their website using the Python functions I wrote for that purpose at the end of the notebook Berliner_Register.ipynb.
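To illustrate the filtering step mentioned above, here is a minimal sketch of keyword filtering with regular expressions. It is not the code from the notebook: the keyword stems, the `description` field name, and the sample records are all assumptions made up for this example.

```python
import re

# Hypothetical keyword stems; the patterns actually used in the notebook may differ.
LGBT_PATTERN = re.compile(
    r"\b(?:lgbt\w*|homophob\w*|transphob\w*|queerfeindlich\w*)",
    re.IGNORECASE,
)


def filter_incidents(incidents):
    """Return only the incidents whose description matches the keyword pattern."""
    return [inc for inc in incidents if LGBT_PATTERN.search(inc["description"])]


# Made-up placeholder records, not real entries from the Berliner Register.
sample = [
    {"date": "2020-01-01", "description": "Homophobe Beleidigung in der U-Bahn."},
    {"date": "2020-01-02", "description": "Rassistische Schmiererei an einer Hauswand."},
    {"date": "2020-01-03", "description": "Angriff mit LGBTIQ*-feindlichem Motiv."},
]

matches = filter_incidents(sample)
print([inc["date"] for inc in matches])
```

A keyword filter like this is easy to audit, but it can miss incidents described with other wording, so a manual check of the results is probably still worthwhile.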
I exported three CSV files that you can now find in the repository:
- df_complete.csv: all the incidents of right-wing violence that I scraped from the Berliner Register website;
- df_lgbt.csv: all the incidents of LGBT violence that I filtered from df_complete.csv;
- df_map.csv: all the incidents from df_lgbt.csv where I could find an unambiguous location using spaCy and coordinates using GeoPy.
df_lgbt.csv and df_map.csv also include the type of each incident (check README.md for the description of the labels I used), which I determined manually and with the help of spaCy and regular expressions.
Regarding the locations for the incidents, the job is far from done... you'll see that df_lgbt has way more incidents than df_map. This is because:
- a few incidents don't mention any particular location
- most incidents mention more than one location
- GeoPy doesn't return coordinates for some locations
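The three cases above can be sketched as a small decision function. This is only an illustration, not the logic from the notebook: `resolve_location`, the stub geocoder, and the placeholder coordinates are all hypothetical. In a real run the `geocode` argument might wrap GeoPy's `Nominatim.geocode`, which returns `None` when it finds no match.

```python
def resolve_location(locations, geocode):
    """Return (coordinates, reason) for one incident.

    locations: place names extracted from the incident text.
    geocode:   callable mapping a place name to (lat, lon) or None.
    """
    if not locations:
        return None, "no location mentioned"
    unique = sorted(set(locations))
    if len(unique) > 1:
        # Ambiguous: we cannot tell which place the incident happened at.
        return None, "more than one location"
    coords = geocode(unique[0])
    if coords is None:
        return None, "geocoder returned nothing"
    return coords, "ok"


# Stub geocoder with rough placeholder coordinates, so the sketch runs offline.
KNOWN_PLACES = {"Nollendorfplatz": (52.5, 13.35)}


def stub_geocode(name):
    return KNOWN_PLACES.get(name)


print(resolve_location([], stub_geocode))
print(resolve_location(["Nollendorfplatz", "Alexanderplatz"], stub_geocode))
print(resolve_location(["Unbekannter Ort"], stub_geocode))
print(resolve_location(["Nollendorfplatz"], stub_geocode))
```

Returning a reason alongside the coordinates makes it possible to count how many incidents fall into each case, which could help decide where further manual work would pay off most.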
Hope this was helpful and you can advance the project. Let me know how it goes and if I can help further.
Best, Filipe
Rebecca Peake wrote on Thursday, 11/11/2021 at 17:57:
Hello, I work for Datawrapper (https://www.datawrapper.de/), a software tool for creating charts and graphs. I am interested in creating an interactive map using the data you collected for this project, and was wondering if you could send it to me in CSV or JSON format? It will be posted on our blog (https://blog.datawrapper.de/). I will of course reference you and your project.
Thank you, Becca
View it on GitHub: https://github.com/fserro/LGBT-violence-Berlin/issues/1