DocNow / catalog

A simple catalog of Twitter ID Datasets
http://catalog.docnow.io/

please add iMCD: Twitter Data #48

Closed magdalenadrafi closed 5 years ago

magdalenadrafi commented 5 years ago

title: iMCD: Twitter Data
creator: Urban Big Data Centre
url: http://ubdc.gla.ac.uk/dataset/imcd-twitter-data
published: 2018 – 10 - 10
added: 2018–10-10T15:59:07Z
tweets: "65,000,000"
dates: 2014-07-01 - 2015-11-30
tags: iMCD Glasgow BetterTogether CommonWealthGame GlasgowRoads Scotnight Indyref BBC VoteYeas NoThanks Scotland Train Undecided BBCWestScot DailyMailUK OpenGlasgow MetroUK
description: >
  University of Glasgow’s unique Integrated Multimedia City Data (iMCD) Survey is a cross-sectional survey based on a sample of the general population in private residences across eight local authority areas of Glasgow and Clyde Valley. The purpose of the iMCD dataset was to provide a 360° overview of a life in the city, combining various datasets and methods of collection. The survey fieldwork was run by Ipsos MORI and took place between 15th April 2015 and 21st November 2015. This project was funded by ESRC and it was a result of collaboration across the University of Glasgow, Newcastle and Sheffield. The intention was to provide new innovative datasets with new methods and methodologies that could be used by the policy makers.

  iMCD consists of 5 main strands (Survey, GPS, life-logging devices, image analysis, textual media and multimedia data). The core of the iMCD is the Household Survey. Each of the five data strands are built on unique models of data sampling. As a part of the project, we also collected for example a sample of GPS and lifelogging sensors. Lifelogging sensors collected data through GPS devices and wearable cameras. Out of all participants 90 % agreed to carry the sensor device. Concurrent to the collection of the iMCD Household Survey data UBDC researchers have undertaken a significant information extraction exercise to capture data streams related to Glasgow from a variety of online sources.

  Twitter data comprises a large part of this collection, and this dataset comprises a selection of tweets during the period 1/12/14 - 30/11/2015 that arose from the greater Glasgow area. This, for example, may give insights into the citizens’ behaviour, reactions and moods in certain contexts or at particular times. The dataset can be queried through a bespoke online tool by specific hashtag or tweet term, in order to return statistical information or specific Tweet IDs. Captured tweets were those geolocated in Glasgow (based on a polygon around the geography), those from certain known Glasgow accounts (e.g. @BBCWestScot; @policescotland) and containing certain terms or hashtags (e.g. glasgow, or #glasgow2014). Please contact us if you wish to receive a step by step guide for 're-hydrating' the tweet ids.

  Tweet from Glasgow Users (user object)
  BBCWestScot DailyMailUK BBCScotWeather GlasgowCC Daily_Record GLA_Airport WeatherCast_UK STVGlasgow glabreakingnews GreaterGlasgPol OpenGlasgow TheEveningTimes trafficscotland policescotland Herald_editor TheScotsman scotairquality GdnScotland PeopleMakeGLA EverttHerald GlasgowSubway newsundayherlad EducationScot TheSunNewspaper TravelineScot BBCTravelScot CBItweets MetroUK FristinGlasgow PeopleMakeGLA FoEScot gtcs WhatsOnGlasgow scdiglobal edhubSctoland Heart_Glasgow

  Tweet with certain terms or hashtags (entities object)
  glasgow BetterTogether GlasgowCC Yougov glasgow2014 CommonWealthGame GlasgowRoads scotnight indyref CWG2014 goforitscotland BBCScotWeather scotdecides VoteYes NoThanks the45 Scotland yesScotland AlexSalmond CommonWealthGames NoBecause VoteNo Darling train undecided HopeOverFear PatronisingBTlady

In order to access iMCD: Twitter Data, please get in touch with UBDC https://www.ubdc.ac.uk/data-services/data-services
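
The capture rules described above (geolocation inside a polygon, known Glasgow accounts, and listed terms or hashtags) could be sketched roughly as follows. This is only an illustration, not UBDC's actual pipeline: the polygon is a made-up rectangle, the account and term sets are shortened samples, the three rules are assumed to be combined with OR, and each tweet is assumed to be a Twitter API v1.1 style JSON object.

```python
import re

# Hypothetical rough rectangle around greater Glasgow, as (lon, lat) pairs.
GLASGOW_POLYGON = [(-4.55, 55.75), (-4.00, 55.75), (-4.00, 55.95), (-4.55, 55.95)]
# Shortened samples of the account and term lists given in the description.
ACCOUNTS = {"bbcwestscot", "policescotland"}
TERMS = {"glasgow", "glasgow2014"}


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: count polygon edges crossed by a ray going east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def matches_capture_criteria(tweet):
    """Return True if any of the three assumed capture rules applies."""
    # Rule 1: tweet has a point geotag inside the polygon.
    coords = (tweet.get("coordinates") or {}).get("coordinates")
    if coords and point_in_polygon(coords[0], coords[1], GLASGOW_POLYGON):
        return True
    # Rule 2: tweet was sent by one of the known accounts (user object).
    screen_name = (tweet.get("user") or {}).get("screen_name", "")
    if screen_name.lower() in ACCOUNTS:
        return True
    # Rule 3: tweet text or hashtags (entities object) contain a listed term.
    hashtags = {
        h["text"].lower()
        for h in (tweet.get("entities") or {}).get("hashtags", [])
    }
    words = set(re.findall(r"\w+", tweet.get("text", "").lower()))
    return bool(TERMS & (hashtags | words))
```

A real polygon would follow the local authority boundaries rather than a rectangle, and the full account and term lists from the description would replace the samples.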

edsu commented 5 years ago

Thanks for submitting this dataset @magdalenadrafi!

It looks like you almost got it working, but there were em dashes in the dates instead of hyphens, which was causing the data not to work. We are in the midst of planning how to improve the catalog, so any feedback you have about trying to use it would be most welcome.

As for the data, the only question I have is whether the tweet id dataset is available on the web. From what I can see at the URL you provided, it doesn't look like it is downloadable, and people need to get in touch with you for access? If so, this would be the first dataset we've made available through the tweet id dataset catalog where the tweet ids are not immediately downloadable.
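
For anyone hitting the same dash problem, here is a minimal sketch of how a single-date field could be cleaned before parsing. It is an illustration only and not the catalog's own code: the function names are hypothetical, and the date formats are assumed from the `published` and `added` values in the submission above. It is not meant for the `dates` range field, where the spaced hyphen separates two dates.

```python
import re
from datetime import date, datetime

# Unicode dash look-alikes that editors often substitute for "-".
DASHES = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"
_DASH_TABLE = str.maketrans({ch: "-" for ch in DASHES})


def normalize_date_field(value: str) -> str:
    """Replace dash look-alikes with ASCII hyphens and drop spaces around them."""
    cleaned = value.translate(_DASH_TABLE).strip()
    return re.sub(r"\s*-\s*", "-", cleaned)


def parse_published(value: str) -> date:
    """Parse a YYYY-MM-DD value (assumed format of the `published` field)."""
    return date.fromisoformat(normalize_date_field(value))


def parse_added(value: str) -> datetime:
    """Parse a YYYY-MM-DDTHH:MM:SSZ value (assumed format of the `added` field)."""
    return datetime.strptime(normalize_date_field(value), "%Y-%m-%dT%H:%M:%SZ")
```

Both helpers raise `ValueError` on anything that still does not look like a date after cleaning, which would surface a bad entry at submission time.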

magdalenadrafi commented 5 years ago

Dear Ed,

Many thanks for this! I was not sure why it was not working – so thank you for helping with this, much appreciated.

We have just discussed the iMCD datasets we hold at UBDC, and we need only one sentence to be removed: "Out of all participants 90% agreed to carry the sensor device." The rest of the description is fine as it is.

My colleagues are currently developing a platform for the Twitter dataset we hold. Initially this dataset was going to be available only to users who register with UBDC first, but I will confirm with my manager and get back to you about this asap.

In the meantime, may I kindly ask you to remove the following sentence: "Out of all participants 90% agreed to carry the sensor device."

Many thanks and I will get back to you asap,

With kind regards,

Magdalena

edsu commented 5 years ago

I think that the sentence has been removed. Let me know if not!