twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License
15.66k stars 2.72k forks

Lat/Long location data #140

Closed cbjrobertson closed 6 years ago

cbjrobertson commented 6 years ago

I'd like to work on adding latitude and longitude location data to the twint output, in addition to the nearest city, which is apparently what it reports now. However, I don't want to duplicate your efforts if this is something you've already tried to do. Have @haccer or any other people working on the package looked into doing this?

pielco11 commented 6 years ago

Really interesting, but I do not have enough time to work on it. If you would like to help us, that would be really useful.

I did not talk with Cody about this, so I'm not sure whether he is working on it. Anyway, feel free to write code here or (better) create a PR (to dev).

cbjrobertson commented 6 years ago

Ok, cool:) I'll look into it.

haccer commented 6 years ago

It seems Tweets don't display location info on them anymore... or maybe I cannot find them. We'll figure this out eventually.

cbjrobertson commented 6 years ago

Yeah, I looked through as well and couldn't find anything. It makes sense, for obvious privacy reasons, that they wouldn't display such info publicly.

aaeissa commented 6 years ago

Coordinates used to be available at ['coordinates']['coordinates'] in the Tweet object. The docs still show this (https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object.html) but I think it's been removed as well.
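For anyone checking whether a payload still carries this field, here is a minimal sketch of reading it defensively. It assumes the tweet is already parsed into a plain dict and that `coordinates` is either null or a GeoJSON point, which stores longitude first; the function name and the sample dicts are hypothetical, not real tweets or twint code.

```python
from typing import Optional, Tuple


def extract_lat_long(tweet: dict) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) from a Tweet object dict, or None if absent."""
    point = tweet.get("coordinates")
    if not point:
        # Field missing or null: the tweet carries no exact location.
        return None
    coords = point.get("coordinates")
    if not coords or len(coords) != 2:
        return None
    # GeoJSON order is [longitude, latitude], so swap for the usual lat/long.
    longitude, latitude = coords
    return float(latitude), float(longitude)


# Hypothetical sample payloads for illustration only.
geotagged = {"id": 1, "coordinates": {"type": "Point", "coordinates": [12.3155, 45.4408]}}
stripped = {"id": 2, "coordinates": None}

print(extract_lat_long(geotagged))
print(extract_lat_long(stripped))
```

Returning None rather than raising makes it easy to count how many collected tweets actually have coordinates, which is the quickest way to confirm whether the field is still being populated.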

I started a project last fall that relied heavily on those coordinates. I just came back to it recently and noticed it was not collecting any data. I thought I had something misconfigured.

Maybe the location info is being stripped because of the recent GDPR compliance deadlines?

pielco11 commented 6 years ago

It seems that Twitter removed the location info from the Tweet, but you can still search for Tweets near a place with near:"Venice, Italy". There is also a within argument, e.g. within:15mi, which is self-explanatory. It's not a specific geocode, but it's still something.

cbjrobertson commented 6 years ago

The within argument also accepts fractions of a km/mi, so you can get quite precise.
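As a rough illustration of combining the two operators, the sketch below just assembles the search string. It assumes the radius suffix is `km` or `mi` and that fractional radii are accepted, as described in this thread; `build_geo_query` is a hypothetical helper, not part of twint.

```python
def build_geo_query(keyword: str, place: str, radius: float, unit: str = "km") -> str:
    """Assemble a search string using the near: and within: operators."""
    if unit not in ("km", "mi"):
        raise ValueError("unit must be 'km' or 'mi'")
    if radius <= 0:
        raise ValueError("radius must be positive")
    # The 'g' format drops a trailing '.0', so 15.0 becomes '15' and 0.5 stays '0.5'.
    return f'{keyword} near:"{place}" within:{radius:g}{unit}'


print(build_geo_query("flood", "Venice, Italy", 15, "mi"))
print(build_geo_query("flood", "Venice, Italy", 0.5))
```

The resulting string would then be passed as the search term to whatever tool runs the query.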

pielco11 commented 6 years ago

Yes, that's right... but the problem is that you have to specify the location up front (location -> tweet); you can't go the other way (tweet -> location).

cbjrobertson commented 6 years ago

Yes, it does seem that way...

Cole

haccer commented 6 years ago

Yeah, a while ago I added --location to collect the location from a Twitter user's profile because of that.

Going to close this for now. When I get a crazy idea for how to get tweet-attached location data, I'll do it.

Ali-khavanin commented 4 years ago

Hi there! If I may ask, how can I figure out whether a crawled tweet is in a certain area or not? For example, I'm crawling 100,000 tweets that contain a keyword, and I want to know which of those 100,000 tweets are within "35.715298, 51.404343, 15km". What should I do?

I really appreciate your help @pielco11 (sorry for being annoying :sweat_smile: )

pielco11 commented 4 years ago

@Ali-khavanin don't worry

So, 100k tweets is a lot, which makes this a hard case. I'd suggest splitting the searches as finely as you can. For example, you might run several scripts at the same time, each with different since/until, keywords, or from/to args, and each specifying that geo region as well. Tweets returned by the re-run are in the region; tweets from your original crawl that don't show up again are not. Basically, you have to re-run the query with the near param added.
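A minimal sketch of that workflow, under some assumptions: each tweet has a unique integer ID, the since/until windows are half-open date ranges, and the actual crawling is done elsewhere. `date_windows` and `label_by_region` are hypothetical helpers, and the IDs below are made up.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, Iterator, Tuple


def date_windows(start: date, end: date, days: int) -> Iterator[Tuple[date, date]]:
    """Yield consecutive (since, until) windows covering start up to end."""
    if days <= 0:
        raise ValueError("days must be positive")
    step = timedelta(days=days)
    since = start
    while since < end:
        # Clamp the last window so it never runs past the end date.
        until = min(since + step, end)
        yield since, until
        since = until


def label_by_region(all_ids: Iterable[int], geo_ids: Iterable[int]) -> Dict[int, bool]:
    """Mark each originally crawled tweet as inside the region if the geo re-run returned it."""
    in_region = set(geo_ids)
    return {tweet_id: tweet_id in in_region for tweet_id in all_ids}


# One search job per window; each job would be run once without and once with the geo filter.
for since, until in date_windows(date(2020, 1, 1), date(2020, 1, 10), 4):
    print(since.isoformat(), until.isoformat())

# Hypothetical IDs: the first list is the keyword-only crawl, the second the geo-filtered re-run.
print(label_by_region([101, 102, 103], [102]))
```

Matching on tweet ID avoids needing coordinates on the tweets themselves, which is the point of re-running the query with the region attached.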