twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License
15.66k stars, 2.72k forks

[CONTRIB] Dockerfile and Splunk Support #579

Closed: dmuth closed this 4 years ago

dmuth commented 4 years ago

Hi!

I learned of Twint a while ago and wanted to play around with it. I'm very heavily into Docker these days, but I couldn't find a Dockerized version, so I decided to make one:

https://github.com/dmuth/splunk-twint

You can run the Dockerized version of Twint right from the command line as follows:

bash <(curl -s https://raw.githubusercontent.com/dmuth/twint-splunk/master/twint) -u dmuth --year 2020 --since 2019-01-01

Along the way, I discovered that the Pandas module, which I wasn't using when calling Twint from the command line, was adding minutes to my build time. So I tried programmatically removing references to Pandas from the code when building the Docker container, and much to my surprise, it worked! This reduces the build time on my older iMac from approximately 15 minutes to 30 seconds. :-)
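As a rough illustration of the idea (not the actual code from the repository), the dependency side of this can be sketched as a small shell helper that drops the pandas entry from a pip-style requirements file before `pip install` runs during the image build. The helper name `strip_pandas` and the assumption that the dependency is listed in a requirements file are hypothetical. Any `import pandas` statements in the source would need separate handling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the twint-splunk repo):
# remove the pandas entry from a pip requirements file in place.
strip_pandas() {
    local req_file="$1" tmp
    tmp="$(mktemp)"
    # Keep every line that does not name the pandas package itself.
    # The trailing group stops the pattern from matching other packages
    # whose names merely start with "pandas" (e.g. pandas-stubs).
    # grep exits non-zero when no lines are kept, so tolerate that.
    grep -v -i -E '^[[:space:]]*pandas([^[:alnum:]_-]|$)' "$req_file" > "$tmp" || true
    # Write back through redirection so the file keeps its permissions.
    cat "$tmp" > "$req_file"
    rm -f "$tmp"
}

# Example use inside a build step (file name is an assumption):
# strip_pandas requirements.txt && pip install -r requirements.txt
```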

But wait, there's more! I'm a big-time user of Splunk at my day job (enterprise logging/analytics), and seeing that Splunk has a free license for up to 500 MB/day of data, I also use it out of the office as my analytics tool of choice. So I decided to build a Splunk app to ingest and visualize tweets downloaded with Twint. That app is also included with my Dockerized version of Twint.

Let me know what you think--and if you'd like to borrow my Dockerfiles, feel free! I also found a bug along the way, but I will report that in a separate issue. :-)

Best,

-- Doug

pielco11 commented 4 years ago

We could create a repository for this. Does that sound good to you? I'd give you write access so that you can push, edit, etc. anytime.

dmuth commented 4 years ago

Hmm, so like a separate repo under https://github.com/twintproject? Sure, why not? If we could get twint-splunk as the name, that would be great, and I could push future work there.

Thanks!

-- Doug

pielco11 commented 4 years ago

I created the repo with the name you suggested and added you as a collaborator with write access. You should get a notification soon!

dmuth commented 4 years ago

Alright, looks like I have access to https://github.com/twintproject/twint-splunk now.

Since it's under your organization, do you have any guidelines (technical or otherwise) for how you'd like to see stuff pushed there? I was planning to start by pushing a copy of my current repo, and then see whether GitHub will let me copy future commits/branches from https://github.com/dmuth/splunk-twint over to https://github.com/twintproject/twint-splunk as well.

Also, once it's up, I was going to mention it on my Twitter feed as well as write something up on my blog, unless you had other plans? If so, please let me know!

Thanks,

-- Doug

pielco11 commented 4 years ago

As of now there are no guidelines to follow. The only requests that I'd like to make are:

Feel free to create a post on your blog, and send me the link so that I can add it to our README.

About moving the code: you can add twint-splunk as a remote of your local splunk-twint clone and then push to it (twint-splunk). Otherwise you can just copy and paste the code, but remember not to copy the .git directory. Those are just my suggestions, feel free to do whatever you prefer!
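The first suggestion, adding a second remote and pushing to it, might look roughly like the sketch below. The helper name `push_to_new_remote` is made up for illustration, and the remote name and URL in the usage comment are assumptions based on the repository addresses mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: push an existing local clone, history included,
# to an additional remote. Run it from inside the local clone.
push_to_new_remote() {
    local name="$1" url="$2"
    # Register the remote only if a remote with this name is not already set up.
    git remote get-url "$name" >/dev/null 2>&1 || git remote add "$name" "$url"
    # Push all local branches, then all tags.
    git push "$name" --all
    git push "$name" --tags
}

# Usage, from inside the local splunk-twint clone (URL is an assumption):
# push_to_new_remote twint-splunk https://github.com/twintproject/twint-splunk.git
```

Because the same local clone now has both remotes, later commits can be sent to either one with an ordinary `git push <remote> <branch>`.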

dmuth commented 4 years ago

Okay, code has been uploaded to https://github.com/twintproject/twint-splunk

Most of the code is shell, and I did my best to ensure it was documented.

Let me know if you think the README is sufficient; if not, I can work on it. I'll write a blog post in a week or two, as I have a few unrelated things to clear off my plate first. :-)

-- Doug

pielco11 commented 4 years ago

At first glance it's amazing! I hope to try it out this weekend!

Please join osint.team (with either a Twitter or GitHub account) so that you can easily collect feedback from the OSINT community.