balajis / twitter-export

Tool for mass export of Twitter followers to Substack, Ghost, or email list.
MIT License
159 stars 9 forks

Candidate: Probot #13

Open steveblackmon opened 4 years ago

steveblackmon commented 4 years ago

A while back I built a prototype Twitter bot to demonstrate how to use the then newly announced, still-in-beta ‘Twitter Account Activity API’. ‘Probot’ could connect to Twitter on behalf of a single user and load their profile, posts, and direct messages into the local process.

In addition to the webhook needed to establish a bidirectional, real-time feed with the Twitter API, it exposed some additional endpoints for viewing user data and performing simple in-memory aggregations, and it could reply to incoming messages programmatically.
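
For context on the webhook: the Account Activity API periodically sends a challenge-response check (CRC) to the registered webhook URL, and the webhook has to answer with an HMAC-SHA256 of the `crc_token`, keyed by the app's consumer secret. The sketch below shows that handshake in isolation. It is not Probot's code; the function name and the sample values are made up for illustration.

```python
import base64
import hashlib
import hmac


def crc_response(crc_token: str, consumer_secret: str) -> dict:
    """Build the JSON body expected in reply to a CRC GET request.

    crc_token comes from the request's query string; consumer_secret is
    the API secret of the Twitter app that registered the webhook.
    """
    digest = hmac.new(
        consumer_secret.encode("utf-8"),
        crc_token.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    token = base64.b64encode(digest).decode("ascii")
    return {"response_token": "sha256=" + token}


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(crc_response("example-crc-token", "example-consumer-secret"))
```

A web framework handler would return this dictionary as the JSON body of a 200 response to the GET request carrying `crc_token`.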

I presented this work at the Austin Twitter Developer Meetup (which I organize) and committed much of it to Apache Streams.

https://www.meetup.com/Austin-Twitter-Developer-Community/events/243185192/

The next steps to make Probot more useful were obvious at the time: give it a database, a user interface, and the ability to process arbitrarily large datasets. I did that over the past week, and I’m pretty happy with Probot v0.2, which I believe meets the requirements to be a candidate for the bounty.

https://github.com/steveblackmon/probot/tree/balaji

Probot runs on a small collection of Docker containers (all OSS). It is still a single-user application, though with more engineering it could be turned into a hosted SaaS application whose users would not need to create their own Twitter application or interact with anything other than a web browser.

Once your follower data is collected (max speed is 20 x 15 x 4 = 1200 followers per hour, single shot) and loaded, you can browse, filter, search, and sort the list and send your intro message to up to 1000 profiles per day. All of the data can be queried via SQL or the embedded database's HTTP API for external integrations.
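
To get a feel for what those limits mean for a given account, here is a small back-of-the-envelope sketch. The 1200 followers per hour and 1000 messages per day figures are the ones quoted above. The reading of the three factors (20 profiles per request, 15 requests per rate-limit window, 4 fifteen-minute windows per hour) is an assumption, and the 60,000-follower account is hypothetical.

```python
import math

FOLLOWERS_PER_REQUEST = 20  # assumed meaning of the "20" factor
REQUESTS_PER_WINDOW = 15    # assumed meaning of the "15" factor
WINDOWS_PER_HOUR = 4        # assumed: four 15-minute rate-limit windows
MESSAGES_PER_DAY = 1000     # daily intro-message cap quoted in the post


def followers_per_hour() -> int:
    """Maximum collection speed implied by the three factors."""
    return FOLLOWERS_PER_REQUEST * REQUESTS_PER_WINDOW * WINDOWS_PER_HOUR


def collection_hours(follower_count: int) -> float:
    """Hours needed to collect every follower at maximum speed."""
    return follower_count / followers_per_hour()


def messaging_days(follower_count: int) -> int:
    """Whole days needed to message every follower at the daily cap."""
    return math.ceil(follower_count / MESSAGES_PER_DAY)


if __name__ == "__main__":
    n = 60_000  # hypothetical account size
    print(followers_per_hour())  # 1200
    print(collection_hours(n))   # 50.0
    print(messaging_days(n))     # 60
```

For large accounts the daily messaging cap, not the collection speed, is the slower of the two steps.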

Responses you receive while the application is running (along with any other direct messages or account activity events) are preserved in application logs with timestamps and other details, allowing you to prove consent to contact.

I’m working now to add the ability to browse, filter, and search all of your direct message conversations, and to parse consent keywords and any contact details provided, such as emails or phone numbers, into structured fields.
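
As a rough illustration of what parsing a reply into structured fields could look like, here is a minimal Python sketch. It is not the planned Probot implementation: the keyword list, the regular expressions, and the `ParsedReply` structure are all assumptions chosen for the example, and the phone pattern is deliberately loose.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword list; a real deployment would choose its own wording.
CONSENT_KEYWORDS = ("yes", "subscribe", "opt in", "sign me up")

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Loose pattern: optional "+", a digit, at least six digits or separators
# (spaces, dots, dashes, parentheses), then a final digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


@dataclass
class ParsedReply:
    consent: bool
    emails: list = field(default_factory=list)
    phones: list = field(default_factory=list)


def parse_reply(text: str) -> ParsedReply:
    """Extract a consent flag, emails, and phone numbers from one message."""
    lowered = text.lower()
    consent = any(
        re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
        for keyword in CONSENT_KEYWORDS
    )
    emails = EMAIL_RE.findall(text)
    # Normalize phone numbers by dropping everything except digits and "+".
    phones = [re.sub(r"[^\d+]", "", match) for match in PHONE_RE.findall(text)]
    return ParsedReply(consent, emails, phones)


if __name__ == "__main__":
    print(parse_reply("Yes, sign me up! me@example.com or +1 (512) 555-0100"))
```

Keyword matching like this will produce false positives (for example "yes" inside a refusal), so the raw message is still worth keeping alongside the structured fields.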

I would love for a community to form around this tool, as one has around Apache Streams, to manage the challenge of social media data and API interoperability. Streams contains API integrations with many more social networks beyond Twitter, support for programming on top of social network ‘export’ archive files, and integrations with third-party identity resolution services from which demographic, psychographic, and geographic data, emails, phone numbers, etc. can be purchased. Incorporating these integrations into Probot would increase its utility many-fold.

https://github.com/apache/streams

Thank you for considering this solution. If anyone is interested in collaborating in any way, please email me: sblackmon at apache dot org.

steveblackmon commented 4 years ago

Some updates:

image