5v3n / tweetlr

tweetlr crawls twitter for a given term, extracts photos out of the collected tweets' short urls and posts the images to tumblr. nice!

http://tweetlr.5v3n.com

archive functionality #12

Open afknapping opened 13 years ago

afknapping commented 13 years ago

tweets are lost within hours. but they are just 140 characters + metadata. since they are grabbed and analyzed by tweetlr anyway, why not dump them in a simple text file? or an ultra-light database table for later export?

5v3n commented 13 years ago

thought about that too - but you'd have to either configure another service like mongohq / couchone or set up a local db on your server.

i'd prefer a schema-free data storage in this case. if you insist on tables, you're on your own here ;-).

again, one could think of this as an optional feature which is enabled when conf/tweetlr.yml contains, let's say, mongohq credentials.
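
a minimal sketch of that opt-in check, assuming the parsed contents of conf/tweetlr.yml are already available as a dictionary. the `mongohq` section and its key names are hypothetical, not taken from tweetlr's actual config, and python is used here purely for illustration:

```python
# Hypothetical key names; tweetlr's real config layout may differ.
REQUIRED_KEYS = ("url", "user", "password")


def archive_enabled(config):
    """Return True only if a complete 'mongohq' section is present."""
    section = config.get("mongohq")
    if not isinstance(section, dict):
        return False
    # Every required credential must be present and non-empty.
    return all(section.get(key) for key in REQUIRED_KEYS)


config_without = {"search_term": "cat"}
config_with = {
    "search_term": "cat",
    "mongohq": {"url": "mongodb://example.invalid/db", "user": "u", "password": "p"},
}
print(archive_enabled(config_without))  # False
print(archive_enabled(config_with))     # True
```

with a check like this, archiving stays off unless someone deliberately adds credentials, so existing setups would keep working unchanged.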

btw - i'd love to have a feature request voting mechanism where you could +1 / -1 ideas...

afknapping commented 13 years ago

if writing in a text file is the easiest way to go, i'm all for it. there are probably tons of twitter-archiving tools out there already. so having the "log" just as a by-product would be a nice add-on. (it's always an option to re-parse that file and extract data if necessary).

proposal for formatting:

<status url> <line break> <message> <line break> <line break>

would be human-readable out of the box (and easy to parse if needed?).
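
a rough sketch of the proposed format, written in python for illustration only. the function names are made up, and collapsing line breaks inside a message is an assumption the proposal leaves open (without it, a tweet containing a blank line would be read back as two entries):

```python
import io


def write_entry(out, status_url, message):
    """Append one tweet: status URL, message, then a blank line."""
    # Collapse whitespace (including line breaks) inside the message so a
    # blank line only ever separates entries. This alters the original text.
    flat = " ".join(message.split())
    out.write(f"{status_url}\n{flat}\n\n")


def parse_entries(text):
    """Read entries back as (status_url, message) pairs."""
    entries = []
    for block in text.split("\n\n"):
        if not block.strip():
            continue  # skip the empty remainder after the last entry
        url, _, message = block.partition("\n")
        entries.append((url, message))
    return entries


log = io.StringIO()  # stands in for an open text file
write_entry(log, "https://twitter.com/example/status/1", "first tweet")
write_entry(log, "https://twitter.com/example/status/2", "second\ntweet")
print(parse_entries(log.getvalue()))
```

the file stays readable in any editor, and re-parsing it is a split on blank lines. the price is that metadata (author, timestamp, media urls) is lost unless more lines are added per entry.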

for maximum convenience of all parties, tweetlr could send the file to the provided login email address.

5v3n commented 13 years ago

*g* you sound a bit on the enterprisey side here ;-).

storing the tweets is a cool idea. you could even re-run the tumblr-posting to use new features for the older posts, too. like the new tagging feature that currently covers new posts only.

let's just use the json response & see if we add some tweetlr details. w/o having a closer look, i'd just write the twitter json response right to mongohq. this way, we have the accessing part covered already instead of mailing the text file around.
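
a storage-agnostic sketch of that idea, in python for illustration only: keep the twitter json as it came in and attach a `tweetlr` sub-document next to it. the sub-document and its field names (`tumblr_post_id`, `tags`, `archived_at`) are hypothetical, and the store is just a callable so no particular database client is assumed:

```python
import json
from datetime import datetime, timezone


def build_archive_record(raw_tweet_json, tweetlr_details):
    """Keep the Twitter response untouched and attach tweetlr's own details."""
    record = json.loads(raw_tweet_json)
    # Hypothetical 'tweetlr' sub-document; field names are illustrative only.
    record["tweetlr"] = dict(tweetlr_details)
    record["tweetlr"]["archived_at"] = datetime.now(timezone.utc).isoformat()
    return record


def archive(raw_tweet_json, tweetlr_details, store):
    """Hand the record to any callable store (e.g. a database insert)."""
    record = build_archive_record(raw_tweet_json, tweetlr_details)
    store(record)
    return record


stored = []  # in-memory stand-in for a hosted document collection
raw = '{"id": 1, "text": "look at this http://example.invalid/p"}'
archive(raw, {"tumblr_post_id": None, "tags": []}, stored.append)
print(stored[0]["text"])
```

with a mongodb client such as pymongo, a collection's `insert_one` method could be passed as `store`. because the original response is kept whole, the stored records could later be fed back through the tumblr-posting step to pick up features like tagging.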

afknapping commented 13 years ago

well... mongohq would be another external dependency - which again has external dependencies (S2). not judging, just saying. if it is easier to implement than txt-dumping, and even brings other benefits too... sure. not tech-savvy enough to help in evaluation here :)