ttezel / twit

Twitter API Client for node (REST & Streaming API)

Streaming API silently not working #9

Closed nicola closed 12 years ago

nicola commented 12 years ago

Hello there. Yesterday it worked, but when I tried today it didn't, and I didn't change anything in the code.

here is my code:

var Twit = require('twit');

var T = new Twit({
    consumer_key: 'vroFy2FvOR8En7saib1A4Q'
  , consumer_secret: '**'
  , access_token: '10818002-'
  , access_token_secret: '*'
});

var stream = T.stream('statuses/sample');

stream.on('tweet', function (tweet) {
  console.log(tweet);
});

stream.on('error', function (tweet) {
  console.log(tweet);
});

stream.on('limitation', function (tweet) {
  console.log(tweet);
});

If I run node twit.js I get nothing and the process ends. Any hints?

ttezel commented 12 years ago

I ran your exact code above locally and it works for me (I used my own oauth credentials). Can you check your credentials?

nicola commented 12 years ago

I did check my credentials. On my laptop it doesn't work, but if I upload it to Heroku it works. Is there a way I can debug this?

thanks n

ttezel commented 12 years ago

okay can you check if both your laptop and your heroku server are running the latest version of twit? (0.1.6)

nicola commented 12 years ago

Yep, same. Could it be Twitter API IP blocking? If yes, how can I debug that?

ttezel commented 12 years ago

yes, twitter's API could be blocking you. Try doing a REST API request like so:

T.get('statuses/home_timeline', function (err, reply) {
  if (err)
    return console.log('err', err)

  console.log('reply', reply)
})

The err object should give you an HTTP status code and any additional info provided by Twitter. You should still be able to make unauthenticated requests, by the way.

nicola commented 12 years ago

I didn't get any error. The REST API works but streaming does not; the streams have now stopped on the server too. :(

nicola commented 12 years ago

How do I check whether I'm blocked or not? Is there any workaround? (Even if I use another app/user it still doesn't work.)

zeiban commented 12 years ago

I would like to second this issue. I'm running into the same problem. Sometimes it will work for hours and then suddenly stop working. Sometimes when I stop node and restart, it will start working again, but not always. I don't think it's a rate limit, and I'm not blocked, as I can still make non-streaming requests. I'm running node v0.8.4 on a Windows desktop. I thought maybe it was a limit with node's request pool, but twit is the only code making requests.

ttezel commented 12 years ago

Hey, this usage looks fine. Are you running the latest version of twit?

zeiban commented 12 years ago

Yes, I just ran a 'npm install twit' just in case. I've got it running now with some extra debug output in the twit module. Hopefully it will give me a clue to what's going on when it happens again.

zeiban commented 12 years ago

I don't know if this is related, but I started listening on the 'error' events to try and figure out what's going on. I noticed that I'm getting the occasional parse error. It looks like it's trying to parse a partial JSON string. Here is an example.

Error parsing twitter reply: d_image_url_https":"https:\/\/si0.twimg.com\/profile_background_images\/583032745\/0bpxjch9h6zpm2r16v7x.jpeg","profile_sidebar_fill_color":"ffffff","url":"http:\/\/blog.logitech.com\/","name":"Logitech","listed_count":1337,"profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1675825649\/Logitech-logo-stacked-100k_normal.jpg","id":10503302,"verified":true,"time_zone":"Pacific Time (US & Canada)","utc_offset":-28800,"profile_sidebar_border_color":"ddffcc"},"id":231463598440988673}

This obviously isn't a complete JSON string. It's almost like the first part was lost.

ttezel commented 12 years ago

Yes, this is normal. Twitter sometimes doesn't send parseable JSON data. Twit parses any reply received from twitter, and the 'error' event is called for replies from Twitter that are not parseable. Otherwise if the reply is parseable, the object would be emitted from the 'tweet', 'limit' or other events (as specified in the readme).

Cheers,

Tolga

zeiban commented 12 years ago

So, the tweet is just lost if twitter decides to send partial json data? The tweet from my stream in the above example from logitech was never emitted as a 'tweet'. I only know about it because I caught the 'error' event for it.

ttezel commented 12 years ago

Twit follows Twitter's guidelines for parsing tweets from the streaming API (https://dev.twitter.com/docs/streaming-apis/processing#Parsing_responses). The tweet is not 'lost': if you want to recover the information in that tweet, you have to have a callback listening on the 'error' event and attempt to parse the partial JSON string from Twitter. I have not implemented this yet. Is that something you would want as a user? Also, did you not find a valid version of that same tweet while listening on the 'tweet' event?

Regards,

Tolga

zeiban commented 12 years ago

No, that specific tweet from logitech was never emitted as a 'tweet' event. I've got other examples as well. It seems to happen about once every hour. Maybe it depends on the tweet contents? I never noticed it until I started logging the 'error' event for the other issue. Would it help to try and pull the entire tweet using the REST API? I'll have to wait for one that doesn't cut off the id_str so I can get the ID.

kai-koch commented 12 years ago

I can confirm problems with the streaming API too. I connected to the sample endpoint for an hour.

The tweet throughput dropped from 23 tweets per second to 7.5 tweets per second by the end. If I run this test from home, I start with lower throughput and see a faster decline in the throughput rate (slower CPU and lower bandwidth).

Watching the Memory consumption of the node process, my guess is:

I will run a 24 hour test on the server to see if I get an out of memory error. Just started:

[2012-08-05T09:28:22.291Z] Collector Server started up
[2012-08-05T09:29:04.354Z] Current rate: 24.035 Chunks/per second Total count(1011) Total Error (0)
[2012-08-05T09:29:48.419Z] Current rate: 23.256 Chunks/per second Total count(2003) Total Error (0)
[2012-08-05T09:30:41.490Z] Current rate: 21.581 Chunks/per second Total count(3004) Total Error (0)
[2012-08-05T09:31:45.557Z] Current rate: 19.693 Chunks/per second Total count(4003) Total Error (0)
[2012-08-05T09:32:59.624Z] Current rate: 18.047 Chunks/per second Total count(5005) Total Error (0)
[2012-08-05T09:34:08.693Z] Current rate: 17.324 Chunks/per second Total count(6001) Total Error (0)
[2012-08-05T09:35:38.020Z] Current rate: 16.072 Chunks/per second Total count(7003) Total Error (0)
[2012-08-05T09:37:10.363Z] Current rate: 15.17 Chunks/per second Total count(8011) Total Error (0)
[2012-08-05T09:38:46.570Z] Current rate: 14.434 Chunks/per second Total count(9011) Total Error (0)

As you can see, in the first few minutes the throughput is already declining. I will report back on this in 24 hours. I will "beautify" and review the code in my own fork and see if I can spot a memory leak there, or in the oauth module.

On the "partial json string" Issue: It is a bug in the parser! benbarnett has a fork of twit, where the parser problem is fixed. See: https://github.com/benbarnett/twit/blob/master/lib/parser.js
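To illustrate the kind of parser problem discussed here, below is a minimal sketch of framing a stream into complete messages before calling JSON.parse. It assumes messages are delimited by '\r\n' and that a message may be split across chunks or share a chunk with others. The `createFramer` helper is hypothetical and is not the code from twit or from any of the forks mentioned in this thread.

```javascript
// Hypothetical helper, not twit's actual parser: splits a '\r\n'-delimited
// stream into complete messages and keeps any unfinished tail for later.
function createFramer(onMessage) {
  var buffer = '';

  return function feed(chunk) {
    buffer += chunk;
    var index;
    // Hand over every complete message currently sitting in the buffer.
    while ((index = buffer.indexOf('\r\n')) !== -1) {
      var message = buffer.slice(0, index);
      buffer = buffer.slice(index + 2);
      if (message.length > 0) onMessage(message); // ignore blank lines
    }
    // Whatever remains in `buffer` is an incomplete message; wait for more.
  };
}

var received = [];
var feed = createFramer(function (message) {
  received.push(JSON.parse(message));
});

// One tweet split across two chunks, then two tweets in a single chunk.
feed('{"id":1,"text":"hel');
feed('lo"}\r\n');
feed('{"id":2}\r\n{"id":3}\r\n');

console.log(received.map(function (tweet) { return tweet.id; }));
```

Parsing each chunk on its own would instead fail on both halves of the first tweet, which matches the "first part was lost" symptom reported above.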

kai-koch commented 12 years ago

[2012-08-05T10:19:44.580Z] Current rate: 8.112 Chunks/per second Total count(25004) Total Error (0)
[2012-08-05T10:23:51.197Z] Current rate: 7.811 Chunks/per second Total count(26003) Total Error (0)
[2012-08-05T10:28:12.649Z] Current rate: 7.53 Chunks/per second Total count(27036) Total Error (0)
[2012-08-05T10:32:45.391Z] Current rate: 7.249 Chunks/per second Total count(28003) Total Error (0)
[2012-08-05T11:20:04.756Z] Current rate: 4.343 Chunks/per second Total count(29109) Total Error (0)

I terminated the test because the rate declined too fast.
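The "Current rate" figures in the first log above appear to be cumulative averages, that is, the total count divided by the seconds elapsed since "Collector Server started up". The sketch below checks that reading against the logged timestamps and also computes the rate over a single interval. The function names are illustrative and are not taken from the test script.

```javascript
// Cumulative average: total chunks divided by seconds since startup.
function cumulativeRate(startIso, nowIso, totalCount) {
  var elapsedSeconds = (Date.parse(nowIso) - Date.parse(startIso)) / 1000;
  return totalCount / elapsedSeconds;
}

// Rate over the interval between two log entries only.
function intervalRate(prev, next) {
  var seconds = (Date.parse(next.time) - Date.parse(prev.time)) / 1000;
  return (next.count - prev.count) / seconds;
}

// Values copied from the log lines above.
var start = '2012-08-05T09:28:22.291Z';
var entryA = { time: '2012-08-05T09:37:10.363Z', count: 8011 };
var entryB = { time: '2012-08-05T09:38:46.570Z', count: 9011 };

var firstRate = cumulativeRate(start, '2012-08-05T09:29:04.354Z', 1011);
var lastRate = cumulativeRate(start, entryB.time, entryB.count);
var recentRate = intervalRate(entryA, entryB);

console.log(firstRate.toFixed(3));  // matches the logged 24.035
console.log(lastRate.toFixed(3));   // matches the logged 14.434
console.log(recentRate.toFixed(3)); // roughly 10 chunks per second
```

Because a cumulative average lags behind the current rate, the most recent interval is already noticeably slower than the logged figure for the same moment.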

zeiban commented 12 years ago

Great, I'm glad I was able to help identify the issue or at least point you in the right direction. I'll take a look at benbarnett's parser fix. Thanks!

kai-koch commented 12 years ago

I wrote a test to track the stream performance. I hate to admit it, but the memory leak I reported was caused by the changes I made to my local codebase.

When running the test against a clean install of twit, it performs at about 55 chunks / per second, with 5% cpu load and about 30 mb of ram consumption. Which seems about right on my old notebook.

To get the test see: https://github.com/kai-koch/twit/blob/master/tests/streamPerformance.js It runs with the current Version of twit. Don't mind the new handler for warning, status_withheld, user_withheld. They do not fire in the unmodified version of twit.

kai-koch commented 12 years ago

I fixed the parser in my fork. It seems that String.slice leaks memory in node.js, or blocks, or something. The parser might be optimized by using regular expressions instead of String.split, but I am too dumb to get the regex right. :-/

matteoagosti commented 12 years ago

I tried a slightly different approach in my fork but, as you did, I limited my changes to parser.js: https://github.com/matteoagosti/twit

matteoagosti commented 12 years ago

@kai-koch I ran your benchmarking test and with the sample stream I got an average of 65 chunks per second, 0.4% CPU consumption and 16MB of memory allocated.

By changing the endpoint to "statuses/filter" and tracking "google" I got an average of 3.501 chunks per second, 0.5% CPU consumption and 18MB of memory allocated.

kai-koch commented 12 years ago

@matteoagosti I am currently running your parser on WinXP on a single-core P4 at 2.60 GHz from home. I get about 30 messages per second. Memory is around 20 MB with a peak at 30 MB, and CPU usage is ~2% (+-2%).

Your parser does not check whether there is more than one tweet in a chunk, so I get errors if two tweets are present in one chunk.

[2012-08-21T08:23:22.584Z] Current rate: 30.063 Chunks/per second Total(76659) Error(3) Tweets(70058) Deletes(6598) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)

matteoagosti commented 12 years ago

You are right, it actually fails when 2 tweets are in the same chunk. To prevent parsing of already parsed data I was starting from the incoming chunk rather than from scratch. Simply getting rid of this fixes the issue.

kai-koch commented 12 years ago

@matteoagosti: 2 short remarks on your code:

Lines 7-13:

var Parser = module.exports = function ()  { 
  this.message = '';

  EventEmitter.call(this);
};

util.inherits(Parser, EventEmitter);

Wouldn't it be better to declare this.message as

Parser.prototype.message = ''

after util.inherits(...)? I ask this because I do not know exactly how the call to the parent constructor and util.inherits(..) work. Do they overwrite all prototype properties defined before they were called, or do they simply add the parent prototype properties?

on line 31: the start variable is declared a second time.

matteoagosti commented 12 years ago

Declaring the variable as you suggested would lead to a shared variable across Parser instances (even though there is actually only one instance).
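The difference between an instance property and a prototype property can be sketched as follows. This is a simplified illustration with a hypothetical `pending` array added for contrast; it is not the code from either fork. Note that a string default on the prototype is only shared until an instance assigns to it, at which point the instance gets its own copy, whereas a mutable value such as an array really is shared. Defining prototype properties after util.inherits(...) is the safer order, since, as far as I know, older Node versions replaced the constructor's prototype object inside util.inherits.

```javascript
var EventEmitter = require('events').EventEmitter;
var util = require('util');

// Instance properties: every Parser gets its own values in the constructor.
function Parser() {
  EventEmitter.call(this);
  this.message = '';
  this.pending = [];
}
util.inherits(Parser, EventEmitter);

// Prototype properties (hypothetical variant): defaults live on the prototype.
function SharedParser() {
  EventEmitter.call(this);
}
util.inherits(SharedParser, EventEmitter);
SharedParser.prototype.message = '';
SharedParser.prototype.pending = [];

var a = new Parser(), b = new Parser();
a.pending.push('x');
console.log(b.pending.length); // 0: each instance owns its array

var c = new SharedParser(), d = new SharedParser();
c.pending.push('x');
console.log(d.pending.length); // 1: both instances mutate the same array

c.message += 'x';              // assignment creates an own property on c
console.log(JSON.stringify(d.message)); // "": d still sees the prototype default
```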

Regarding line 31 you are right, I forgot to update it as it is coming from the original code; I tried to minimize the number of changes to ease the pull request.

I also updated my fork again and increased the parse speed by preventing a useless splice: https://github.com/matteoagosti/twit/commit/e9070149fdc04f8d457b17265ac9b4f11cddeb8c

I created a test to check what really happens during each chunk parse and I noticed that you may get 8 simultaneous tweets :P Very unpredictable.

kai-koch commented 12 years ago

@matteoagosti: I am currently running my latest parser version, https://github.com/kai-koch/twit/blob/master/lib/parser.js, and got about 10% better performance than with your old parser, and about 13 MB memory usage (peak 27 MB).

When it's done, I'll try your new one.

[2012-08-21T10:19:22.721Z] Current rate: 33.691 Chunks/per second Total(37396) Error(0) Tweets(34376) Deletes(3020) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)

matteoagosti commented 12 years ago

@kai-koch I finished testing my new parser. These are the results:

[2012-08-21T11:42:18.212Z] Current rate: 36.204 Chunks/per second Total(260495) Error(0) Tweets(243673) Deletes(16822) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)
Server shutdown. Uptime: 119.92008333333334 min
Starttime was: 2012-08-21T09:42:23.007Z

I'll test your new one also.

p.s. Node 0.8.5, MacBook Air 1.7GHz i5 Avg CPU 1.3%, Avg Mem 14MB

kai-koch commented 12 years ago

@matteoagosti: I think twitter is messing with us. :) I get about 42 chunks per second currently, with your new parser. :)

[2012-08-21T11:58:20.680Z] Current rate: 42.117 Chunks/per second Total(82970) Error(0) Tweets(77552) Deletes(5418) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)

ps: node 0.8.6

matteoagosti commented 12 years ago

@kai-koch LOL, yep :) The stream is not guaranteed to always go at the same pace, which is why we notice the differences. However, I think that over a time span of 2 hours you get a reasonably good overview. In general I think that the difference between our solutions is mainly in the way we do the parsing:

We could benchmark both approaches. However right now I'm fine with any of the two as finally I got back the stream working :P

kai-koch commented 12 years ago

@matteoagosti: Found a bug in my parser: On low traffic streams Twitter sends '\r\n' to keep the connection alive. So a check for empty array elements is needed or we would get a Syntax Error: "Unexpected end of input" from JSON. Fix: https://github.com/kai-koch/twit/commit/cbb7bfe3787c2912767155477e7e46ea6ab93203
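The keep-alive case can be reproduced without a network connection. The sketch below assumes a split-based parse step and, for simplicity, chunks that contain only whole messages; `parseSegments` is a hypothetical helper and not the code from the linked commit.

```javascript
// Hypothetical split-based parse step, not the code from either fork.
function parseSegments(chunk) {
  var results = [];
  chunk.split('\r\n').forEach(function (segment) {
    // A keep-alive '\r\n' or a trailing delimiter yields empty segments.
    if (segment.trim() === '') return;
    try {
      results.push(JSON.parse(segment));
    } catch (err) {
      results.push({ parseError: err.message, raw: segment });
    }
  });
  return results;
}

// Without the empty check, JSON.parse('') is what throws the SyntaxError.
var threw = false;
try {
  JSON.parse('');
} catch (err) {
  threw = err instanceof SyntaxError;
}
console.log(threw);

console.log(parseSegments('\r\n')); // keep-alive only: nothing to parse
console.log(parseSegments('{"id":1}\r\n\r\n{"id":2}\r\n'));
```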

Your parser does that already.

ttezel commented 12 years ago

Closing this issue as I merged in updates to the parser from @matteoagosti.

zeiban commented 12 years ago

Great!

nicola commented 12 years ago

I'm still getting nothing from Twitter :(

var Twit = require('twit')

var T = new Twit({ consumer_key: 'xxx' , consumer_secret: 'xxx' , access_token: 'xxx' , access_token_secret: 'xxx' });

var stream = T.stream('statuses/sample')

stream.on('tweet', function (tweet) { console.log(tweet) })

ttezel commented 12 years ago

Hey @nicolagreco, your usage of twit seems fine. Most likely you aren't getting tweets for one of two reasons:

1) your OAuth credentials are invalid, or
2) you are opening multiple streams to the Twitter API, and twit is closing the one you're currently running.

Can you try running the code below? What statusCode do you get? And are you getting any 'disconnect' messages?

Before you run the code though, make sure to update twit to the latest version (run npm update twit). You should be using v1.0.2.

var Twit = require('twit')

var T = new Twit({
    consumer_key: 'xxx'
  , consumer_secret: 'xxx'
  , access_token: 'xxx'
  , access_token_secret: 'xxx'
})

var stream = T.stream('statuses/sample')

stream.on('tweet', function (tweet) {
  console.log(tweet)
})

stream.on('disconnect', function (disconn) {
  console.log('disconnect')
})

stream.on('connect', function (conn) {
  console.log('connecting')
})

stream.on('reconnect', function (reconn, res, interval) {
  console.log('reconnecting. statusCode:', res.statusCode)
})

kumarabhirup commented 6 years ago

Hey, as you said, we can't open many streams to the Twitter API, and Twit closes the stream that is currently running. None of my streams are working, or some work only for a while. This must be the problem you described. Is there a way to open more than 2 streams in one js file?

maziyarpanahi commented 6 years ago

@KumarAbhirup there is no problem with having multiple streams in the same file or multiple files as long as all these streams have their own Twitter Application! Twitter won't allow connecting to one application multiple times and it will close it. Create multiple applications, use the credentials for each stream, and open them anywhere you like :)

kumarabhirup commented 6 years ago

Hey, there were two streams running. I completely shut down one. Why is the second one not working? Any idea?

kai-koch commented 6 years ago

If you are running the streams in one file, the streams most likely use the same IP. My guess is that Twitter does not allow 2 streams from the same IP address, even if each has a different auth token.

You would need a second IP address, even if it is just a virtual server on the same machine with a different IP. Twitter is very strict when it comes to giving stuff away for free. If you cannot set up a 2nd IP, dig through Twitter's TOS to see if they allow multiple connections from one IP address.