SpiderStrategies / node-tweet-stream

Node twitter module to hook into the public filter streaming, seamlessly updating the tracking keywords.

stops without any error after a period of time #11

Closed jjmaceda closed 9 years ago

jjmaceda commented 9 years ago

Hi, when I track uncommon terms like @cnet or tesla (which should match a couple of tweets every few minutes), the stream just stops after a couple of seconds. I'm pretty sure there are more tweets available, but there are no errors, nothing. If I use a very common track like apple, it works perfectly (I'm assuming because hundreds of tweets contain that word). Any ideas?

I really like this package. It is the most complete I have seen; the others don't take the rate limits into account, etc.

nathanbowser commented 9 years ago

@jjmaceda I'll see if I can reproduce.

nathanbowser commented 9 years ago

Hi @jjmaceda. I just ran a little app that logs "tesla" tweets to the console. There's not a lot of data, but it's been running for over 5 minutes, and I'm still seeing tweets print to my console.

Can you come up with a test case or post some sample code that demonstrates the problem?

nathanbowser commented 9 years ago

Looks like the problem shows up after 90 seconds; thanks to @nkzawa for PR #12. Closing in favor of that PR.

jjmaceda commented 9 years ago

Hey, thanks for this. However, I'm still facing the same problem. I dug a bit into the code and noticed that it happens when Twitter sends a blank line, as the docs mention:

"The Streaming API will send a keep-alive newline every 30 seconds to prevent your application from timing out the connection."

After that, the parser's transform method is not called anymore, but the data is still coming in. Maybe it's something with the split method here?

this.parser = this.stream.pipe(split()).pipe(new Parser)

thanks!
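
To make the keep-alive case concrete, here is a minimal sketch of a line parser that skips blank lines and still acknowledges every chunk. It is not the module's actual Parser; createLineParser and createParserStream are hypothetical names, and the assumption is that each message arrives as one JSON object per line.

```javascript
const { Transform } = require('stream')
const { StringDecoder } = require('string_decoder')

// Hypothetical helper (not the module's Parser): turns raw chunks into parsed
// JSON objects and skips the blank keep-alive lines.
function createLineParser () {
  const decoder = new StringDecoder('utf8')
  let buffer = ''
  return function feed (chunk) {
    buffer += typeof chunk === 'string' ? chunk : decoder.write(chunk)
    const lines = buffer.split(/\r?\n/)
    buffer = lines.pop() // the last piece may be an incomplete line
    const messages = []
    for (const line of lines) {
      if (line.trim() === '') continue // keep-alive newline, nothing to parse
      try {
        messages.push(JSON.parse(line))
      } catch (err) {
        // Malformed line: skipped in this sketch, real code might report it
      }
    }
    return messages
  }
}

// Wraps the helper in a Transform. The callback runs for every chunk,
// including chunks that produce no messages; a Transform that never calls
// its callback stops receiving further chunks.
function createParserStream () {
  const feed = createLineParser()
  return new Transform({
    readableObjectMode: true,
    transform (chunk, encoding, callback) {
      for (const message of feed(chunk)) this.push(message)
      callback()
    }
  })
}
```

Whether a skipped callback is what happens in this module is not established in this thread; the sketch only shows a parser in which a blank line cannot stall the pipeline.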

nathanbowser commented 9 years ago

@jjmaceda I'm actually looking into this exact issue now! Will keep you posted.

nathanbowser commented 9 years ago

Do you get any events if you do this:

twitter.on('reconnect', function (msg) {
  console.log(msg)
})

nathanbowser commented 9 years ago

@jjmaceda I can't seem to reproduce the problem with new lines. I see new line feeds coming in and then future tweets seem to work just fine. There's a test for this, too. Can you try out this branch https://github.com/SpiderStrategies/node-tweet-stream/pull/14 and see if that helps things on your end?

I'd attach to the reconnect event and see if you're reconnecting frequently. You may be hitting Twitter's rate limits, too.
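
One way to judge whether reconnects are "frequent" is to count them over a sliding window. A rough sketch follows, assuming only that the client emits a 'reconnect' event as shown earlier; trackReconnects, the 15-minute window, and the event payload are hypothetical.

```javascript
const { EventEmitter } = require('events')

// Hypothetical helper: remembers when each 'reconnect' fired and reports how
// many happened within the last windowMs milliseconds.
function trackReconnects (emitter, windowMs, now = Date.now) {
  const stamps = []
  emitter.on('reconnect', function () {
    stamps.push(now())
  })
  return function recentCount () {
    const cutoff = now() - windowMs
    while (stamps.length && stamps[0] < cutoff) stamps.shift()
    return stamps.length
  }
}

// Stand-in emitter for illustration; in the thread's code this would be the
// Twitter instance that emits 'reconnect'.
const client = new EventEmitter()
const recentReconnects = trackReconnects(client, 15 * 60 * 1000)
client.emit('reconnect', { type: 'example' }) // hypothetical payload
console.log('reconnects in the last 15 minutes:', recentReconnects())
```

A count that keeps climbing would point toward rate limiting or an unstable connection rather than a parsing problem.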

jjmaceda commented 9 years ago

OK, this is what I'm doing to test: I'm using the new branch with tesla and @cnet as tracks.

In this piece of code:

res.on('data', function (data) {
  console.log('data', data.toString());
  clearTimeout(self.timeout)
  self.timeout = setTimeout(close, interval)
})

I added a console.log for each new data event. All works well and you can see the JSON string in the terminal, but once I get a data event with an empty line or maybe a newline in it, the tweet event stops being called, though you will still see the data coming in on the terminal. Make sense?

And I'm not getting any reconnect event for the moment
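
The data handler quoted above resets a timer on every chunk, which is a stall watchdog: if nothing arrives within the interval, not even a keep-alive newline, the connection is treated as dead. A minimal standalone sketch of that pattern follows; createStallWatchdog is a hypothetical name, and the 90-second figure in the usage comment is an assumption based on the 30-second keep-alive and the 90 seconds mentioned earlier in this thread.

```javascript
// Hypothetical helper mirroring the clearTimeout/setTimeout pattern quoted
// above. The timers argument exists only so the sketch can be tested with
// fake timers; by default it uses the global functions.
function createStallWatchdog (ms, onStall, timers = { setTimeout, clearTimeout }) {
  let handle = null
  return {
    // Call on every 'data' event, keep-alive newlines included
    touch () {
      if (handle !== null) timers.clearTimeout(handle)
      handle = timers.setTimeout(function () {
        handle = null
        onStall()
      }, ms)
    },
    // Call when the connection is closed on purpose
    stop () {
      if (handle !== null) timers.clearTimeout(handle)
      handle = null
    }
  }
}

// Possible wiring (res and reconnect are placeholders, not the module's API):
// const watchdog = createStallWatchdog(90 * 1000, reconnect)
// res.on('data', function () { watchdog.touch() })
```

Because the watchdog is fed by raw data events, it keeps getting reset in the situation described here (data arriving but no tweet events), so it would not trigger a reconnect in that case.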

nathanbowser commented 9 years ago

Am I understanding this correctly?

You have a caller.js that looks like this:

var Twitter = require('node-tweet-stream')
  , t = new Twitter({
   ..
  })

t.on('tweet', function (tweet) {
  console.log(tweet.text)
})

t.on('error', function (err) {
  console.log('Oh no')
})

t.on('reconnect', function (msg) {
  console.log(msg)
})

t.track('tesla')
t.track('@cnet')

And in the src for this module, you modified it so you print each data event to the console

res.on('data', function (data) {
  console.log('data', data.toString());
  clearTimeout(self.timeout)
  self.timeout = setTimeout(close, interval)
})

When you run caller.js, you see a few tweets, then a pause, and a new line feed (every 30 seconds based on twitter's API). After you see the 'data ' message from modifying the source, you'll only see data events and logs from the src console.log, and not from your caller.js on('tweet')

I'm thinking sample output may look like this:

$ node caller.js
data {text: 'Elon Musk loves Tesla' ...}
Elon Musk loves Tesla
data {text: 'Tesla Model D announcement tonight' ...}
Tesla Model D announcement tonight
data 

data {text: 'Tesla stock was a great buy 1.5 years ago' ...}
data {text: 'I should buy Tesla stock' ...}

jjmaceda commented 9 years ago

Correct!! When I see the empty data is when the tweet events stop being called.

nathanbowser commented 9 years ago

Yea, I run that and I can't reproduce it, but I totally see where there could be an issue. I think it's an issue with timing and the reconnects.

What happens if you try this:

t.track('tesla', false)
t.track('@cnet', false)
t.reconnect()

nathanbowser commented 9 years ago

Another thing you could try... with your original code, run the app and then from another terminal see how many open connections you have to twitter. You could do something like this:

lsof -i |grep node

Either way, I can't reproduce this. If you can come up with a test case that fails, that'd be awesome.

jjmaceda commented 9 years ago

Yep, tried that, same issue. OK, I will create a test case, thanks!

jjmaceda commented 9 years ago

I tried that last command; it seems that I have only one connection to port 443:

node 54712 jjmaceda 12u IPv4 0xf1461e79aa6cc9b5 0t0 TCP *:hbci (LISTEN)
node 54712 jjmaceda 13u IPv4 0xf1461e79aa6ce19d 0t0 TCP 192.168.0.12:49994->r-199-59-148-229.twttr.com:https (ESTABLISHED)
node 54712 jjmaceda 15u IPv4 0xf1461e799d51b9b5 0t0 TCP 192.168.0.12:hbci->192.168.0.12:49995 (ESTABLISHED)
node 54712 jjmaceda 16u IPv4 0xf1461e79aa65219d 0t0 TCP 192.168.0.12:hbci->192.168.0.12:49996 (ESTABLISHED)

nkzawa commented 9 years ago

Even after https://github.com/SpiderStrategies/node-tweet-stream/pull/14, we still have a similar problem (frozen without any errors). What's the status of this issue?

nathanbowser commented 9 years ago

@nkzawa How long approximately until it starts freezing?

@jjmaceda Any luck producing a failing test?

nkzawa commented 9 years ago

@nathanbowser every few days or once per day

nkzawa commented 9 years ago

It may be a bit better now, but not sure.

jjmaceda commented 9 years ago

Sorry for the delay, I'll get back to you with some tests this afternoon.

nkzawa commented 9 years ago

@jjmaceda thank you :)

nathanbowser commented 9 years ago

I'm looking into this again, and I'm starting to think there may be a problem when there's a network hiccup.

nathanbowser commented 9 years ago

I may retract my previous comment. That stuff looks like it's working.

@jjmaceda said they saw data come in for data events, but it's not emitting a 'tweet' event from the parser. That makes me think it's something with this.parser.unpipe(this)

nathanbowser commented 9 years ago

I spun up an EC2 micro box and ran this tracking 'tesla' for a few hours and wasn't able to reproduce. Hopefully @jjmaceda can come up with reproducible steps.

nathanbowser commented 9 years ago

Good news... It took over 36 hours of a running process, but I finally caught this problem. Here's a gist of some output that I can use to create a test case and fix this thing. I'm betting the error "Write after end" is what causes this to stop emitting tweets.

https://gist.github.com/nathanbowser/7ba5cb63cd57c40aa528

Additionally, this module is leaking memory, but that can be fixed in another issue.
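
For context, Node reports "write after end" when something writes to a stream after end() has been called on it. One common way to get there is piping a new source into a destination that an earlier pipe() already ended. The sketch below shows two defensive patterns under that assumption; safeWrite and pipeKeepingOpen are hypothetical helpers, and whether either matches the eventual fix in this module is not confirmed here.

```javascript
const { PassThrough } = require('stream')

// Hypothetical guard: write only while end() has not been called on dest.
function safeWrite (dest, chunk) {
  if (dest.writableEnded || dest.destroyed) return false
  dest.write(chunk)
  return true
}

// By default source.pipe(dest) calls dest.end() when source ends, so a later
// connection piped into the same dest would write after end. Passing
// { end: false } leaves dest open for the next source.
function pipeKeepingOpen (source, dest) {
  return source.pipe(dest, { end: false })
}

// Illustration with stand-in streams (not the module's real objects)
const parser = new PassThrough()
parser.resume() // discard output so the stand-in never backs up
const firstConnection = new PassThrough()
pipeKeepingOpen(firstConnection, parser)
firstConnection.end('first connection\n')
```

An alternative with the same effect is to build a fresh parser for every connection instead of reusing one across reconnects.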

nkzawa commented 9 years ago

@nathanbowser Great!!! This must be the cause. Thank you very much :)

jjmaceda commented 9 years ago

@nathanbowser great news, is there a fix now? I tried the reconnect-errors branch but the error persists. Thanks anyway for all the effort!!!

nkzawa commented 9 years ago

It seems our problem was solved! Thank you very much for your help :)

nathanbowser commented 9 years ago

Awesome!

nathanbowser commented 9 years ago

Our process has been running for over 2 weeks and we haven't hit the stale problem, so hopefully this is good to go!