cstephen / hashtag-count

Count hashtag occurrences over time using Twitter's Streaming API.
MIT License

No more results after ~ 1 week #1

Closed flegfleg closed 7 years ago

flegfleg commented 7 years ago

Hi, thanks for your script. I am using a variation of "unlimited.js" to track ~40 hashtags, and it works fine.

But after around one week, the script returns 0 results for all the hashtags I am tracking. Restarting does nothing; I have to create a new Twitter app (credentials, keys) and enter those in the config to make it run again.

Any ideas how to solve this would be greatly appreciated. Thanks, Flo

cstephen commented 7 years ago

Hi @flegfleg, I'm in the process of trying to replicate this. I've had a process running for 3 days so far. I'll let you know if it's still reporting results after 1 week, but I'll keep it running indefinitely and see what I can find.

Since you needed to create new Twitter app credentials to get it working again, I'm wondering if your old credentials were blocked by Twitter. If so, I wonder what caused it. The Twitter Streaming API is supposed to keep connections open indefinitely unless something unusual happens. (See "Disconnections" on this page.)

This module needs to do a better job of reporting disconnects and attempting to reconnect. I'll create a new issue for that.

flegfleg commented 7 years ago

Dear @cstephen, thanks for your reply! I am quite new to node.js, but if you find a way to add reporting, that would be great.

cstephen commented 7 years ago

Just a quick update, my process has been running for 10 days now and is still getting hashtag results. @flegfleg, are your original Twitter app credentials still blocked even after a week has passed?

I've explored the underlying twit module that the hashtag-count module uses for its Twitter Streaming API connections. The twit module has some built-in safeguards to limit the number of connection attempts in a short amount of time. It will wait longer to connect or reconnect if it knows Twitter is rate limiting it.

hashtag-count benefits from these same safeguards, but I think you can still run into trouble if you stop/start your script too often or try to run multiple instances of hashtag-count simultaneously using the same set of Twitter application credentials. I've run into situations where my script has had to wait 30+ seconds to reestablish a connection. So far I have not had my application credentials blocked permanently, however.
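To make the waiting behaviour described above more concrete, here is a minimal sketch of the reconnect delays that Twitter's Streaming API documentation recommended: linear backoff for network errors, exponential backoff for HTTP errors, and a slower exponential backoff after an HTTP 420 rate-limit response. This is not twit's actual code; the function name `reconnectDelayMs` and the `kind` labels are hypothetical and only illustrate the idea.

```javascript
// Sketch only: NOT twit's implementation. Names are hypothetical.
// attempt is 1 for the first retry, 2 for the second, and so on.
function reconnectDelayMs(kind, attempt) {
  switch (kind) {
    case 'network':
      // TCP/IP-level errors: add 250 ms per attempt, capped at 16 seconds.
      return Math.min(250 * attempt, 16000);
    case 'http':
      // Other HTTP errors: start at 5 seconds, double each time, cap at 320 seconds.
      return Math.min(5000 * Math.pow(2, attempt - 1), 320000);
    case 'rateLimit':
      // HTTP 420 (rate limited): start at 1 minute and double each attempt.
      return 60000 * Math.pow(2, attempt - 1);
    default:
      throw new Error('unknown error kind: ' + kind);
  }
}

// Delays for the first three retries after an HTTP error.
console.log([1, 2, 3].map((n) => reconnectDelayMs('http', n)));
// prints [ 5000, 10000, 20000 ]
```

Under this schedule, a wait of 30+ seconds is already reached on the fourth retry after an HTTP error (40 seconds), which fits the delays mentioned above if a script is stopped and started repeatedly.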

I've started working on issue #2. The next version of hashtag-count will communicate connects/disconnects and stop outputting false 0 results while in a disconnected state.

This issue (#1) will serve as a reminder that I need to warn hashtag-count users about Twitter's rate limiting / blocking in the README.

cstephen commented 7 years ago

@flegfleg, a new version of hashtag-count (1.1.0) has been published that no longer records false 0 results when the Twitter Streaming API connection is inactive. It also provides three new optional callback functions (connectingCb, reconnectingCb, and connectedCb) that allow you to react to connect/disconnect events. You can upgrade your version by running npm update.

See the new "Connection status callbacks and a warning about rate limiting" section of the README and the updated example scripts.
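As a rough illustration of how the three callbacks might be used, the sketch below builds them with a small helper that also tracks the last reported connection state. The helper `makeStatusCallbacks` and the `status` object are hypothetical, and the commented-out `start()` call assumes the option names used in the README; check the README and example scripts for the exact usage.

```javascript
// Hypothetical helper: builds the three status callbacks named in this thread
// and records the most recent connection state.
function makeStatusCallbacks(log) {
  const status = { state: 'idle', reconnects: 0 };
  return {
    status,
    connectingCb() {
      status.state = 'connecting';
      log('Connecting to the Twitter Streaming API...');
    },
    reconnectingCb() {
      status.state = 'reconnecting';
      status.reconnects += 1;
      log('Reconnecting (attempt ' + status.reconnects + ')...');
    },
    connectedCb() {
      status.state = 'connected';
      log('Connected.');
    }
  };
}

const cbs = makeStatusCallbacks(console.log);

// Assumed usage, with `hc` being a configured hashtag-count instance:
// hc.start({
//   hashtags: ['#example'],
//   interval: '30 seconds',
//   intervalCb: function (err, results) { /* handle counts */ },
//   connectingCb: cbs.connectingCb,
//   reconnectingCb: cbs.reconnectingCb,
//   connectedCb: cbs.connectedCb
// });
```

A rising `status.reconnects` count is a hint that the script is being disconnected repeatedly, which is the situation the rate-limiting warning is about.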

Also, for what it's worth, my test process has been running continuously for 14 days without experiencing any disconnects. I'm going to keep it running for as long as I can to see if any issues come up.

I was able to force disconnect events for development by setting my twit connection timeouts unusually low (10 seconds instead of 60 seconds), but I did that in a separate process.

flegfleg commented 7 years ago

Thanks a lot @cstephen! Quite busy with other projects right now, will check it out!

cstephen commented 7 years ago

I'm closing this issue since I've verified that a hashtag-count process can run longer than 1 week without experiencing any problems, and the latest version is now able to report disconnect/reconnect events.

I'm assuming the Twitter Streaming API credentials were simply blocked by Twitter in this case after some mysterious (undocumented) repeated connection threshold was crossed. I don't think we can ever be sure without some comprehensive timing experiments, but I don't think we can do that without abusing Twitter.

flegfleg commented 7 years ago

Sorry for my late answer, @cstephen. Thanks for your efforts; it works for me now, too. 👍