Closed. danielcarter closed this issue 5 years ago.
Hi @danielcarter,
have you been running into rate limits?
Best,
Erik
Hi @ErikBorra -- thanks for taking a look.
No -- no rate limit issues. The collections running are all pretty small and low-volume.
Hi @danielcarter
About the timeline: do you mean you verified that the user used the hashtag you were querying in their recent timeline, but the tweet did not end up in your bin? And how did you compare the two datasets: with a spreadsheet export, or with a MySQL query?
Is your TCAT installation fully up to date, database-wise? Did you try running php upgrade.php in the common/ directory?
To compare the datasets, I used spreadsheet exports. I've since looked into this more, pulling tweets for some users from the REST API to compare with what the streaming API is giving. I still need to finish looking at that data, but there are some pretty large differences.
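For anyone repeating this comparison, here is a minimal sketch of one way to do it. It assumes the REST timeline has already been fetched as a list of dicts with `id_str` and `text` keys (as in Twitter's v1.1 tweet objects) and that the IDs in the bin have been read from the export. The function name `missing_from_bin` is hypothetical. A user's timeline also contains tweets that never matched the query, so the sketch filters on the hashtag before diffing.

```python
import re

# Matches hashtags such as "#tcat"; trailing punctuation is not included.
HASHTAG = re.compile(r"#\w+")

def missing_from_bin(timeline, bin_ids, tag):
    """Return IDs of timeline tweets that use `tag` but are absent from the bin.

    timeline: iterable of dicts with "id_str" and "text" keys (assumed shape).
    bin_ids:  iterable of tweet IDs (as strings) found in the bin export.
    tag:      hashtag including the leading "#", compared case-insensitively.
    """
    tag = tag.lower()
    expected = {
        tweet["id_str"]
        for tweet in timeline
        if tag in HASHTAG.findall(tweet["text"].lower())
    }
    return expected - set(bin_ids)
```

If the timeline was fetched with truncated tweet text, a hashtag near the end of a long tweet may not appear in `text`, which would cause this check to under-report what the bin should contain.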
I do need to run the upgrade script, but I've been hesitant to stop the collections I have going. Is there any chance that would be causing the problem?
I'm closing this issue now, as there have been recent improvements and fixes in TCAT which may have solved it. You should rerun the experiment to see whether there is still a big difference.
I have a question about some strange results I'm seeing. I have two collections based on hashtags that were started at the same time, around 8 months ago. Sometimes the hashtags are used together, and I checked today how many of hashtag A were in the set for hashtag B and vice versa. I assumed this should give the same number, if I got all of each set -- but the numbers are off by about 2,000. Additionally, I exported the recent tweets for a single user and checked against their timeline, and the dataset is missing quite a few.
Any ideas what could be going on? I'm running on an EC2 server, so I don't think the server has been down. The only thing I can think of is that a lot of the people using these hashtags tweet a lot and tweet pretty similar content, so maybe Twitter is throwing some out before it goes over the API?
Any help would be really appreciated.
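The cross-check described above can be sketched roughly as follows. This is only an illustration: it assumes each bin has been exported to CSV with `id` and `text` columns (adjust the column names to match your export), and the helper names are hypothetical. Rather than comparing only the two counts, it lists which tweet IDs appear on one side and not the other.

```python
import csv
import io
import re

HASHTAG = re.compile(r"#\w+")

def load_tweets(csv_file, id_col="id", text_col="text"):
    """Read an export into {tweet_id: lowercased text}. Column names are assumed."""
    return {row[id_col]: row[text_col].lower() for row in csv.DictReader(csv_file)}

def ids_with_tag(tweets, tag):
    """IDs of tweets whose text contains the hashtag (case-insensitive)."""
    tag = tag.lower()
    return {tid for tid, text in tweets.items() if tag in HASHTAG.findall(text)}

def cross_check(bin_a, bin_b, tag_a, tag_b):
    """Compare tweets carrying both hashtags as seen from each bin.

    Returns (missing_from_b, missing_from_a): IDs that one bin captured
    with both hashtags but the other bin did not.
    """
    a_with_b = ids_with_tag(bin_a, tag_b)
    b_with_a = ids_with_tag(bin_b, tag_a)
    return a_with_b - b_with_a, b_with_a - a_with_b

# Tiny in-memory example standing in for two real export files.
example_a = io.StringIO("id,text\n1,x #a #b\n2,#a only\n3,#A and #B again\n")
example_b = io.StringIO("id,text\n1,x #a #b\n4,#b only\n")
missing_from_b, missing_from_a = cross_check(
    load_tweets(example_a), load_tweets(example_b), "#a", "#b"
)
print("in A but not B:", sorted(missing_from_b))
print("in B but not A:", sorted(missing_from_a))
```

With real exports, replace the `io.StringIO` objects with files opened via `open(path, newline="", encoding="utf-8")`. Having the actual missing IDs makes it easier to look for a pattern, such as a date range or a particular set of users.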