Closed SaschaStenger closed 5 years ago
@SaschaStenger Thanks for investigating the issue and letting us know about the improvements. If you have already made the change to speed up the download, you can create a pull request; otherwise I will make the changes accordingly.
@mdepak I have updated the tweet collection process so that it now requests up to 100 tweets per call. I also implemented the fix for the C long issue on Windows. Both changes are in the cloned repo.
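For anyone reading along, here is a minimal sketch of the batching idea, not the code from the repo. It assumes the Twitter v1.1 `statuses/lookup` endpoint, which accepts up to 100 comma-separated IDs per request. The names `chunk_ids`, `lookup_params`, and `LOOKUP_BATCH_SIZE` are hypothetical and only for illustration, and the actual HTTP request is left out.

```python
from typing import Dict, Iterator, List, Sequence

# Assumption: statuses/lookup accepts at most 100 tweet IDs per request.
LOOKUP_BATCH_SIZE = 100


def chunk_ids(tweet_ids: Sequence[int],
              size: int = LOOKUP_BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive batches of at most `size` tweet IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(tweet_ids), size):
        yield list(tweet_ids[start:start + size])


def lookup_params(batch: Sequence[int]) -> Dict[str, str]:
    """Build the query parameters for one batched lookup request.

    IDs are joined as decimal strings, so they are never squeezed
    into a fixed-width integer type.
    """
    return {"id": ",".join(str(tweet_id) for tweet_id in batch)}


if __name__ == "__main__":
    ids = list(range(1, 251))  # 250 placeholder IDs, not real tweets
    for batch in chunk_ids(ids):
        params = lookup_params(batch)
        print(len(batch), params["id"][:20])
```

With batches like this, 250 tweets need three requests instead of 250, which is where the speed-up comes from when the rate limit counts requests rather than tweets.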
@SaschaStenger Thank you for suggesting an efficient Twitter API. Fixed the issue in https://github.com/KaiDMML/FakeNewsNet/commit/3c1ae3c41b32845243db08cac4ec9a9f7c7a43b3
@mdepak Thank you for fixing the issue. If I had known how pull requests work, I would have liked to handle it that way (I'm new to contributing to projects like this). Also, your change of moving the chunking into the utils package seems a more elegant solution.
I have been running the code nonstop for about two weeks now, and I get the feeling that it will take even longer to get the dataset ready.
When I posted a question about data collection limits on the Twitter developer forum, it was pointed out that the code uses a suboptimal lookup for tweet gathering (forum post). I wanted to bring this to your attention so that the collection process can be sped up for everyone using this dataset.