GateNLP / python-gatenlp

Python text processing, pattern matching, and NLP framework
https://gatenlp.github.io/python-gatenlp/
Apache License 2.0

Implement Twitter document source #96

Open johann-petrak opened 3 years ago

johann-petrak commented 3 years ago

This has been partly done for old-format files. We still need to figure out how this changes for the new-style result JSON.

See also https://github.com/twitterdev/Twitter-API-v2-sample-code

greenwoodma commented 3 years ago

If you are talking about the new v2 API output, the JSON is not really conducive to loading as single tweets: the JSON object is quite complex, with bits of the information spread across multiple places (i.e. not all nicely nested to give one object per tweet).

I've implemented code that converts the v2 output back to v1.1 as part of this private repo, https://github.com/GateNLP/Twitter-API-v2, so that you can pull from the new API but still use all the existing tools.
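
To illustrate what "spread across multiple places" means in practice, here is a minimal sketch. It is not the code from the private repo above, and it does not attempt a full v1.1 conversion. It assumes a v2 response requested with `expansions=author_id`, so that tweets sit under `data` and the referenced user objects sit separately under `includes.users`. The function name `iter_flat_tweets`, the `user` key it adds, and the sample payload are all made up for illustration.

```python
from typing import Any, Dict, Iterator


def iter_flat_tweets(response: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one self-contained dict per tweet from a v2-style response.

    Hypothetical helper: it only re-attaches the author object to each
    tweet; media, referenced tweets, places etc. are not handled.
    """
    # Index the side-loaded user objects by id so each tweet can find its author.
    users = {u["id"]: u for u in response.get("includes", {}).get("users", [])}

    data = response.get("data", [])
    # Single-tweet lookups return one object under "data" rather than a list.
    if isinstance(data, dict):
        data = [data]

    for tweet in data:
        flat = dict(tweet)  # shallow copy, leaves the input untouched
        author = users.get(tweet.get("author_id"))
        if author is not None:
            # Assumed key name, loosely mimicking the nested v1.1 "user" object.
            flat["user"] = author
        yield flat


# Invented sample payload in the v2 layout (not real API output).
sample = {
    "data": [
        {"id": "1", "text": "first tweet", "author_id": "10"},
        {"id": "2", "text": "second tweet", "author_id": "11"},
    ],
    "includes": {
        "users": [
            {"id": "10", "name": "Alice", "username": "alice_example"},
            {"id": "11", "name": "Bob", "username": "bob_example"},
        ]
    },
}

flat_tweets = list(iter_flat_tweets(sample))
```

Each element of `flat_tweets` then carries its own author object, which is closer to what a one-document-per-tweet source would want to consume. A real converter would have to repeat this kind of join for every expansion that was requested.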

johann-petrak commented 3 years ago

Thanks - I already had the impression that v2 made this much more complex, but I was not sure whether I was missing something.