DocNow / twarc-csv

A plugin for twarc2 for converting tweet JSON into DataFrames and exporting to CSV.
MIT License

DataFrameConverter to single tweet #38

Closed: nanne97 closed this issue 2 years ago

nanne97 commented 2 years ago

I have a large set of tweets, and I would like to wrangle and write them to file as I go. DataFrameConverter would be ideal for this, but if I try to pass a tweet to it, I get an error message: TypeError: process() missing 1 required positional argument: 'objects'. I followed these instructions:

from twarc_csv import DataFrameConverter

json_objects = [...]

df = DataFrameConverter.process(json_objects)

I passed the converter either a single tweet or a page of results scraped as described in the examples.

What am I doing wrong? Or can I not use this the way I would like to?

igorbrigadir commented 2 years ago

Yeah, it should work if you want to convert one tweet at a time, but that gives you one pandas DataFrame per tweet, so it's more efficient to convert a chunk of a few hundred or even a few thousand tweets at a time.
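A rough sketch of the chunked approach, in case it helps. The names chunked and write_in_chunks and the chunk size of 500 are made up for illustration and are not part of twarc-csv. It also assumes that every chunk converts to a DataFrame with the same columns, so that appending to one CSV file stays consistent:

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def write_in_chunks(tweets, path, size=500):
    """Convert `tweets` chunk by chunk and append each chunk to one CSV file."""
    # Imported here so that chunked() can be used without twarc-csv installed.
    from twarc_csv import DataFrameConverter

    converter = DataFrameConverter()
    for index, chunk in enumerate(chunked(tweets, size)):
        df = converter.process(chunk)
        first = index == 0
        # Assumption: every chunk yields the same columns in the same order.
        # The first chunk creates the file with a header; later chunks append.
        df.to_csv(path, mode="w" if first else "a", header=first, index=False)
```

Because chunked only pulls items as needed, tweets can be a generator that reads from a file or an API, so the full set never has to sit in memory at once.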

It's not very well documented yet, but something like this should work:

from twarc_csv import DataFrameConverter

converter = DataFrameConverter()

# if tweet is a single tweet object:
json_objects = [tweet]

df = converter.process(json_objects)
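
The TypeError in the original report most likely came from calling process on the class itself rather than on an instance. The sketch below reproduces that behaviour with a hypothetical stand-in class called Converter; it is not the real DataFrameConverter and only mirrors the method signature implied by the error message:

```python
class Converter:
    """Hypothetical stand-in for DataFrameConverter; not the real class."""

    def process(self, objects):
        # Return the items unchanged; the real converter would build a DataFrame.
        return list(objects)


tweet = {"id": "1", "text": "hello"}

# Calling the method on the class binds the list to `self`,
# so `objects` is left unfilled and Python raises TypeError.
try:
    Converter.process([tweet])
except TypeError as err:
    error_message = str(err)

# Calling it on an instance fills `self` automatically.
converter = Converter()
result = converter.process([tweet])
```

So two things matter here: create the converter first, and pass it a list, even when there is only one tweet.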

nanne97 commented 2 years ago

Ok, thank you so much! I didn't realise they needed to be in a list!