DocNow / twarc

A command line tool (and Python library) for archiving Twitter JSON
https://twarc-project.readthedocs.io
MIT License

Normalizing twarc2 commands #608

Open igorbrigadir opened 2 years ago

igorbrigadir commented 2 years ago

I'm finding that some of the command line commands are starting to diverge and become inconsistent. This issue is for investigating that and coordinating some changes I think we should make, possibly including breaking changes. It would be great to align all the commands so that they share consistent conventions, documentation, and examples.

To begin with:

SamHames commented 2 years ago

I'm in favour of this, especially if it means we stop guessing or inferring what was intended.

This is probably enough of a breaking change that we'd want at least a major version bump, and ideally some graceful error handling so that we can tell people about the new syntax. This would also let us standardise the handling of usernames, so hopefully we don't have to deal with other instances like #568 and #542.

igorbrigadir commented 2 years ago

Maybe all commands that do bulk operations on an input text file, rather than on a single argument, should look like twarc2 bulk followers input.txt output.txt? Just another idea.

SamHames commented 2 years ago

I like the idea of a bulk subcommand. I can definitely see how that would contribute to consistency: anything that takes an input file is a bulk command, and everything that takes an argument on the command line is at the top level.

That also gives us a nicer migration path because we can decouple creating new (more consistent) commands under bulk from removing the old commands, allowing for some deprecation warnings or similar.
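
To make the split concrete, here is a rough sketch of how the layout could look. It is only an illustration, not twarc2's actual code: it uses the standard library's argparse to stay dependency-free, the followers-from-file name is a hypothetical stand-in for an old-style command, and the describe helper exists only to show how each invocation would be routed.

```python
import argparse
import warnings


def build_parser():
    parser = argparse.ArgumentParser(prog="twarc2")
    commands = parser.add_subparsers(dest="command", required=True)

    # Top level: the target is passed directly on the command line.
    followers = commands.add_parser("followers")
    followers.add_argument("user")
    followers.add_argument("outfile")

    # bulk: every subcommand reads its targets from an input file.
    bulk = commands.add_parser("bulk")
    bulk_commands = bulk.add_subparsers(dest="bulk_command", required=True)
    bulk_followers = bulk_commands.add_parser("followers")
    bulk_followers.add_argument("infile")
    bulk_followers.add_argument("outfile")

    # Hypothetical old-style command, kept only so it can warn and forward.
    legacy = commands.add_parser("followers-from-file")
    legacy.add_argument("infile")
    legacy.add_argument("outfile")

    return parser


def describe(argv):
    """Return (mode, command, source, outfile) for a list of CLI arguments."""
    args = build_parser().parse_args(argv)
    if args.command == "bulk":
        return ("bulk", args.bulk_command, args.infile, args.outfile)
    if args.command == "followers-from-file":
        # Old syntax still works, but points people at the new one.
        warnings.warn(
            "'followers-from-file' is deprecated; use 'twarc2 bulk followers'",
            DeprecationWarning,
        )
        return ("bulk", "followers", args.infile, args.outfile)
    return ("single", args.command, args.user, args.outfile)
```

With this shape, the new bulk commands can ship first, and the old command can keep working with a warning until a later major version removes it.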