apply SourceDocIter elsewhere

In #101 (C4 + TREC Health Misinformation 2021), I abstracted many of the annoying bits of writing an iterator over document sources into base classes. This should make adding new large datasets considerably easier, with less boilerplate. I should go back and see which existing document collections could be simplified by using these base classes.
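As a rough illustration of the pattern (this is not the actual SourceDocIter code from #101; every class and method name below is hypothetical), the idea is that a base class owns the iteration and slicing logic, and each dataset only describes how to read one source:

```python
import itertools
from abc import ABC, abstractmethod
from typing import Iterator, List, Sequence


class SourceDocsBase(ABC):
    """Hypothetical base class: subclasses only say how to read one source."""

    def __init__(self, sources: Sequence[str]):
        self.sources = list(sources)

    @abstractmethod
    def docs_in_source(self, source: str) -> Iterator[dict]:
        """Yield every document stored in a single source."""

    def __iter__(self) -> Iterator[dict]:
        # Walk the sources lazily, one after another.
        for source in self.sources:
            yield from self.docs_in_source(source)

    def islice(self, start: int, stop: int) -> List[dict]:
        # Generic slicing, written once here instead of once per dataset.
        return list(itertools.islice(iter(self), start, stop))


class InMemoryDocs(SourceDocsBase):
    """Toy subclass: each 'source' is a dict key rather than a file on disk."""

    def __init__(self, data: dict):
        super().__init__(sorted(data))
        self.data = data

    def docs_in_source(self, source: str) -> Iterator[dict]:
        for i, text in enumerate(self.data[source]):
            yield {"doc_id": f"{source}-{i}", "text": text}


docs = InMemoryDocs({"a": ["x", "y"], "b": ["z"]})
ids = [d["doc_id"] for d in docs]
```

A real dataset would replace `InMemoryDocs` with a subclass that opens and parses its own file format; the shared iteration code stays in the base class.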

I believe these datasets could benefit:

- [ ] gov2
- [ ] msmarco-passage-v2
- [ ] tweets2013-ia
- [ ] clueweb09 & clueweb12
- [ ] Maybe even the standard docstore implementation?
