gwu-libraries / TweetSets

Service for creating Twitter datasets for research and archiving.
MIT License

T135 create tests #135 #146

Closed dolsysmith closed 3 years ago

dolsysmith commented 3 years ago

Unit tests now include a separate suite for the Spark loader. It checks the column output for both the Elasticsearch index and the CSV extract.

To run them, build the loader container and run python -m unittest inside it. You will see some Spark warnings, which are safe to ignore.
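For readers unfamiliar with this kind of test, here is a minimal sketch of a column check that python -m unittest would discover. It is not the suite from this PR: the column list, the write_csv_extract helper, and the test class are hypothetical, and a plain csv writer stands in for the Spark loader so the sketch runs without a cluster.

```python
import csv
import io
import unittest

# Hypothetical column list; the loader's real columns are not shown in this thread.
EXPECTED_CSV_COLUMNS = ["id", "created_at", "text"]


def write_csv_extract(rows, columns):
    """Hypothetical stand-in for the loader's CSV extract step."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in the column list.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


class TestCsvColumns(unittest.TestCase):
    def test_header_matches_expected_columns(self):
        rows = [
            {"id": "1", "created_at": "2020-01-01", "text": "hello", "extra": "x"}
        ]
        output = write_csv_extract(rows, EXPECTED_CSV_COLUMNS)
        # The first CSV record is the header row.
        header = next(csv.reader(io.StringIO(output)))
        self.assertEqual(header, EXPECTED_CSV_COLUMNS)
```

Saved in a module whose name starts with test, this class is picked up by unittest's default discovery.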

The non-Spark tests should still run from outside the loader container, as expected. I did not update them, on the assumption that we may not use the non-Spark loader much. In a future release, though, we should probably make the code in models.py consistent with the Spark code; the two have diverged because I upgraded the latter to comply with json2csv v. 1.12.1.

dolsysmith commented 3 years ago

Rebased on master (incorporating changes from t128).

dolsysmith commented 3 years ago

Updated to include sorting of test data prior to testing.
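The comment above does not say why the sort was needed. One common reason, assumed here, is that Spark does not guarantee row order unless the data is explicitly ordered, so comparing output to a fixture row by row can fail intermittently. A minimal sketch of the pattern, using plain Python lists in place of Spark output and a hypothetical sort_rows helper:

```python
import unittest


def sort_rows(rows, key="id"):
    """Hypothetical helper: order rows by a stable key before comparing."""
    return sorted(rows, key=lambda row: row[key])


class TestOrderIndependentComparison(unittest.TestCase):
    def test_rows_match_after_sorting(self):
        expected = [{"id": "1", "text": "a"}, {"id": "2", "text": "b"}]
        # Same rows, different order, as an unordered job might return them.
        actual = [{"id": "2", "text": "b"}, {"id": "1", "text": "a"}]
        # A direct list comparison is order-sensitive and would fail.
        self.assertNotEqual(actual, expected)
        # Sorting both sides makes the comparison deterministic.
        self.assertEqual(sort_rows(actual), sort_rows(expected))
```

Sorting both the fixture and the output on the same key keeps the test independent of whatever order either side arrives in.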

lwrubel commented 3 years ago

Looks good! Is there an issue describing the need to update models.py? If not, would you create one?