dsgt-birdclef / birdclef-2022

Code for the BirdCLEF 2022 competition by the DS@GT team

Add GitHub Actions to run pre-commit hooks and tests #20

Closed acmiyaguchi closed 2 years ago

acmiyaguchi commented 2 years ago

Adding basic CI is useful for ensuring the code quality is up to par.

bran22 commented 2 years ago

Kinda got started with this in PR #21. I ran into an issue, though: the action runner hangs on the second test, tests/test_model_classifier_datasets.py::test_resample_dataset_correct_shape. The job can run for 6 hours and then time out without that test ever completing, even though the same test completes locally for me in under 5 minutes.

Not sure if it has to do with the multiprocessing.Pool, disk space consumption, or something else. I saw this issue (https://bugs.python.org/issue38501), which made me try different Python versions (3.7, 3.8, 3.9), to no avail. I also looked into whether we were running out of disk space, but this page (https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners) states that we get 14GB in a Windows host environment, which should be enough. (The biggest package by far is PyTorch at about 2.4GB, and the ogg files generated by the test don't seem to take up much space at all.)
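One way to confirm or rule out the disk space theory is to log the runner's free space before and after the test that writes the ogg files. This is a minimal sketch using only the standard library; the helper names disk_report and has_free_space are hypothetical and not part of this repo.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def disk_report(path="."):
    """Return (total, used, free) in GiB for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return tuple(round(n / GIB, 2) for n in (usage.total, usage.used, usage.free))


def has_free_space(path=".", min_free_gib=1.0):
    """Return True if at least `min_free_gib` GiB is free on the filesystem at `path`."""
    return shutil.disk_usage(path).free >= min_free_gib * GIB


if __name__ == "__main__":
    total, used, free = disk_report(".")
    print(f"total={total} GiB used={used} GiB free={free} GiB")
```

Printing this from a workflow step or a test fixture would show whether free space drops anywhere near zero during the run. If it doesn't, the hang is more likely in the worker processes than in storage.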

I also tried skipping the offending test, but the run still hung after tests/test_model_classifier_datasets.py::test_resample_dataset_correct_shape was skipped.
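Since the hang persists even with that test skipped, it may help to find out where the process is actually stuck. A rough sketch, assuming the standard-library faulthandler module is acceptable here: it writes the traceback of every thread to a log file if a block runs longer than a deadline. The hang_watchdog name and the log path are hypothetical.

```python
import contextlib
import faulthandler


@contextlib.contextmanager
def hang_watchdog(seconds, log_path):
    """Dump all thread tracebacks to `log_path` if the block runs longer than `seconds`."""
    # The file must stay open for as long as the watchdog is armed.
    with open(log_path, "w") as log_file:
        faulthandler.dump_traceback_later(seconds, exit=False, file=log_file)
        try:
            yield
        finally:
            # Disarm the watchdog once the block finishes (or raises).
            faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    # Example: allow five minutes, matching the local runtime reported above.
    with hang_watchdog(300, "hang_traceback.log"):
        pass  # call the code under test here
```

Note that this only reports threads in the process that arms it. Workers started by multiprocessing.Pool are separate processes, so a dump showing the parent blocked while waiting on the pool would point at the workers rather than the test body itself.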
