Added a test to check dataset downloads. The tests are skipped by default (they only run if `--run_data_download` is passed on the CLI). This test takes a long time, so I'm sure it is not something we would want to run with any frequency. It might be worth creating separate tests (instead of using `parametrize`) to make it easier to test one specific dataset.
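For reference, an opt-in flag like the one described above is usually wired up in `conftest.py` roughly as follows. This is a minimal sketch of the standard pytest pattern, not necessarily the code in this PR: the flag name `--run_data_download` comes from the description, while the `data_download` marker name is an assumption made for illustration.

```python
# conftest.py
# Sketch only: the marker name "data_download" is hypothetical.
import pytest


def pytest_addoption(parser):
    # Register the opt-in command-line flag; it is off by default.
    parser.addoption(
        "--run_data_download",
        action="store_true",
        default=False,
        help="run the slow dataset download tests",
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "data_download: test downloads a full dataset (slow)"
    )


def pytest_collection_modifyitems(config, items):
    # If the flag was given, leave the collected tests untouched.
    if config.getoption("--run_data_download"):
        return
    # Otherwise attach a skip marker to every test tagged data_download.
    skip_download = pytest.mark.skip(
        reason="need --run_data_download option to run"
    )
    for item in items:
        if "data_download" in item.keywords:
            item.add_marker(skip_download)
```

With this in place, a test decorated with `@pytest.mark.data_download` is reported as skipped on a plain `pytest` run and only executes under `pytest --run_data_download`. If each dataset had its own test function, a single one could additionally be selected with pytest's `-k` expression filter.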
It might also be better to make this NOT a pytest test, but a separate piece of code that gives more options to check very specific datasets and versions and produces more descriptive output. Having something like that in the dataset module would let people more easily query the available datasets and versions.
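A standalone checker along those lines could look roughly like the sketch below. Everything in it is an assumption made for illustration: the registry layout (dataset name, then version, then expected checksum), the function names, and the use of MD5 are all hypothetical and are not taken from the dataset module.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_versions(registry):
    """Map each dataset name to a sorted list of its known versions."""
    return {name: sorted(versions) for name, versions in registry.items()}


def check_dataset(registry, name, version, path):
    """Compare a downloaded file against the registry.

    Returns (ok, message) so callers get a descriptive result
    instead of a bare assertion failure.
    """
    if name not in registry:
        return False, f"{name}: unknown dataset; available: {sorted(registry)}"
    if version not in registry[name]:
        known = sorted(registry[name])
        return False, f"{name}: unknown version {version}; available: {known}"
    expected = registry[name][version]
    actual = file_md5(path)
    if actual != expected:
        return False, (
            f"{name} {version}: checksum mismatch "
            f"(expected {expected}, got {actual})"
        )
    return True, f"{name} {version}: checksum OK"
```

Returning a message rather than raising keeps the helper usable both from a command-line script and from a test, and the message names the dataset, the version, and both checksums when they disagree.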
This PR has high priority: the CI tests for all other PRs are failing because some of the dataset hashes have changed at the remote location. @chrisiacovella, is this PR ready for review?