AnnaBeers opened 6 years ago
An update on the testing front: I have created a set of test datasets for DeepNeuro modules, with example data and known outcomes, hosted on a private server with GPUs. These tests are currently run on an ad hoc basis; the next step is to automate them.
Ideally, our continuous integration platform would send a command to this server and receive a response once those tests complete, but that might take valuable GPU time away from others on that server. In general, CI for GPU-intensive tasks seems to be a not-yet-fully-solved problem in the package-maintenance world.
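As a rough sketch of what that trigger could look like: a small script the CI job runs, which ssh-es into the GPU server, runs the test suite there, and forwards the exit code back to CI. The hostname, remote path, and test command below are placeholders, not our actual setup.

```python
# Hypothetical CI-side trigger: run the GPU test suite remotely over ssh
# and exit with the remote suite's status, so the CI build passes or
# fails accordingly. Host, path, and command are placeholders.
import subprocess
import sys

GPU_HOST = "ci-user@gpu-server.example.org"  # placeholder server
REMOTE_CMD = "cd ~/deepneuro_tests && python -m pytest -x --tb=short"

def run_remote_tests():
    # ssh exits with the remote command's return code, so we can
    # forward it directly to the CI runner.
    result = subprocess.run(["ssh", GPU_HOST, REMOTE_CMD])
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_remote_tests())
```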
For others looking to take up this task, I suggest looking into pytest and starting with some of the more basic issues listed in the initial comment.
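For a sense of the pattern, here is a minimal pytest sketch of the known-outcome style of test described above. The file paths and the `run_module` stand-in are hypothetical; the real entry point would be whichever DeepNeuro module is under test.

```python
# Minimal known-outcome test sketch. The paths and run_module are
# placeholders for a real DeepNeuro module and its test data.
import numpy as np

EXAMPLE_INPUT = "tests/data/example_volume.npy"   # hypothetical path
KNOWN_OUTCOME = "tests/data/example_outcome.npy"  # hypothetical path

def run_module(input_array):
    # Stand-in for the DeepNeuro module under test.
    return input_array.astype(np.float32)

def test_module_matches_known_outcome():
    input_array = np.load(EXAMPLE_INPUT)
    expected = np.load(KNOWN_OUTCOME)
    output = run_module(input_array)
    # Tolerate small numeric drift rather than requiring exact equality.
    np.testing.assert_allclose(output, expected, rtol=1e-5)
```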
Frequently, when I change something in DeepNeuro, I break something somewhere else in the code. Fortunately, we can run tests to catch this! Below is a list of basic tests we should add, with a pytest sketch of the first two items after the list. Some things will be difficult to test, such as computation-heavy preprocessing steps, GPU-dependent models, or external libraries, but these should at least make sure things work on some basic level.
- Loading one file, and multiple files (2-3), into a data collection.
- Saving those files out to an hdf5 file.
- Preprocessing data with a dummy preprocessor (this can maybe wait, since preprocessors are not heavily used outside of inference yet).
- Augmenting data with a dummy augmentation (1x, 2x), and with multiple dummy augmentations.
- Training a minimal model locally (may run into keras/tensorflow installation issues on Travis).
- Loading a minimal model locally.
- Running 1 and 2-3 inferences on a minimal model locally, streaming from both hdf5 files and file names.
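As promised above, here is a sketch of the first two items. It doesn't call DeepNeuro's data collection classes directly, since their exact interface may change; it only shows the hdf5 round-trip check, using pytest's built-in `tmp_path` fixture.

```python
# Round-trip sketch for "load 2-3 files" + "save to hdf5": write two
# small dummy cases to an hdf5 file and verify they read back intact.
import h5py
import numpy as np

def test_hdf5_round_trip(tmp_path):
    # Two small dummy "cases", standing in for 2-3 loaded files.
    cases = {
        "case_1": np.random.rand(4, 4, 4).astype(np.float32),
        "case_2": np.random.rand(4, 4, 4).astype(np.float32),
    }

    out_file = tmp_path / "collection.h5"
    with h5py.File(out_file, "w") as f:
        for name, array in cases.items():
            f.create_dataset(name, data=array)

    with h5py.File(out_file, "r") as f:
        for name, array in cases.items():
            np.testing.assert_array_equal(f[name][...], array)
```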
I'll assign myself for now -- post here if you have more test ideas.