NEXT is a machine learning system that runs in the cloud and makes it easy to develop, evaluate, and apply active learning in the real-world. Ask better questions. Get better results. Faster. Automated.
One proposal: We have integration tests which consist of only JSON files for dictionaries to send to NEXT (say), that verify that all the plumbing is sound. From these, the initExp JSON can double as a thing used in the example launching scripts. We don't have to write all the e.g. processAnswer dictionaries separately--we could have the JSON for these be a template of some sort and have the test running code know how to interpret them sensibly.
We could then separately have functionality tests that can be more complicated and attempt to supply correct answers so that the resulting embeddings/rankings/whatever should look approximately correct.
We can also add load tests as desired, but it is useful to separate all these things (load tests, functionality tests, and integration tests). When developing, we want mostly to make sure quickly that we have not broken the plumbing. Before launching, we may want to take more time to verify that load will not break things and that the algorithms actually do the right thing.
One proposal: We have integration tests which consist of only JSON files for dictionaries to send to NEXT (say), that verify that all the plumbing is sound. From these, the initExp JSON can double as a thing used in the example launching scripts. We don't have to write all the e.g. processAnswer dictionaries separately--we could have the JSON for these be a template of some sort and have the test running code know how to interpret them sensibly.
We could then separately have functionality tests that can be more complicated and attempt to supply correct answers so that the resulting embeddings/rankings/whatever should look approximately correct.
We can also add load tests as desired, but it is useful to separate all these things (load tests, functionality tests, and integration tests). When developing, we want mostly to make sure quickly that we have not broken the plumbing. Before launching, we may want to take more time to verify that load will not break things and that the algorithms actually do the right thing.