marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License

The distribution of each test type? #71

Closed eduOS closed 3 years ago

eduOS commented 3 years ago

Can I use this technique to build a test set to check the performance of a model? If so, how many cases should each test type take in the test sample?

marcotcr commented 3 years ago

You can certainly build a test set, but I prefer to think of individual tests rather than creating a test dataset. In terms of the process, let me point you to the paper. How many cases go in each cell is up to you :)
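For readers who want to see the "individual tests" view in practice, here is a minimal sketch in plain Python. It does not use CheckList's own API: `failure_rates`, `toy_predict`, and the example cases are all hypothetical names and data made up for illustration. The idea it shows is that each test gets its own failure rate, so the tests do not need to hold the same number of cases.

```python
from typing import Callable, Dict, List, Tuple

# A case is (input text, expected label). This layout is an assumption
# for the sketch, not CheckList's internal representation.
Case = Tuple[str, int]


def failure_rates(
    tests: Dict[str, List[Case]],
    predict: Callable[[str], int],
) -> Dict[str, float]:
    """Return the fraction of failing cases for each named test."""
    rates: Dict[str, float] = {}
    for name, cases in tests.items():
        if not cases:
            raise ValueError(f"test {name!r} has no cases")
        failures = sum(1 for text, expected in cases if predict(text) != expected)
        rates[name] = failures / len(cases)
    return rates


def toy_predict(text: str) -> int:
    # Hypothetical stand-in model: predicts positive (1) whenever the
    # text contains the word "good", negative (0) otherwise.
    return 1 if "good" in text.lower() else 0


# Two tests of different sizes; each is scored on its own.
tests = {
    "simple positives": [("This is good.", 1), ("A good film.", 1), ("Good food.", 1)],
    "negation": [("This is not good.", 0), ("The food was good.", 1)],
}

rates = failure_rates(tests, toy_predict)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of cases failed")
```

Because the rates are reported per test instead of being pooled into one accuracy number, adding more cases to one test does not dilute a failure in another.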