marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License

Reproducibility Issues When Templating #152

Open ChatBotMatt opened 1 year ago

ChatBotMatt commented 1 year ago

Template output depends in part on the order in which keys are processed: {city1}, {city2} and {city2}, {city1} produce two different texts, even given the same sets of options in the same order. That key order is both undefined and non-seedable, since keys is a set generated by find_all_keys. As a result, it's impossible to get proper reproducibility when templating texts, which is quite inconvenient for unit testing, reproducibility of experiments, etc.

Something simple, like calling sorted on the return value of find_all_keys in Editor.template, would fix this, and it'd be nice if it were at least a parameter that could be set.
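
To illustrate, here is a minimal standalone sketch of the problem and the sorted workaround. It does not use CheckList's actual code: find_keys and fill are hypothetical stand-ins I wrote for this example, and the sampling strategy (drawing distinct values and assigning them in key order) is an assumption made to keep the demo small. The point is only that a fixed seed is not enough when the key order itself can vary.

```python
import random
from string import Formatter


def find_keys(template):
    """Hypothetical stand-in for find_all_keys: returns a *set* of placeholder names."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def fill(template, cities, seed, key_order):
    """Draw one distinct value per key, assigning values in key_order (assumed strategy)."""
    rng = random.Random(seed)
    picks = rng.sample(cities, k=len(key_order))
    return template.format(**dict(zip(key_order, picks)))


template = "I flew from {city1} to {city2}."
cities = ["Paris", "Lima", "Oslo", "Cairo", "Hanoi"]

keys = find_keys(template)  # iteration order of a set of strings is not guaranteed

# Same seed, same options, different key order: the output differs.
a = fill(template, cities, seed=0, key_order=sorted(keys))
b = fill(template, cities, seed=0, key_order=sorted(keys, reverse=True))

# Sorting the keys pins the order, so the same seed reproduces the same text.
c = fill(template, cities, seed=0, key_order=sorted(keys))

print(a)
print(b)
print(a == c)
```

Here the reversed order stands in for whatever order the set happens to yield on a given run. For string keys that order can change between interpreter runs unless PYTHONHASHSEED is fixed, which is why seeding the random module alone does not help.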