A central, open resource for data and tools related to chain-of-thought reasoning in large language models. Developed by the Samwald research group: https://samwald.info/
Adjusted the evaluation function to handle outputs of Cohere's nightly model, along with general improvements to the evaluation function.
The function now runs simple checks first and only then falls back to regex comparisons.
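
The idea of running simple checks before regex comparisons can be illustrated with a minimal sketch. This is not the repository's actual evaluation function: the name `is_correct`, its parameters, and the specific checks are hypothetical and chosen only for illustration.

```python
import re


def is_correct(prediction: str, gold: str) -> bool:
    """Hypothetical helper: compare a model answer with the gold answer."""
    # Normalize both strings once so every later check sees the same form.
    pred = prediction.strip().lower()
    target = gold.strip().lower()

    # Simple check 1: an empty prediction or gold answer cannot be matched.
    if not pred or not target:
        return False

    # Simple check 2: exact match after normalization.
    if pred == target:
        return True

    # Simple check 3: match after dropping trailing sentence punctuation.
    if pred.rstrip(".!?") == target:
        return True

    # Fallback: regex search for the gold answer as a standalone token,
    # so "b" matches in "the answer is b." but not inside "abc".
    pattern = r"(?<!\w)" + re.escape(target) + r"(?!\w)"
    return re.search(pattern, pred) is not None
```

Ordering the checks this way means most well-formed outputs return before any pattern is compiled, and the regex only has to handle answers embedded in longer text.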