Open domenicrosati opened 3 months ago
I think another thing that would be very valuable is to develop metrics on the fluency of the method. LM Harness isn't really great at capturing fluency or the impact of a method on long-form generation.
This includes:
ToDo: