mdocekal closed this 6 months ago
I'm about to write an issue on their GitHub.
We implemented an alternate version of truncation (truncating the instruction instead of the few-shot samples). This doesn't seem to help on the propaganda datasets (7 of 13 tasks got worse results for CSMPT-100k), so we are keeping the original truncation for now.
However, this might still be an issue for instruction-tuned models. Only if it is would we consider submitting this as a pull request.
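For anyone following along, here is a rough sketch of the difference between the two truncation strategies discussed above. It is not our actual code: the function names, the token-list representation, and the choice to drop whole few-shot samples from the front are all assumptions made for illustration.

```python
from typing import List, Sequence


def truncate_few_shot(instruction: Sequence[int],
                      shots: Sequence[Sequence[int]],
                      query: Sequence[int],
                      max_len: int) -> List[int]:
    """Hypothetical 'original' strategy: keep the instruction intact and
    drop the earliest few-shot samples until the prompt fits."""
    kept = [list(s) for s in shots]

    def total() -> int:
        return len(instruction) + sum(len(s) for s in kept) + len(query)

    # Remove whole samples from the front while the prompt is too long.
    while kept and total() > max_len:
        kept.pop(0)
    # Note: may still exceed max_len if instruction + query alone are too long.
    return list(instruction) + [t for s in kept for t in s] + list(query)


def truncate_instruction(instruction: Sequence[int],
                         shots: Sequence[Sequence[int]],
                         query: Sequence[int],
                         max_len: int) -> List[int]:
    """Hypothetical 'alternate' strategy: keep every few-shot sample and
    cut tokens from the end of the instruction instead."""
    rest = sum(len(s) for s in shots) + len(query)
    # Token budget left for the instruction (never negative).
    budget = max(max_len - rest, 0)
    # Note: may still exceed max_len if shots + query alone are too long.
    return list(instruction[:budget]) + [t for s in shots for t in s] + list(query)


if __name__ == "__main__":
    instruction = [1, 2, 3, 4]
    shots = [[10, 11], [20, 21]]
    query = [30]
    print(truncate_few_shot(instruction, shots, query, max_len=6))
    print(truncate_instruction(instruction, shots, query, max_len=6))
```

With a tight budget the first variant can end up with no few-shot samples at all, while the second keeps all of them but loses most of the instruction, which is why the effect could differ for instruction-tuned models.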