Open lukemarsden opened 10 months ago
as said above, the chunks should be split on whitespace but still include all of that whitespace, and the api server should stop re-adding a hard-coded space between chunks
while we're in there, we should also strip the newline that currently appears at the start and end of every response
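a rough sketch of that newline cleanup, assuming the response arrives as a stream of string chunks. `trim_response_newlines` is a hypothetical name, not something in the codebase, and this version strips every leading and trailing `\n` rather than exactly one. trailing newlines are held back until more text arrives, so newlines in the middle of a response survive:

```python
from typing import Iterable, Iterator


def trim_response_newlines(chunks: Iterable[str]) -> Iterator[str]:
    """Drop newlines at the very start and very end of a streamed response."""
    started = False  # becomes True once the first non-newline text is seen
    held = ""        # trailing newlines held back until we know more text follows
    for chunk in chunks:
        if not started:
            chunk = chunk.lstrip("\n")
            if not chunk:
                continue  # still only leading newlines, emit nothing
            started = True
        chunk = held + chunk
        body = chunk.rstrip("\n")
        held = chunk[len(body):]  # newlines that may turn out to be trailing
        if body:
            yield body
    # anything left in `held` at this point was trailing, so it is dropped
```

note that held-back newlines get prepended to the next chunk, so chunk boundaries can shift slightly even though the text in between is unchanged.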
this will matter more when #54 is implemented, but we need to fix the text chunking code in the runner so that it preserves whitespace properly. this matters for e.g. code generation (it will be easier to see in the browser once we add markdown support and the model emits ``` fences)
cases we care about:
we can probably make the scanner split on the first whitespace character but include it in the chunk, and not skip the next 3 (in the case that the model outputs four space characters in a row), and then stop adding a space back in, because we shouldn't assume the whitespace is a space character
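to make the idea concrete, here is a minimal sketch in python (not the runner's actual code; `split_chunks` is a made-up name). each chunk is a run of non-whitespace plus the single whitespace character that ends it, and any extra whitespace characters become their own chunks, so nothing is skipped and the consumer can just concatenate:

```python
import re
from typing import Iterator

# a run of non-whitespace followed by exactly one whitespace character,
# or a final run of non-whitespace with nothing after it
_CHUNK = re.compile(r"\S*\s|\S+")


def split_chunks(text: str) -> Iterator[str]:
    """Yield chunks such that "".join(chunks) == text."""
    for match in _CHUNK.finditer(text):
        yield match.group(0)


if __name__ == "__main__":
    text = "def f():\n    return 1\n"
    chunks = list(split_chunks(text))
    print(chunks)
    # the receiving side joins with the empty string, not with " "
    assert "".join(chunks) == text
```

with this shape the api server doesn't need to know what kind of whitespace separated two chunks: tabs, newlines and runs of spaces all round-trip unchanged.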
related changes:
slack thread: https://mlops-community.slack.com/archives/C0675EX9V2Q/p1705056847386509