Support for compression

Yeah, I previously rejected this, but I'm open to ideas on how to approach it and would of course welcome contributions. The tricky bit here is to have something that makes sense for both the "fork" and the "reduce" step.

First, it can make a big difference how you split something up. When summarizing an article or an interview, certain places in the text are better split points than others, e.g. after a paragraph ends, before a new chapter/subsection starts, or after a reply to a question. This depends on the kind of text and, to get the best result, should be customizable. One option would be to allow some kind of function that indicates the positions inside a text where it may be cut.
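To make the "function that indicates positions" idea concrete, here is a rough sketch in Python (not org-ai code, and not a proposed API). The names `BoundaryFn`, `paragraph_boundaries` and `split_at_boundaries` are made up for illustration, and chunk size is measured in characters rather than tokens to keep it simple.

```python
import re
from typing import Callable, List

# Hypothetical type: returns character offsets at which a cut is acceptable.
BoundaryFn = Callable[[str], List[int]]


def paragraph_boundaries(text: str) -> List[int]:
    """Offsets right after each blank line, i.e. where a new paragraph starts."""
    return [m.end() for m in re.finditer(r"\n\s*\n", text)]


def split_at_boundaries(text: str, max_chars: int, boundaries: BoundaryFn) -> List[str]:
    """Greedily cut text into chunks of at most max_chars, preferring the given boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    cuts = sorted(set(boundaries(text)))
    chunks: List[str] = []
    start = 0
    while len(text) - start > max_chars:
        limit = start + max_chars
        # Take the last acceptable boundary that still fits; otherwise hard-cut at the limit.
        fitting = [c for c in cuts if start < c <= limit]
        end = fitting[-1] if fitting else limit
        chunks.append(text[start:end])
        start = end
    if start < len(text):
        chunks.append(text[start:])
    return chunks
```

A different kind of text (chapters, interview Q&A) would only need a different boundary function; the splitting loop stays the same.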

Then, do you want a serial/rolling prompt or parallel prompting? With the former it is possible to embed the reply to the previous prompt ("carry over") in the next one, which can be useful e.g. for summaries.
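The difference between the two modes could look roughly like this. It is only a sketch: `complete` stands in for whatever function sends a prompt to the model and returns the reply, and the `{carry}` / `{chunk}` template placeholders are an assumption, not an existing convention.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical stand-in for a model call: takes a prompt, returns the reply text.
Complete = Callable[[str], str]


def run_rolling(chunks: List[str], complete: Complete, template: str) -> List[str]:
    """Serial mode: each prompt embeds the reply to the previous one ("carry over")."""
    carry = ""
    replies: List[str] = []
    for chunk in chunks:
        carry = complete(template.format(carry=carry, chunk=chunk))
        replies.append(carry)
    return replies


def run_parallel(chunks: List[str], complete: Complete, template: str,
                 workers: int = 4) -> List[str]:
    """Parallel mode: chunks are independent, so there is nothing to carry over."""
    prompts = [template.format(carry="", chunk=chunk) for chunk in chunks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the inputs.
        return list(pool.map(complete, prompts))
```

Rolling mode cannot be parallelized because every prompt depends on the previous reply, so the choice is also a latency trade-off.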

Then there is the final processing step. The simplest solution is to just concatenate the outputs, but for a parallel summarization strategy, for example, you would want all the outputs combined in a last prompt to "summarize the summaries".
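The two reduce variants mentioned here could be sketched as follows. Again this is illustrative only: `complete` is a hypothetical stand-in for the model call, and the default instruction text and the `[1]`, `[2]` numbering of the partial outputs are arbitrary choices.

```python
from typing import Callable, List

# Hypothetical stand-in for a model call: takes a prompt, returns the reply text.
Complete = Callable[[str], str]


def reduce_concat(outputs: List[str], separator: str = "\n\n") -> str:
    """Simplest reduce: join the per-chunk outputs in order."""
    return separator.join(outputs)


def reduce_with_prompt(outputs: List[str], complete: Complete,
                       instruction: str = "Summarize the following summaries:") -> str:
    """Reduce via one last prompt that sees all partial outputs at once."""
    combined = "\n\n".join(f"[{i}] {out}" for i, out in enumerate(outputs, 1))
    return complete(f"{instruction}\n\n{combined}")
```

Note that the second variant can itself exceed the context limit if there are many chunks, in which case the reduce would have to be applied in several rounds.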

So my concern here is the following: providing support and an abstraction for all this is a lot of work (and would add complexity of its own), while at the same time that support might not be general enough to be useful for a particular use case. That would then make you go and just pull out an SDK in your favorite language and script your own... So I would like some ideas on how an implementation would be useful before starting work on this.

rksm / org-ai
