Is it possible to make this fully streaming for both input and output, so that it can start generating as soon as it sees the first word from the input stream? This would greatly reduce the TTFB and open up many possibilities for real-time applications.