On an M1 MacBook Pro CPU, it does not seem to be possible to generate sequences longer than 32 tokens fast enough to keep up with real-time playback, at least with the trebles model.
Alternating token buffers need to be implemented so that while one token sequence is being played, the next one can be generated at the same time.
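
A minimal sketch of the alternating-buffer idea, assuming Python and a background thread. The names `play_with_alternating_buffers`, `generate`, `play`, and `chunk_len` are hypothetical placeholders, not part of the existing code: `generate(n)` stands in for whatever call produces `n` tokens from the model, and `play(tokens)` for whatever blocks while a sequence is being played.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def play_with_alternating_buffers(
    generate: Callable[[int], List[int]],  # hypothetical: returns n new tokens
    play: Callable[[List[int]], None],     # hypothetical: blocks until playback ends
    num_chunks: int,
    chunk_len: int = 32,                   # 32 tokens is the limit noted above
) -> None:
    """Play num_chunks sequences, generating the next one during playback."""
    if num_chunks <= 0:
        return
    # A single worker thread fills the "back" buffer while the main thread
    # plays the "front" buffer.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(generate, chunk_len)
        for i in range(num_chunks):
            # Swap buffers: wait for the sequence generated in the background.
            # This only blocks if generation is slower than playback.
            current = pending.result()
            if i + 1 < num_chunks:
                # Start generating the next sequence before playback begins.
                pending = pool.submit(generate, chunk_len)
            play(current)
```

Whether the two steps truly overlap depends on the model call releasing the Python GIL (as native inference code often does) or on playback spending most of its time waiting on audio I/O. If generating one sequence takes longer than playing one, this sketch will still stall at `pending.result()`, so the 32-token limit would still apply per buffer.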