Closed feifeibear closed 6 months ago
sync version (AsyncLongContextAttention with self._async_op = False): 23.500 iter/sec
async without stream (AsyncLongContextAttention with self._async_op = True, stream is None): 23.194 iter/sec
async with stream (AsyncLongContextAttention with self._async_op = True, stream is not None): 23.223 iter/sec
LongContextAttention: 26.255 iter/sec
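
To compare these figures, each AsyncLongContextAttention variant can be expressed as a fraction of the LongContextAttention throughput. This is a minimal sketch that uses only the numbers reported above. The dictionary keys and the helper name `relative_throughput` are illustrative labels, not identifiers from the repository.

```python
# Throughput figures (iter/sec) copied from the measurements above.
# The keys are illustrative labels, not names used in the code base.
throughput = {
    "sync": 23.500,
    "async_no_stream": 23.194,
    "async_with_stream": 23.223,
    "LongContextAttention": 26.255,
}


def relative_throughput(results, baseline):
    """Return each entry's throughput as a fraction of the baseline entry."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


ratios = relative_throughput(throughput, "LongContextAttention")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%} of LongContextAttention")
```

On these numbers all three AsyncLongContextAttention settings land roughly 10 to 12 percent below LongContextAttention, and the two async settings are within about 1.3 percent of the sync one.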