nvtransfer / RULER

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Why do you need to separate the last batch of the output #21

Closed vkaul11 closed 3 months ago

vkaul11 commented 4 months ago

https://github.com/hsiehjackson/RULER/blob/main/scripts/pred/call_api.py#L270 Why do you need to separate the last batch of output here and why do you need the threads above?

hsiehjackson commented 4 months ago

why do you need the threads above?

The threads let us send multiple requests to the server in parallel. On the server side, several requests may be handled at the same time, depending on your framework.
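As a rough illustration of the idea (not the actual code in `call_api.py`), here is a minimal sketch of sending several requests in parallel with threads. `fake_request` is a hypothetical stand-in for the real call to the server, and each thread writes its result into a preallocated slot so the output order matches the input order.

```python
import threading
import time


def fake_request(idx, prompt, results):
    # Hypothetical stand-in for a network call to the model server.
    time.sleep(0.05)
    results[idx] = f"response to {prompt}"


prompts = ["a", "b", "c", "d"]
results = [None] * len(prompts)  # one slot per request, filled by index

threads = []
for idx, prompt in enumerate(prompts):
    t = threading.Thread(target=fake_request, args=(idx, prompt, results))
    t.start()  # all requests are now in flight at the same time
    threads.append(t)

for t in threads:
    t.join()  # wait until every request has returned
```

Because the threads spend most of their time waiting on the server, the waits overlap instead of adding up. Whether the server actually processes the requests concurrently depends on the serving framework.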

why do you need to separate the last batch of output here?

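Some samples may still be left over after the loop, because we only join our threads when this condition is met. The last batch is handled separately so that those remaining samples are also collected.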
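To make the leftover case concrete, here is a minimal sketch under assumed names (`run_in_batches`, `num_threads`, `collect_leftover` are all hypothetical and not taken from the repository). Threads are joined only once `num_threads` of them have been started, so when the number of samples is not a multiple of `num_threads`, the final few threads are never joined inside the loop and need one more pass afterwards.

```python
import threading


def run_in_batches(items, num_threads, worker, collect_leftover=True):
    outputs = [None] * len(items)  # per-sample result slots
    written = []                   # stands in for lines written to the output file
    threads = []
    start = 0                      # index of the first sample not yet written

    def task(i):
        outputs[i] = worker(items[i])

    for idx in range(len(items)):
        t = threading.Thread(target=task, args=(idx,))
        t.start()
        threads.append(t)
        # Join only when a full batch of threads is running.
        if len(threads) == num_threads:
            for t in threads:
                t.join()
            threads = []
            written.extend(outputs[start:idx + 1])
            start = idx + 1

    # Always wait for the remaining threads so none are left running.
    for t in threads:
        t.join()
    # Final, partial batch: these samples never hit the condition above.
    if collect_leftover:
        written.extend(outputs[start:])
    return written


squares = run_in_batches(list(range(10)), 4, lambda x: x * x)
missing_tail = run_in_batches(list(range(10)), 4, lambda x: x * x,
                              collect_leftover=False)
```

With 10 samples and 4 threads, the loop writes samples 0 to 7 in two full batches, and the last two samples are only written by the extra step after the loop. Skipping that step (`collect_leftover=False`) silently drops them.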