sheraleks closed this issue 3 years ago
I'm interested as well. Btw, can you share the command you use to turn off internet search, @sheraleks?
I am working on adding incremental decoding, which should significantly reduce response time.
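For anyone wondering why incremental decoding helps: the toy sketch below is not ParlAI's implementation, just an illustration of the general idea. The functions `update` and `next_token` are made-up stand-ins for a decoder pass and a token choice. Without caching, every generation step re-processes the whole prefix; with incremental decoding, the cached state is updated with only the newest token.

```python
from typing import List, Tuple


def update(state: int, token: int) -> int:
    """Fold one token into the running state (hypothetical stand-in for a decoder pass)."""
    return (state * 31 + token) % 1000


def next_token(state: int) -> int:
    """Pick the next token from the current state (hypothetical stand-in for sampling)."""
    return state % 7


def decode_full(start: int, steps: int) -> Tuple[List[int], int]:
    """Re-process the entire prefix at every step; work grows quadratically."""
    tokens = [start]
    work = 0
    for _ in range(steps):
        state = 0
        for tok in tokens:  # whole prefix is recomputed each step
            state = update(state, tok)
            work += 1
        tokens.append(next_token(state))
    return tokens, work


def decode_incremental(start: int, steps: int) -> Tuple[List[int], int]:
    """Keep the state from the previous step and feed in only the newest token."""
    tokens = [start]
    state = update(0, start)
    work = 1
    for _ in range(steps):
        tok = next_token(state)
        tokens.append(tok)
        state = update(state, tok)  # only the new token is processed
        work += 1
    return tokens, work


full_tokens, full_work = decode_full(start=3, steps=10)
inc_tokens, inc_work = decode_incremental(start=3, steps=10)
print(full_tokens == inc_tokens, full_work, inc_work)
```

Both functions produce the same token sequence; only the number of `update` calls differs, which is where the expected speed-up comes from.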
@mailong25, yep. Just use these parameters: knowledge_access_method: memory_only, search_server: None
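In case it helps, here is a rough sketch of how those two parameters might be turned into a command line. The model path, the `parlai interactive` invocation, and the underscore-to-hyphen flag spelling are assumptions on my side and not confirmed in this thread, so check them against your ParlAI version.

```python
from typing import Dict, List, Optional

# Assumed zoo path for BB2 400M; adjust to the model file you actually use.
MODEL_FILE = "zoo:blenderbot2/blenderbot2_400M/model"


def to_cli_args(overrides: Dict[str, Optional[str]]) -> List[str]:
    """Convert parameter names to --hyphenated flags (assumed flag spelling)."""
    args: List[str] = []
    for key, value in overrides.items():
        args += ["--" + key.replace("_", "-"), str(value)]
    return args


# The two parameters quoted above.
overrides = {"knowledge_access_method": "memory_only", "search_server": None}

command = ["parlai", "interactive", "--model-file", MODEL_FILE] + to_cli_args(overrides)
print(" ".join(command))
```

The printed line can be pasted into a shell, or the `command` list can be passed to `subprocess.run` if you prefer launching it from Python.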
This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.
Hi! I am stuck with very slow response times from BB2 400M (default parameters), even when internet search is off. Is there any way to speed up response generation? Maybe there are some parameters I could change to speed up the model? Or maybe I should perform knowledge distillation? I’ve seen examples for BB1, but I'm not sure whether it is possible for BB2.