Closed ZipECHO closed 7 months ago
We will release the nodes with the mode tag input after the generation of the current query. We assume that the benefit of the current query's prompt to the next query is less than the gradually increasing time cost it incurs.
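To make the release step concrete, here is a minimal sketch of a trie whose nodes carry one frequency counter per mode tag, so that everything tagged input can be dropped once generation finishes. The names ToyTrie, Node, put, release and size are hypothetical and chosen only for illustration; this is not the library's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node layout: one frequency counter per mode tag.
    freqs: dict = field(default_factory=dict)     # e.g. {"input": 1.0, "output": 2.0}
    children: dict = field(default_factory=dict)  # token id -> Node


class ToyTrie:
    def __init__(self):
        self.root = Node()

    def put(self, tokens, mode):
        """Insert a token sequence and bump the counter for `mode` on each node."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, Node())
            node.freqs[mode] = node.freqs.get(mode, 0.0) + 1.0

    def release(self, mode, node=None):
        """Drop the `mode` counter everywhere and prune nodes left with no counters."""
        node = self.root if node is None else node
        for tok in list(node.children):
            child = node.children[tok]
            self.release(mode, child)
            child.freqs.pop(mode, None)
            if not child.freqs and not child.children:
                del node.children[tok]

    def size(self, node=None):
        """Count all nodes below `node` (the root itself is not counted)."""
        node = self.root if node is None else node
        return sum(1 + self.size(c) for c in node.children.values())


trie = ToyTrie()
trie.put([1, 2, 3], mode="output")    # tokens from an earlier response
trie.put([1, 2, 4, 5], mode="input")  # tokens from the current prompt
size_before = trie.size()
trie.release("input")                 # after generation: prompt-only nodes are pruned
size_after = trie.size()
```

Nodes shared with an earlier response (tokens 1 and 2 here) survive because they still hold an output counter; only the prompt-only branch is removed.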
Thank you for your reply. Do you mean that you add the nodes of the current query's prompt before inference in add, and then remove these nodes in remove1 or remove2 after the inference is done? I am not sure which one corresponds to the release process. Besides, could you explain the function of stream_put in lookahead_cache? Thanks!
Q1: Yes.
Q2: We remove the nodes of prompts here and here.
Q3: stream_put is used for putting generated tokens into lookahead_cache as soon as possible rather than at the final step (i.e., the put function). We use a buffer in stream_put to accumulate tokens to the length of decoding_length to avoid breaking a branch. stream_put is better than put when a response contains repeated token pieces.
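The buffering idea can be sketched as follows. This is an illustration under stated assumptions, not the real stream_put: the class StreamBuffer, the final flag, the list named branches that stands in for the cache, and the slide-by-one-token policy are all hypothetical. Only the idea of accumulating tokens up to decoding_length before inserting a branch comes from the answer above.

```python
class StreamBuffer:
    """Hypothetical stand-in for the buffering idea behind stream_put."""

    def __init__(self, decoding_length):
        self.decoding_length = decoding_length
        self.buffer = []    # generated tokens not yet fully inserted
        self.branches = []  # stands in for the cache: one entry per inserted branch

    def stream_put(self, new_tokens, final=False):
        self.buffer.extend(new_tokens)
        n = self.decoding_length
        # Insert only full-length windows so that no branch is cut short.
        while len(self.buffer) >= n:
            self.branches.append(tuple(self.buffer[:n]))
            del self.buffer[0]  # slide by one token (an assumption of this sketch)
        # At the end of the response, flush whatever is left as a shorter branch.
        if final and self.buffer:
            self.branches.append(tuple(self.buffer))
            self.buffer.clear()


buf = StreamBuffer(decoding_length=4)
buf.stream_put([1, 2])               # too short: nothing is inserted yet
after_first = list(buf.branches)
buf.stream_put([3, 4, 5])            # two full windows become available
after_second = list(buf.branches)
buf.stream_put([], final=True)       # flush the remaining tail
after_final = list(buf.branches)
```

Because branches are inserted while the response is still being generated, a later repetition of the same token piece within that response can already be matched, which is the case where streaming helps over a single put at the end.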
Thank you very much~, I understand this part clearly now.
Hi guys, I have another question about Trie tree maintenance.
Why are there two kinds of trees, _update_trees and _update_input_trees, and what is the functional difference between them?
I guess you add prompts into _update_input_trees and release them when the inference for a prompt has finished. The inference results are added into _update_trees, and this set is squeezed and released after its length exceeds 1024 here. Do I understand correctly?
Besides, will these release actions affect mem (remove nodes or update freqs), or do they just affect _update_trees and _update_input_trees?
Thank you very much!
Feel free to ask any questions.
_update_trees is used to squeeze overfat trees, while _update_input_trees is used to reset the frequency of input (prompt) tokens. _update_trees . We defer squeezing until the size reaches 1024 for better performance, since a tree that is updated several times while the size is under 1024 is then squeezed and counted only once. mem . Thanks~
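The batching argument can be illustrated with a small sketch: trees that were updated are only recorded in a pending set, and the squeeze runs once per tree when that set reaches a threshold. Everything here is hypothetical (ToyCache, pending, max_branches, squeeze_calls, and the flat dict used as a tree); the dict named mem is an assumed layout for illustration, and only the threshold of 1024 comes from the thread.

```python
class ToyCache:
    """Hypothetical sketch of deferring the squeeze until enough trees are pending."""

    def __init__(self, squeeze_threshold=1024, max_branches=2):
        self.mem = {}         # root token -> {branch tuple: frequency} (assumed layout)
        self.pending = set()  # roots updated since the last squeeze pass
        self.squeeze_threshold = squeeze_threshold
        self.max_branches = max_branches
        self.squeeze_calls = 0

    def put(self, tokens):
        root, branch = tokens[0], tuple(tokens[1:])
        tree = self.mem.setdefault(root, {})
        tree[branch] = tree.get(branch, 0) + 1
        self.pending.add(root)  # a set, so repeated updates count once
        if len(self.pending) >= self.squeeze_threshold:
            self._squeeze_pending()

    def _squeeze_pending(self):
        for root in self.pending:
            tree = self.mem[root]
            self.squeeze_calls += 1
            # Keep only the most frequent branches of this tree.
            keep = sorted(tree, key=tree.get, reverse=True)[: self.max_branches]
            self.mem[root] = {b: tree[b] for b in keep}
        self.pending.clear()


cache = ToyCache(squeeze_threshold=2, max_branches=1)
cache.put([7, 1])
cache.put([7, 1])
cache.put([7, 2])          # tree 7 has now been updated three times
calls_before = cache.squeeze_calls
cache.put([8, 3])          # second pending tree reaches the threshold
calls_after = cache.squeeze_calls
```

Tree 7 is updated three times but squeezed only once, which is the saving the answer describes.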
Hi, I have noticed that there are also two modes (input and output) in the Tree class. Could you please explain why these two modes need to be set, and what operations they correspond to on the Trie tree?