Hi, thanks for your very interesting work on context compression!
I'd like to know which model the paper uses to compute self-information.
For the LLaMA family, does the paper use meta-llama/Llama-2-7b? (https://huggingface.co/meta-llama/Llama-2-7b)
Another simple question: does the paper compress only the context/demonstrations/documents (and the instruction, if one exists), while leaving the question/query uncompressed? In other words, is everything in the prompt except the question/query fed into the compressor?
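For concreteness, here is my understanding of the self-information being computed, as a toy sketch. The next-token probabilities below are made up for illustration (the paper would obtain them from the LM itself, e.g. a LLaMA model); only the formula -log2 p(token | context) is the point:

```python
import math

# Toy next-token distribution p(token | context).
# These probabilities are assumed for illustration, not produced by any real model.
probs = {"Paris": 0.6, "London": 0.25, "Berlin": 0.15}

def self_information(token: str) -> float:
    """Self-information in bits: -log2 p(token | context).
    Rarer (more surprising) tokens carry more information."""
    return -math.log2(probs[token])

for tok, p in probs.items():
    print(f"{tok}: p={p:.2f} -> {self_information(tok):.3f} bits")
```

Under this definition, low-probability tokens score higher and would be the ones retained during compression, which is why the choice of scoring model matters.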
Thank you!