I would like to use the JSON mode for Mistral 7B while doing offline inference using the generate method as below. Is that possible somehow? Just using the prompt doesn't seem to produce JSON output as requested. If this is not possible, is the only solution to use something like Outlines? Would love some details on that, if so.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-v0.3")
prompt = "Please name the biggest and smallest continent in JSON using the following schema: {biggest: <the biggest continent's name>, smallest: <the smallest continent's name>}"
sampling_params = SamplingParams(temperature=0.8, top_p=1.0)  # example temperature value
response = llm.generate(prompt, sampling_params)
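For reference, here is a sketch of what I am trying to achieve with constrained JSON output in offline mode. This assumes a vLLM version that ships `GuidedDecodingParams` for offline guided decoding (the exact import path and keyword may differ across releases); the schema field names come from my prompt, everything else is my own naming:

```python
import json

# JSON Schema for the continent question; the field names mirror the
# prompt above, the rest of the structure is my own choice.
CONTINENT_SCHEMA = {
    "type": "object",
    "properties": {
        "biggest": {"type": "string"},
        "smallest": {"type": "string"},
    },
    "required": ["biggest", "smallest"],
}


def generate_continents() -> dict:
    # Imports are local so the schema above is usable without a GPU.
    # Assumes a vLLM release that provides GuidedDecodingParams; check
    # your version's docs before relying on this.
    from vllm import LLM, SamplingParams
    from vllm.sampling_params import GuidedDecodingParams

    llm = LLM(model="mistralai/Mistral-7B-v0.3")
    prompt = "Name the biggest and smallest continent in JSON."
    params = SamplingParams(
        temperature=0.0,
        max_tokens=128,
        # Constrain decoding so the output must satisfy the schema.
        guided_decoding=GuidedDecodingParams(json=CONTINENT_SCHEMA),
    )
    out = llm.generate([prompt], params)[0]
    return json.loads(out.outputs[0].text)
```

With guided decoding the sampler only emits tokens that keep the output valid against the schema, so no prompt engineering is needed to force JSON.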
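In case it helps frame the Outlines part of the question, this is the kind of standalone usage I have in mind. It assumes the pre-1.0 `outlines` API (`outlines.models.transformers` / `outlines.generate.json`); newer releases have renamed these entry points, so treat this as a sketch:

```python
import json

# Same schema as in my prompt, expressed as a JSON Schema string;
# the field names come from the prompt, the rest is my own naming.
SCHEMA = json.dumps({
    "type": "object",
    "properties": {
        "biggest": {"type": "string"},
        "smallest": {"type": "string"},
    },
    "required": ["biggest", "smallest"],
})


def generate_with_outlines() -> dict:
    # Local import so the schema above is reusable without the model.
    # Assumes the pre-1.0 outlines API; newer versions changed these names.
    import outlines

    model = outlines.models.transformers("mistralai/Mistral-7B-v0.3")
    # Build a generator that only produces text valid under SCHEMA.
    generator = outlines.generate.json(model, SCHEMA)
    return generator("Name the biggest and smallest continent in JSON.")
```

If that is indeed the recommended route, I would appreciate confirmation on whether it composes with vLLM's offline engine or has to run through transformers directly.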