Open MarinaWyss opened 1 year ago
Hi, I have a very basic question:

I would like to download Vicuna from HuggingFace (e.g. this model) and use it to ask arbitrary questions, like "What topics are discussed in this text?" or "Summarize what happened in this text." If I try this in the GUI I get reasonable answers to all of my questions. I have also tried downloading the weights via FastChat, and that works as expected, too.

But when I try the same thing using the HuggingFace versions, the quality is much worse. Rather than responding to the questions as I would expect, the model typically just echoes the input text followed by some additional generated text, which is not what I'm after.

Here is some example code:

If anyone can point me in the right direction or offer advice, I would be super grateful. I feel like I must be missing something obvious. Thank you!

What do you mean by "using huggingface versions"? Do you mean having the inference done on their side? I use these models by cloning their repo, and they work just fine, which seems to be what you are doing, too.

If I download FastChat and the model weights directly and use the CLI to interact with the model (the same one here), the results are vastly superior. Basically, when I run the downloaded weights via the CLI on an EC2 instance, I get good results, but when I use an instance with the exact same specs in a SageMaker notebook with this code, the results are not nearly as good. I'm just curious why that would be.
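One difference between the two setups worth checking (offered here as an assumption, not a confirmed diagnosis): the FastChat CLI wraps each user message in Vicuna's conversation template before calling the model, whereas raw transformers code sends the bare text, which a chat-tuned model tends to continue rather than answer — consistent with the "echoes the input plus extra text" behavior described above. A minimal sketch of building that prompt by hand; the template wording is modeled on FastChat's vicuna_v1.1 style and should be verified against the installed FastChat version:

```python
# Sketch of the single-turn conversation template FastChat applies for
# Vicuna v1.1-style models. The exact system text and separators are an
# assumption based on FastChat's vicuna_v1.1 template.

SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def build_vicuna_prompt(user_message: str) -> str:
    """Wrap one user turn in the Vicuna chat template.

    The trailing 'ASSISTANT:' cues the model to answer the question
    rather than continue the input text verbatim.
    """
    return f"{SYSTEM} USER: {user_message} ASSISTANT:"

prompt = build_vicuna_prompt("What topics are discussed in this text?")
# It is this wrapped string (not the bare question) that would then be
# tokenized and passed to model.generate in the transformers code.
```

If the SageMaker/transformers code passes the raw question straight to `generate`, trying the wrapped prompt is a cheap way to test whether the quality gap comes from the missing template rather than from the instance itself.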