microsoft / BitNet

Official inference framework for 1-bit LLMs
MIT License
11.39k stars 768 forks

Question - Am I missing something? #90

Closed: DataJuggler closed this issue 2 weeks ago

DataJuggler commented 3 weeks ago

I ran your first example, and it knew Mary was in the garden.

Then I tried to ask it a simple question, "Yes or No, do you know how to write C# code?"

Here is the answer (after I cranked up the token count to get more of a response):

Answer: Yes or No, do you know how to write C# code? If you do, then you can write C# code for us. If you don’t, then you can learn to write C# code for us. We will pay you by the hour for your C# code.The first step in the process of creating a new product is to identify the need for the product. This is done by conducting a market research study. The market research study is conducted by a marketing research firm. The marketing research firm is a company that specializes in conducting market research.

Did I ask the question wrong, or is this the wrong type of question?

This project has nearly 10K stars, so I must be the dumb one, but every result I get back is garbage.

Maybe I need to learn more about "What is this thing supposed to be able to do?"

kth8 commented 3 weeks ago

The models available right now are research models, not instruct models. They won't answer your question but instead act more like an autocomplete.

DataJuggler commented 3 weeks ago

Thanks for the clarification. So with future updates it should have more functionality?

kth8 commented 3 weeks ago

Yes, there should be better models in the future as BitNet develops. Try formatting your prompt so the model can autocomplete it, such as "The sky is blue because" instead of "Why is the sky blue?"
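A minimal sketch of that rephrasing in Python. The helper name `as_completion_prompt` and the example sentences are illustrative assumptions, not part of BitNet. The only idea it shows is that a base model continues text, so the prompt should stop exactly where the answer should start.

```python
def as_completion_prompt(examples, stem):
    """Build a prompt that a base (non-instruct) model can continue.

    examples: completed statements that show the model a pattern.
    stem: the unfinished statement the model should complete.
    (Hypothetical helper, for illustration only.)
    """
    # Each example is a full sentence; the final line is left open-ended.
    lines = [ex.strip() for ex in examples]
    lines.append(stem.strip())
    return "\n".join(lines)


prompt = as_completion_prompt(
    examples=[
        "Grass is green because it contains chlorophyll.",
        "Ice floats because it is less dense than liquid water.",
    ],
    stem="The sky is blue because",
)
print(prompt)
```

The resulting string could then be passed as the `-p` argument. A pattern like this may help, but it does not turn a research model into an instruct model.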

edindirangu commented 3 weeks ago

I tried autocomplete in the following way. Again and again, it still returns wrong answers.

Question: python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/Llama3-8B-1.58-100B-tokens-TQ1_0-F16.gguf -p "Kenya is a country in?\nAnswer:" -n 6 -temp 0

Answer: India is a country in Asia

Also, if I change the initial Mary question even slightly, it returns pretty senseless answers. That's strange considering the hype on YouTube about BitNet and 1-bit LLMs. Or are people perhaps pretraining their own custom models? I am still hopeful.
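One thing that might be worth trying with a base model is a few-shot prompt, where a couple of solved examples precede the open question. The sketch below only assembles such a prompt and the matching command line. The example Q/A pairs and the helper names `build_prompt` and `build_command` are assumptions made for illustration; the flags `-m`, `-p`, `-n` and `-temp` are the ones from the command quoted in this comment.

```python
import shlex

# Hypothetical few-shot examples; any short, factual Q/A pairs would do.
FEW_SHOT = [
    ("France is a country in?", "Europe"),
    ("Japan is a country in?", "Asia"),
]


def build_prompt(question):
    # Show the pattern twice, then leave the last "Answer:" open.
    parts = [f"{q}\nAnswer: {a}" for q, a in FEW_SHOT]
    parts.append(f"{question}\nAnswer:")
    return "\n".join(parts)


def build_command(model_path, question, n_tokens=6):
    # Argument list for run_inference.py, using only the flags seen above.
    return [
        "python", "run_inference.py",
        "-m", model_path,
        "-p", build_prompt(question),
        "-n", str(n_tokens),
        "-temp", "0",
    ]


cmd = build_command(
    "models/Llama3-8B-1.58-100B-tokens/Llama3-8B-1.58-100B-tokens-TQ1_0-F16.gguf",
    "Kenya is a country in?",
)
print(shlex.join(cmd))  # shell-quoted form, for inspection only
```

To actually run it, `cmd` could be handed to `subprocess.run(cmd)` from the repository root. Passing a list this way sends real newline characters in the prompt, whereas `\n` typed inside double quotes in a POSIX shell reaches the program as a literal backslash followed by `n`. Whether the script expands that escape is not established in this thread, and a better prompt does not guarantee a correct answer from this model.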

grctest commented 3 weeks ago

This is expected. It's not an issue with the BitNet inference framework but rather the limitations of the model you're quantizing.

edindirangu commented 2 weeks ago

"This is expected, it's not an issue with the BitNet inference framework but rather the limitations of the model you're quantizing."

Thanks. Kindly help me understand your answer more fully.

grctest commented 2 weeks ago

"Thanks. Kindly help me understand your answer more fully."

Sure, your issue has nothing to do with BitNet.