microsoft / TaskWeaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.
https://microsoft.github.io/TaskWeaver/
MIT License
5.26k stars · 663 forks

Running Ollama with LLama3 and Phi3 #341

Open arhaang13 opened 5 months ago

arhaang13 commented 5 months ago

Hello,

I wanted to open this issue because when using TaskWeaver with Ollama running on my local machine, none of the models provided by Ollama are functional.

The way I configured the taskweaver_config.json is:

[Screenshot 2024-05-13 at 10:05:37 AM: taskweaver_config.json configuration]
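The screenshot itself is not recoverable here, but a typical taskweaver_config.json pointing TaskWeaver at a local Ollama server looks roughly like the sketch below. The llm.* key names follow TaskWeaver's configuration convention; the concrete values (the default Ollama port 11434, the model tag, and the placeholder API key) are assumptions for a default local install, not copied from the screenshot:

```json
{
  "llm.api_base": "http://localhost:11434/v1",
  "llm.api_key": "ollama-placeholder-key",
  "llm.api_type": "openai",
  "llm.model": "llama3",
  "llm.response_format": "text"
}
```

Ollama exposes an OpenAI-compatible endpoint under /v1, which is why an openai api_type can work here; check the TaskWeaver documentation for the exact keys and values your version expects.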

When I try running Ollama with phi3, the output that I get is:

[Screenshot 2024-05-13 at 10:08:18 AM: error output when running with phi3]

The same issue occurs when configuring TaskWeaver with Llama3 in the same way.

I hope to hear back!

Best, Arhaan

liqul commented 5 months ago

The problem is not with Ollama but with the capability of the model served by Ollama. The model failed to follow the instructions in the prompt to generate a response in the correct format. Phi3 is typically too small to generate a correct response.

raviteja-bhupatiraju commented 4 months ago

The problem is not with Ollama, but the capability of the model served with Ollama.

No, the problem is with Ollama. I never had a single model work right with Ollama and TaskWeaver (across 4 machines - Win 10/11 and Linux). LM Studio Server works fine.

liqul commented 4 months ago

Good to know.

jacky-ch93 commented 3 months ago

The problem is not with Ollama but with the capability of the model served by Ollama. The model failed to follow the instructions in the prompt to generate a response in the correct format. Phi3 is typically too small to generate a correct response.

The problem is not with Ollama, but the capability of the model served with Ollama.

No, the problem is with Ollama. I never had a single model work right with Ollama and TaskWeaver (across 4 machines - Win 10/11 and Linux). LM Studio Server works fine.

@raviteja-bhupatiraju I used the meta-llama3 model with Ollama and successfully got the expected outcome. But when I changed the model from meta-llama3 to llama3-chinese, it failed.

jacky-ch93 commented 3 months ago

The problem is not with Ollama but with the capability of the model served by Ollama. The model failed to follow the instructions in the prompt to generate a response in the correct format. Phi3 is typically too small to generate a correct response.

@liqul I agree with this, but I am not clear on which model is better at generating a response in the correct format. I tried GLM-4, meta-llama3, and llama3-chinese, but each one gave a different result. Do you have any suggestions on how to choose a better model for TaskWeaver? Or what should one pay attention to when using a model? Thanks a lot!

liqul commented 3 months ago

Thanks for sharing the information on your experiments. I personally tested a few models under 10B (I don't want to list all of them) and found that meta-llama3 generally works the best among them. However, it still sometimes fails to follow the correct format. We are currently working on a branch to better support not-that-large language models, and it is still ongoing. The basic idea is to leverage constrained generation to enforce the model generating its output exactly following a certain schema. Check out some popular projects here:

It is still not finished yet.
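Pending that branch, the idea can be approximated on the client side. The sketch below is not true token-level constrained decoding (which projects such as Outlines or llama.cpp grammars implement); it is a hypothetical validate-and-retry wrapper that re-prompts until the model's reply parses against an expected JSON shape. The field names (`response`, `type`, `content`) are illustrative assumptions, not TaskWeaver's actual schema:

```python
import json
from typing import Callable, Optional

# Hypothetical expected shape; TaskWeaver's real schema is richer.
REQUIRED_FIELDS = ("type", "content")

def parse_structured_reply(raw: str) -> Optional[dict]:
    """Parse raw model output; return the dict only if it matches the
    expected {"response": {"type": ..., "content": ...}} shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    inner = data.get("response") if isinstance(data, dict) else None
    if not isinstance(inner, dict):
        return None
    if any(field not in inner for field in REQUIRED_FIELDS):
        return None
    return data

def query_with_retry(generate: Callable[[str], str], prompt: str,
                     max_tries: int = 3) -> dict:
    """Call any text-in/text-out LLM client until its reply parses,
    restating the format requirement before each retry."""
    for _ in range(max_tries):
        reply = parse_structured_reply(generate(prompt))
        if reply is not None:
            return reply
        prompt += ('\nRespond ONLY with JSON shaped as '
                   '{"response": {"type": "...", "content": "..."}}')
    raise ValueError("model never produced a well-formed reply")
```

Retrying helps mid-sized models that follow the format most of the time; true constrained decoding masks invalid tokens during generation, which is what makes small models reliable.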

jacky-ch93 commented 3 months ago

Thank you very much for your quick reply and your suggestions! I'm sorry for not expressing myself clearly. I should add that the models I tested are all larger than 10B, including meta-llama3-70B and llama3-chinese-70B. The problems I mentioned above occurred with those models. meta-llama3-8B failed in my tests every time. Looking forward to your new release!