ollama / ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
https://ollama.com
MIT License
93.43k stars 7.38k forks

Support for GrabbeAI #7147

Open finnbusse opened 6 days ago

finnbusse commented 6 days ago

GrabbeAI is an AI model trained by 10th grade students of the German grammar school Grabbe-Gymnasium Detmold, and it helps users - especially students and teachers - with their (home)work! As far as I know, it is the first LLM made entirely by students! I trained the model together with some other students, with (financial) assistance from some teachers. You can find the model both on Hugging Face and the Ollama Model Hub. The Hugging Face URL is: https://huggingface.co/grabbe-gymnasium-detmold/grabbe-ai The Ollama hub URL is: https://ollama.com/grabbe-gymnasium/grabbe-ai

On Hugging Face, the model has about 9 million downloads, and on the Ollama hub it has more than 300, so I think it could be relevant to a large group of people! We trained the model on real class test tasks and good homework submissions, and it works quite well. We are actively updating our LLM on Hugging Face and plan to publish a new build by December. This new release will double the training dataset!

rick-github commented 6 days ago

The link to the dataset from the model card is broken - trailing %5D.

finnbusse commented 6 days ago

@rick-github It works for me...

rick-github commented 6 days ago

Screenshot 2024-10-09 13 56 47

finnbusse commented 6 days ago

@rick-github I don't know why this is there... but this URL works: https://huggingface.co/datasets/grabbe-gymnasium-detmold/grabbeai

rick-github commented 6 days ago

This link, not the one on the right-hand side.

Screenshot 2024-10-09 14 03 59

finnbusse commented 6 days ago

Oh, this is the same link and the same dataset. But I will try to fix it!

Edit: @rick-github I fixed the problem: it was just a stray square bracket, which added the %5D.
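For reference, `%5D` is just the percent-encoding of a closing square bracket (`]`): when a stray bracket ends up inside a Markdown link's URL, the browser percent-encodes it before making the request, which breaks the link. A minimal sketch (the dataset URL is the one from this thread):

```python
from urllib.parse import quote

# "%5D" is the percent-encoding of a closing square bracket.
assert quote("]") == "%5D"

# A stray "]" appended to the dataset URL (as in the broken model-card
# link) is what produced the trailing %5D seen in the screenshots:
url = "https://huggingface.co/datasets/grabbe-gymnasium-detmold/grabbeai"
broken = quote(url + "]", safe=":/")
print(broken)
# → https://huggingface.co/datasets/grabbe-gymnasium-detmold/grabbeai%5D
```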

rick-github commented 6 days ago

I see you also fine-tuned qwen2.5 and llama3.1. Did you find substantial differences between the different models?

finnbusse commented 6 days ago

Yes. Those models hallucinate a lot; I think this is because of the low number of parameters the base model was trained with. I am currently working on training the Mistral models. If those give better results, we might switch in the future. But the main difference I noticed between qwen2.5 and llama3.1 was qwen2.5's lower accuracy in languages other than English and Chinese. Llama 3.1 handled this quite well. Still, both llama 3.1 and qwen2.5 are a long way from what gemma2 or the Mistral models (v0.3 7b and Mistral Nemo) offer.