danielmiessler / fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
https://danielmiessler.com/p/fabric-origin-story
MIT License

[Question]: When using local LLM only 100 tokens are generated. #262

Closed foxbg closed 5 months ago

foxbg commented 5 months ago

What is your question?

When I run fabric with a local LLM, it requests only 100 tokens from the model: `Generating (100 / 100 tokens)`

Is there an option to set that I'm missing?

danielmiessler commented 5 months ago

I think that's an Ollama function. I'd look into how to control it from there and let us know.
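If the limit does turn out to come from Ollama, a minimal sketch of overriding it per request is below. It uses Ollama's `/api/generate` endpoint and its `num_predict` option (which caps the number of generated tokens); the host, model name, prompt, and value are assumptions for illustration, not taken from this thread.

```python
import json
import urllib.request

# Sketch: request a higher generation limit from a local Ollama server.
# "num_predict" caps how many tokens the model may generate per request.
payload = {
    "model": "llama3:latest",  # assumed model name
    "prompt": "Summarize the benefits of open-source software.",
    "stream": False,
    "options": {
        "num_predict": 1024,  # raise the cap from the low default
    },
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",  # default Ollama address
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment with a running Ollama server
```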

foxbg commented 5 months ago

In my case I'm using koboldcpp. On the LLM CLI I have an option to set the context size, but the number of tokens generated usually comes from the application, for example via "max_length". If it helps, the OpenAI API uses 'max_tokens'.
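To make the naming difference concrete, here is an illustrative sketch of the same generation-length knob under the two conventions mentioned above: an OpenAI-compatible request body uses `max_tokens`, while koboldcpp's native generate API uses `max_length`. The model name, prompt, and values are assumptions for illustration.

```python
# OpenAI-compatible chat request body: the cap is called "max_tokens".
openai_style = {
    "model": "llama3:latest",  # assumed model name
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 1024,
}

# koboldcpp native generate request body: the same cap is "max_length".
kobold_style = {
    "prompt": "Hello",
    "max_length": 1024,
}
```

A client targeting both backends has to map one parameter name onto the other, which is likely why the cap "usually comes from the application".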

foxbg commented 5 months ago

For now I have good results using llama.cpp, even though it is a bit slower.

GabrielLanghans commented 3 months ago

I'm also experiencing issues, probably due to the context size, since I'm using ollama with local models. I tried llama3, phi3:mini, and mistral, all with similar results.

I'm not able to reproduce the great output shown in the documentation or the videos. The responses/outputs I'm getting are not formatted the way the patterns are structured.

Example: running the extract_wisdom pattern on a YouTube transcript:

```
yt --transcript https://youtube.com/watch?v=uXs-zPc63kM | fabric --model llama3:latest --stream --pattern extract_wisdom
```

I'm getting the following output: This text appears to be an episode summary or transcript from "The Huberman Lab Podcast" on the topic of nicotine and its effects on psychology and physiology.

The podcast discusses the biology of nicotine, including how it interacts with dopamine and other neurotransmitters in the brain. The hosts also explore the psychological effects of nicotine addiction, including the role of hypnosis as a potential treatment method for quitting smoking or vaping.

Throughout the episode, the hosts provide scientific explanations and examples to illustrate their points. They also offer advice and tools for listeners who want to quit using nicotine products, such as developing a protocol for dealing with cravings and incorporating hypnosis into one's daily routine.

The podcast also touches on broader topics, including the importance of understanding the underlying biology and psychology of nicotine addiction in order to develop effective treatment strategies. The hosts encourage listeners to support the podcast by subscribing to their YouTube channel, leaving reviews on Apple and Spotify, and checking out their sponsor Momentous Supplements.

Overall, this episode summary suggests that "The Huberman Lab Podcast" is a science-focused podcast that aims to educate its audience about the latest research in areas such as neuroscience, psychology, and pharmacology.

What am I missing here?
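One possible cause worth ruling out (an assumption on my part, not confirmed in this thread): patterns such as extract_wisdom are long system prompts, and if the prompt plus transcript exceeds the local model's context window, the model may see only part of the pattern and fall back to a generic summary. With Ollama, the window can be raised per request via the `num_ctx` option; the host, model name, and value below are illustrative.

```python
import json
import urllib.request

# Sketch: enlarge Ollama's context window so a long pattern prompt is not
# truncated. Model name and num_ctx value are assumptions for illustration.
payload = {
    "model": "llama3:latest",
    "prompt": "<pattern system prompt + transcript would go here>",
    "stream": False,
    "options": {"num_ctx": 8192},  # default window is much smaller
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # requires a running Ollama server
```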

VVincentt commented 2 months ago

I have the same problem as @GabrielLanghans above. No matter which pattern I choose, I get a summary of the article or transcript I feed to fabric. I have updated fabric by following the README instructions. I use Ollama to run llama3:8b-instruct-q8_0.

Asentient commented 2 months ago

I third that. Same issue here.