danielmiessler / fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
https://danielmiessler.com/p/fabric-origin-story
22.31k stars 2.32k forks

[Bug]: Local LLM outputs short, unformatted remarks instead of following prompt instructions and output guidelines as GPT-4, Claude, etc. do #bug #question #743

Closed Asentient closed 1 month ago

Asentient commented 1 month ago

Local LLM Output Short, Unstructured and Unformatted Compared to GPT-4

Description:

Users have reported that when running models locally, the output is significantly shorter and does not follow the prompt's instructions or match the formatting that GPT-4 produces. The issue is consistent across different setups and different local models, including various hardware configurations and operating systems.

Reproduction Steps:

  1. Use yt --transcript command with a YouTube video link.
  2. Run fabric -m llama3:latest -sp extract_wisdom.
  3. Compare the output with the same process using GPT-4.

Observed Behavior:

Expected Behavior:

The local LLM should produce comprehensive output comparable to GPT-4's, or at least follow the pattern's output format consistently, even if the content quality is lower.

Additional Context:

Discussion Links:

josh-stephens commented 1 month ago

Have you considered the context limitations? Maybe the extract_wisdom pattern approaches or exceeds the input context window of your chosen local model, or comes close enough that there's not much left for an effective response. You might try creating a custom version of extract_wisdom, like extract_wisdom_mini, that is condensed and focused on providing specific insights... one of the frontier models could probably help you construct that effectively.

Keep in mind too that for the purposes of something like extract_wisdom, the output of local models running on a consumer GPU just isn't going to compare to 70b-parameter models served through providers like Groq, much less the big closed-source models, so you have to manage your expectations based on the model's capabilities.
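The condensed-pattern suggestion above can be sketched as follows. This assumes fabric stores patterns as `<name>/system.md` under `~/.config/fabric/patterns` (which may differ by install), and the prompt text itself is just an illustrative example, not the actual extract_wisdom content:

```shell
# Sketch: create a condensed custom pattern, e.g. extract_wisdom_mini.
# Assumes patterns live under ~/.config/fabric/patterns/<name>/system.md.
PATTERNS_DIR="$HOME/.config/fabric/patterns"
mkdir -p "$PATTERNS_DIR/extract_wisdom_mini"
cat > "$PATTERNS_DIR/extract_wisdom_mini/system.md" <<'EOF'
# IDENTITY
You extract the most surprising ideas and key takeaways from the input.
# OUTPUT
- IDEAS: 5 bullet points, at most 15 words each
- TAKEAWAYS: 5 bullet points, at most 15 words each
Output only these two sections, as plain Markdown bullets.
EOF
# then use it like any built-in pattern:
# yt --transcript "<video url>" | fabric -m llama3:latest -sp extract_wisdom_mini
```

A shorter system prompt leaves more of the context window for the transcript itself, which is the point of the suggestion.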

Asentient commented 1 month ago

I have considered that, yes, and here are my findings.

I checked that the local and online models are identical in parameters, context window, etc., and yet I kept getting vastly different results between them. Notice that I didn't use ChatGPT in this comparison, for absolute clarity: we are talking about models that are identical in size, and supposedly in settings too, unless there is something I missed.

Both the online and local versions have a context window of 32768:

llama_model_loader: - kv   3:                       qwen2.context_length u32              = 32768

[!info] This indicates that the context length for the local model "Qwen2-7B-Instruct" is 32,768 tokens.

qwen2-7b [Online]

└─➞ ytt https://www.youtube.com/watch?v=zHkcnDaCNUU | fabric -m qwen/qwen-2-7b-instruct -sp extract_wisdom

SUMMARY: In summary, OpenAI has outlined its roadmap towards AGI (Artificial General Intelligence) through five levels: Chatbots (Level One), Reasoners (Level Two), Agents (Level Three), Innovators (Level Four), Organizations (Level Five). OpenAI has been worki

IDEAS:

  1. OpenAI mapped AGI roadmap.
  2. Strawberry technology aims at autonomous deep research.
  3. Dolly image model improvements.
  4. Sora showcases advanced capabilities.
  5. HubSpot offers ChatGPT integration resources.
  6. Andre Karpathy launches Eureka Labs.
  7. Anthropic app released for Android.
  8. Gemini provides answers without unlocking phones.
  9. Google VDS creates sales training videos.
  10. YouTube Music Sound Search identifies songs through humming.
  11. Microsoft Designer integrates Co-Pilot sidebar.
  12. Codstroll Mamba offers faster code generation.
  13. Amazon Rufus shopping assistant.
  14. Meta faces EU regulatory issues.
  15. Johanis Stelzer demonstrates controlling image generation.
  16. App turns selfies into printable 3D characters.
  17. High accuracy determining sex from dental X-rays.
  18. OpenAI launched GPT-40 Mini.
  19. Nvidia partnered with Mistol creating Nemo.
  20. Google sponsors Team USA Olympics ads featuring various Google AI products.

INSIGHTS:

  1. OpenAI progresses towards AGI through multiple stages.
  2. Strawberry technology aims at autonomous deep research capabilities.
  3. Dolly image model improvements enhance text clarity.
  4. Sora showcases advanced capabilities through impressive demos.
  5. HubSpot offers comprehensive resources for integrating ChatGPT into work processes.
  6. Andre Karpathy launches Eureka Labs focusing on innovative education methods.
  7. Anthropic app release highlights advancements in conversational interfaces.
  8. Gemini provides convenient access without unlocking phones.
  9. Google VDS simplifies sales training video creation.
  10. YouTube Music Sound Search introduces innovative music identification through humming.
  11. Microsoft Designer integrates Co-Pilot sidebar enhancing image creation capabilities.
  12. Codstroll Mamba offers faster code generation capabilities.
  13. Amazon Rufus shopping assistant streamlines customer queries within its app.
  14. Meta faces regulatory challenges regarding multimodal models.
  15. Johanis Stelzer demonstrates innovative control over image generation processes.
  16. App turns selfies into printable 3D characters showcasing advanced technology integration.
  17. High accuracy determining sex from dental X-rays highlights potential forensic applications.
  18. OpenAI’s GPT-40 Mini improves efficiency compared against smaller models.
  19. Nvidia’s partnership with Mistol creates Nemo designed for local deployment scenarios.
  20. Google’s sponsorship emphasizes its commitment towards promoting various Google AI products during Olympics coverage.

QUOTES: "OpenAI believes we'll move through each level towards true AGI." "Strawberry aims at autonomous deep research capabilities." "Dolly image model improvements enhance text clarity." "Sora showcases impressive capabilities through advanced demos." "HubSpot offers comprehensive resources integrating ChatGPT." "Google VDS simplifies sales training video creation." "Microsoft Designer integrates Co-Pilot sidebar enhancing image creation." "Codstroll Mamba offers faster code generation capabilities." "Amazon Rufus shopping assistant streamlines customer queries within its app."

HABITS:

  1. Regularly updating skills through continuous learning resources.
  2. Utilizing advanced tools such as Sora, Gemini, VDS, etc., efficiently.
  3. Keeping track of industry trends through daily updates.
  4. Integrating artificial intelligence into daily work processes effectively.
  5. Staying informed about regulatory changes affecting technology usage.

FACTS:

  1. OpenAI outlines five stages towards AGI development.
  2. Strawberry technology aims at autonomous deep research capabilities.
  3. Dolly image model improvements enhance text clarity significantly.
  4. Sora showcases advanced capabilities through impressive demos regularly.
  5. HubSpot offers comprehensive resources integrating ChatGPT into work processes effectively.

REFERENCES:

ONE-SENTENCE TAKEAWAY: The rapid advancement in artificial intelligence technologies continues across various sectors including education, shopping assistance, music identification, dental forensics, image generation, sales training videos, autonomous research capabilities,

RECOMMENDATIONS:

  1. Stay updated with OpenAI’s progress towards AGI development stages.
  2. Explore advancements made by Strawberry technology towards autonomous deep research capabilities.
  3. Utilize Dolly image model improvements effectively by enhancing text clarity significantly.
  4. Engage with Sora’s advanced capabilities through impressive demos regularly.
  5. Integrate HubSpot’s comprehensive resources effectively when integrating ChatGPT into work processes.
  6. Consider Andre Karpathy’s Eureka Labs focusing on innovative education methods utilizing artificial intelligence.
  7. Keep informed about regulatory changes affecting technology usage within organizations.
  8. Explore opportunities provided by Anthropic’s conversational interface advancements through its Android app release.
  9. Utilize Gemini’s answer service without unlocking phones efficiently.
  10. Simplify sales training video creation using Google VDS tools effectively.
  11. Leverage Microsoft Designer’s Co-Pilot sidebar integration for enhanced image creation capabilities efficiently.
  12. Explore Nvidia’s partnership with Mistol creating Nemo designed specifically for local deployment scenarios efficiently.
  13. Stay aware about advancements made by Codstroll Mamba offering faster code generation capabilities efficiently.
  14. Streamline customer queries within Amazon Rufus shopping assistant efficiently during Olympic coverage sponsored by Google’s various AI products effectively.
  15. Address regulatory challenges faced by tech giants such as Meta regarding multimodal models efficiently within organizational frameworks effectively.

HABITS: 1. Regularly updating skills through continuous learning resources such as online courses or workshops related specifically aimed at advancements within artificial intelligence technologies.

2. Utilizing advanced tools such as Sora, Gemini, VDS (Video Design Studio), etc., efficiently by understanding their unique features before implementation within specific projects or workflows.

3. Keeping track daily updates regarding industry trends through following relevant blogs, podcasts, newsletters dedicated specifically towards advancements within artificial intelligence technologies.

4. Integrating artificial intelligence into daily work processes effectively by identifying areas within specific job roles where automation could potentially enhance productivity without compromising quality standards set forth by respective industri

5. Staying informed about regulatory changes affecting technology usage within organizations by regularly attending seminars hosted by legal experts specializing specifically within areas related directly towards artificial intelligence technologies e

FACTS: 1. OpenAI outlines five stages towards AGI development highlighting progression towards achieving true artificial general intelligence capabilities across multiple sectors including education, healthcare, finance among others demonstrating significant [Breaks off into gibberish here.]


qwen2-7b [Local]

$ ytt https://www.youtube.com/watch?v=zHkcnDaCNUU | fabric -m qwen2:7b-instruct -sp extract_wisdom

In this video, the speaker discusses various updates in the field of AI. They compare performance scores between different models and introduce a new model called GPT 40 Mini, which appears to perform well across several benchmarks compared to other smaller models from competing platforms. The speaker also mentions Nvidia's Mistol Nemo model, designed for local deployment on devices with limited internet connectivity or stringent data privacy requirements.

The video highlights Google's AI sponsorship for Team USA in the Olympics and an AI news channel called Futur Tools where viewers can stay updated on the latest AI news, tools, and research. The speaker also mentions HubSpot as a sponsor of this particular video.

In summary, the key points are:

  1. GPT 40 Mini performs well across multiple benchmarks compared to other smaller models.
  2. Nvidia's Mistol Nemo model is designed for local deployment on devices with limited internet access or stringent data privacy requirements.
  3. Google has sponsored Team USA in the Olympics with AI-related products and services.
  4. Futur Tools curates the latest AI news, tools, and research and offers a free newsletter to subscribers.

These updates showcase advancements and developments in AI technology across various domains, including cloud computing, local deployment options for models, sponsorships at major events like the Olympics, and curated content platforms for staying informed on AI-related topics.


What else could it be other than the context window? If anyone has any other parameters for me to check, that would be helpful.

tejas-hosamani commented 1 month ago

+1, same for me

tejas-hosamani commented 1 month ago

I tested this with less/more input data. With less input it does adhere to the format, but with more data it starts to ignore the format.

It probably has something to do with the token size 🙃 I am wondering if we can feed large data in chunks for it to process and then stitch the responses together.
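The chunking idea above could be sketched like this. Note this is a rough sketch: the chunk size is in characters rather than tokens (roughly 4 characters per token is a common heuristic), and the fabric invocation at the bottom is illustrative and left commented out:

```python
# Sketch of chunked processing: split a long transcript into overlapping
# chunks that fit the model's context, run the pattern on each, then
# stitch the per-chunk responses together afterwards.

def chunk_text(text: str, max_chars: int = 16000, overlap: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars, overlapping by
    `overlap` characters so content cut at a boundary appears in both."""
    if len(text) <= max_chars:
        return [text]
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
        start += max_chars - overlap
    return chunks

# illustrative per-chunk invocation (not run here):
# import subprocess
# for chunk in chunk_text(transcript):
#     subprocess.run(["fabric", "-m", "llama3:latest", "-sp", "extract_wisdom"],
#                    input=chunk, text=True)
```

The remaining work would be a second pass that merges the per-chunk outputs, e.g. deduplicating IDEAS/INSIGHTS bullets across chunks.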

tejas-hosamani commented 1 month ago

I tried with llama3:8b-instruct-q8_0, which has an 8k context window, yet got the same results.

SystemR commented 1 month ago

I can confirm it's the context window.

I tried calling ollama directly with the same system prompt, raising num_ctx to 18k, and using mistral nemo; it gives me much better results for long vids.

Is it possible to pass in ollama flags via the cli?
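Calling ollama directly, as described above, can be sketched against its HTTP API, which accepts per-request options including num_ctx. This assumes `ollama serve` is running on the default port; the model name, the placeholder prompt strings, and the num_ctx value are all illustrative:

```python
# Sketch: bypass fabric and hit Ollama's /api/generate endpoint directly,
# passing a larger num_ctx in the per-request options.
import json
import urllib.request

payload = {
    "model": "mistral-nemo",               # example model name
    "system": "<extract_wisdom system prompt>",
    "prompt": "<transcript text>",
    "stream": False,
    "options": {"num_ctx": 18432},         # request a larger context window
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# actual call left commented out (requires a running ollama server):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```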

vhongtuanha commented 1 month ago

In another post someone gave a solution that worked for me:

https://medium.com/@celobusana/solving-fabric-and-local-ollama-context-issues-a-step-by-step-guide-1d67e443e27e

tejas-hosamani commented 1 month ago

I tried with

FROM llama3:8b-instruct-q8_0
PARAMETER num_ctx 8192

Didn't work. I tried with a 6h-long video and a 55m-long video.

vhongtuanha commented 1 month ago

> I tried with
>
> FROM llama3:8b-instruct-q8_0
> PARAMETER num_ctx 8192
>
> Didn't work. I tried with 6h long video and 55m long video.

For me, a 24min video needed a minimum num_ctx of 8192. So I guess for a 6h-long video you need a lot more context.
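For reference, the Modelfile workflow from the linked guide looks roughly like this. It is a sketch: the derived model name and the num_ctx value are examples to adapt to the transcript length, and the `ollama create` / fabric steps (commented out) require a local Ollama install:

```shell
# Sketch: derive a model with a larger context window via a Modelfile.
cat > Modelfile <<'EOF'
FROM llama3:8b-instruct-q8_0
PARAMETER num_ctx 16384
EOF
# register the derived model with ollama, then point fabric at it:
# ollama create llama3-16k -f Modelfile
# yt --transcript "<video url>" | fabric -m llama3-16k -sp extract_wisdom
```

Since num_ctx set this way is baked into the derived model, one model per context size is needed; a 6h transcript would likely need a far larger value than a 24min one, subject to available RAM/VRAM.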