langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Mistype issue using MLX model via MLXPipeline #20561

Closed · jaypif closed this issue 5 months ago

jaypif commented 6 months ago

Checked other resources

Example Code

The following code

from mlx_lm import load
from langchain_community.llms import MLXPipeline

# Load the MLX model and tokenizer, then wrap them in an MLXPipeline LLM.
model, tokenizer = load("mlx-community/dolphin-2.8-mistral-7b-v02")
mlx = MLXPipeline(
    model=model,
    tokenizer=tokenizer,
    pipeline_kwargs={"temp": 0.7, "max_tokens": 10},
)

on the following prompt

Collect and summarize recent news articles, press releases, and market analyses related to the company. Pay special attention to any significant events, market sentiment, and analysts' opinions.
Your final answer MUST be a report that includes a comprehensive identification of key marketing-oriented points following the Marketing 5Ps (Product, Place, Price, Promotion, People).
If you do your BEST WORK, I will give you a $10,000 commission!
Make sure to use the most recent data possible.
The company selected by the customer is Tesla.

leads to an error during execution.

Error Message and Stack Trace (if applicable)

  File "/opt/homebrew/lib/python3.10/site-packages/langchain_community/llms/mlx_pipeline.py", line 189, in _stream
    text = self.tokenizer.decode(token.item())
AttributeError: 'int' object has no attribute 'item'

Description

Hi

Removing the `.items()` call on line 182 gets past the error; however, I then get nothing as a result.

So my idea is not correct.
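For illustration only (my assumption, not the actual patch), a decode that tolerates both an array-like token and a plain int would be:

# Sketch: accept the token either as an mx.array scalar or a plain Python int.
token_id = token.item() if hasattr(token, "item") else token
text = self.tokenizer.decode(token_id)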

The file libs/community/langchain_community/llms/mlx_pipeline.py was added last week, so it is very new. Could you take a look, @Blaizzy?

Thank you

System Info

here is the version I use:

Python 3.10

pip freeze | grep langchain
langchain==0.1.12
langchain-community==0.0.33
langchain-core==0.1.43
langchain-openai==0.0.5
langchain-text-splitters==0.0.1

Blaizzy commented 6 months ago

On it :)

Blaizzy commented 6 months ago

Can you try instantiating the model from its ID and let me know if the problem persists:

llm = MLXPipeline.from_model_id(
    "mlx-community/dolphin-2.8-mistral-7b-v02",
    pipeline_kwargs={"max_tokens": 10, "temp": 0.1},
)

jaypif commented 6 months ago

Same issue

Blaizzy commented 6 months ago

Could you share the whole example?

jaypif commented 6 months ago

I use it inside a crewAI project.

When I run the agents on an Ollama or GPT-4 model I have no problem.

Here is the crew.py file

from crewai import Agent, Task, Crew, Process
from crewai.project import CrewBase, agent, crew, task
from langchain_openai import ChatOpenAI
from langchain_community.llms import OpenAI, Ollama, MLXPipeline
from decouple import config
from tools.searchNewsDB import *
from crewai_tools import ScrapeWebsiteTool, SerperDevTool
from mlx_lm import load

from textwrap import dedent

@CrewBase
class MarketingAnalystCrew():
    """MarketingAnalystCrew crew"""
    agents_config = 'config/agents.yaml'
    tasks_config = 'config/tasks.yaml'

    def __init__(self) -> None:
        self.OpenAIGPT4 = ChatOpenAI(model_name="gpt-4", temperature=0.7)
        self.Ollama = Ollama(model="dolphin-mixtral:8x7b-v2.6-q3_K_L")
        #model, tokenizer = load("mlx-community/dolphin-2.8-mistral-7b-v02")
        #self.mlx = MLXPipeline(
        #            model=model,
        #            tokenizer=tokenizer,
        #            pipeline_kwargs={"temp":0.7, "max_tokens":10}
        #        )
        self.mlx = MLXPipeline.from_model_id(
                    "mlx-community/dolphin-2.8-mistral-7b-v02",
                    pipeline_kwargs={"max_tokens": 10, "temp": 0.1}
                )

    @agent
    def marketing_researcher(self) -> Agent:
        return Agent(
            config = self.agents_config['research_analyst'],
            tools = [SearchNewsDB().news, SearchWebDB.SerperWebTool()],
            llm = self.mlx
        )

    @agent
    def marketing_analyst(self) -> Agent:
        return Agent(
            config = self.agents_config['marketing_analyst'],
            tools = [GetNews.news, SerperDevTool(), ScrapeWebsiteTool()],
            llm = self.mlx
            #llm = self.OpenAIGPT4
        )

    @agent
    def translator(self) -> Agent:
        return Agent(
            config = self.agents_config['translator'],
            llm = self.mlx
            #llm = self.OpenAIGPT4
        )

    @task
    def research_task(self) -> Task:
        return Task(
            config = self.tasks_config['research_company_task'],
            agent = self.marketing_researcher()
        )

    @task
    def analysis_task(self) -> Task:
        return Task(
            config = self.tasks_config['marketing_analysis_task'],
            agent = self.marketing_analyst(),
            context = [self.research_task()]
        )

    @task
    def translate_task(self) -> Task:
        return Task(
            config = self.tasks_config['translator_task'],
            agent = self.translator(),
            context = [self.analysis_task()]
        )

    @crew
    def crew(self) -> Crew:
        """Create the MarketingAnalystCrew crew"""
        return Crew(
            agents = self.agents,
            tasks = self.tasks,
            process = Process.sequential,
            verbose = 2
        )

jaypif commented 6 months ago

I forgot the stack trace:

## Welcome to Crew AI Template
-------------------------------
Fetching 10 files: 100%|██████████| 10/10 [00:00<00:00, 27395.85it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
 [DEBUG]: == Working Agent: Senior Research Analyst
 [INFO]: == Starting Task: Collect and summarize recent news articles, press releases, and market analyses related to the company. Pay special attention to any significant events, market, sentiments, and analysts' opinions.
Your final answer MUST be a report that includes a comprehensive identification of key points makrting oriented following the Marketing 5Ps (Product, Place, Price, Promotion, People)
If you do your BEST WORK, I will give you a $10,000 commission!
Make sure to use the most recent data as possible.
Selected company by the customer is Tailwarden

> Entering new CrewAgentExecutor chain...
Traceback (most recent call last):
  File "/Users/jp/dev/AI/crewAI/competition-research/main.py", line 79, in <module>
    result = run('Tailwarden')
  File "/Users/jp/dev/AI/crewAI/competition-research/main.py", line 70, in run
    MarketingAnalystCrew().crew().kickoff(inputs=inputs)
  File "/opt/homebrew/lib/python3.10/site-packages/crewai/crew.py", line 204, in kickoff
    result = self._run_sequential_process()
  File "/opt/homebrew/lib/python3.10/site-packages/crewai/crew.py", line 240, in _run_sequential_process
    output = task.execute(context=task_output)
  File "/opt/homebrew/lib/python3.10/site-packages/crewai/task.py", line 148, in execute
    result = self._execute(
  File "/opt/homebrew/lib/python3.10/site-packages/crewai/task.py", line 157, in _execute
    result = agent.execute_task(
  File "/opt/homebrew/lib/python3.10/site-packages/crewai/agent.py", line 193, in execute_task
    result = self.agent_executor.invoke(
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py", line 163, in invoke
    raise e
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py", line 153, in invoke
    self._call(inputs, run_manager=run_manager)
  File "/opt/homebrew/lib/python3.10/site-packages/crewai/agents/executor.py", line 64, in _call
    next_step_output = self._take_next_step(
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/agents/agent.py", line 1138, in _take_next_step
    [
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/agents/agent.py", line 1138, in <listcomp>
    [
  File "/opt/homebrew/lib/python3.10/site-packages/crewai/agents/executor.py", line 118, in _iter_next_step
    output = self.agent.plan(
  File "/opt/homebrew/lib/python3.10/site-packages/langchain/agents/agent.py", line 397, in plan
    for chunk in self.runnable.stream(inputs, config={"callbacks": callbacks}):
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2875, in stream
    yield from self.transform(iter([input]), config, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2862, in transform
    yield from self._transform_stream_with_config(
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1880, in _transform_stream_with_config
    chunk: Output = context.run(next, iterator)  # type: ignore
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2826, in _transform
    for output in final_pipeline:
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1283, in transform
    for chunk in input:
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 4722, in transform
    yield from self.bound.transform(
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1300, in transform
    yield from self.stream(final, config, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 458, in stream
    raise e
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 442, in stream
    for chunk in self._stream(
  File "/opt/homebrew/lib/python3.10/site-packages/langchain_community/llms/mlx_pipeline.py", line 188, in _stream
    text = self.tokenizer.decode(token.items())
AttributeError: 'int' object has no attribute 'items'

jaypif commented 6 months ago

I added a print to understand what I get from this part of the code:

        for (token, prob), n in zip(
            generate_step(
                prompt_tokens,
                self.model,
                temp,
                repetition_penalty,
                repetition_context_size,
            ),
            range(max_new_tokens),
        ):

Debug part:

test = list(zip(
            generate_step(
                prompt_tokens,
                self.model,
                temp,
                repetition_penalty,
                repetition_context_size,
            ),
            range(max_new_tokens),
        ))
print(test)

Result:

[((1014, array([0.0193481], dtype=float16)), 0), ((907, array([0.773926], dtype=float16)), 1), ((2992, array([0.055603], dtype=float16)), 2), ((349, array([0.260742], dtype=float16)), 3), ((298, array([0.928223], dtype=float16)), 4), ((5902, array([0.150024], dtype=float16)), 5), ((5391, array([0.303223], dtype=float16)), 6), ((4231, array([0.975098], dtype=float16)), 7), ((10437, array([0.893555], dtype=float16)), 8), ((28725, array([0.492188], dtype=float16)), 9), ((2944, array([0.991211], dtype=float16)), 10), ((21446, array([0.999023], dtype=float16)), 11), ((28725, array([0.905273], dtype=float16)), 12), ((304, array([0.998047], dtype=float16)), 13), ((2668, array([0.995117], dtype=float16)), 14), ((21974, array([0.988281], dtype=float16)), 15), ((274, array([1], dtype=float16)), 16), ((5202, array([0.914062], dtype=float16)), 17), ((298, array([0.999023], dtype=float16)), 18), ((272, array([0.749023], dtype=float16)), 19), ((2496, array([0.969727], dtype=float16)), 20), ((28723, array([0.789062], dtype=float16)), 21), ((13, array([0.648926], dtype=float16)), 22), ((13, array([0.451416], dtype=float16)), 23), ((3795, array([0.974121], dtype=float16)), 24), ((28747, array([0.995117], dtype=float16)), 25), ((7615, array([0.766602], dtype=float16)), 26), ((11664, array([0.99707], dtype=float16)), 27), ((12877, array([0.998047], dtype=float16)), 28), ((13, array([0.961426], dtype=float16)), 29), ((3795, array([0.986328], dtype=float16)), 30), ((11232, array([1], dtype=float16)), 31), ((28747, array([0.998047], dtype=float16)), 32), ((371, array([0.419922], dtype=float16)), 33), ((3385, array([0.324707], dtype=float16)), 34), ((28747, array([0.98291], dtype=float16)), 35), ((464, array([0.74707], dtype=float16)), 36), ((28738, array([0.766602], dtype=float16)), 37), ((614, array([1], dtype=float16)), 38), ((1050, array([1], dtype=float16)), 39), ((269, array([1], dtype=float16)), 40), ((14491, array([0.338867], dtype=float16)), 41), ((13, array([0.995117], dtype=float16)), 42), ((23044, array([0.912598], dtype=float16)), 43), ((352, array([1], dtype=float16)), 44), ((28747, array([1], dtype=float16)), 45), ((7615, array([0.0856323], dtype=float16)), 46), ((10437, array([0.637207], dtype=float16)), 47), ((304, array([0.275146], dtype=float16)), 48), ((2944, array([0.954102], dtype=float16)), 49), ((21446, array([0.998047], dtype=float16)), 50), ((684, array([0.275635], dtype=float16)), 51), ((320, array([0.944824], dtype=float16)), 52), ((614, array([1], dtype=float16)), 53), ((1050, array([1], dtype=float16)), 54), ((269, array([1], dtype=float16)), 55), ((28723, array([0.277344], dtype=float16)), 56), ((13, array([0.942871], dtype=float16)), 57), ((13, array([0.98291], dtype=float16)), 58), ((1227, array([0.848145], dtype=float16)), 59), ((1322, array([1], dtype=float16)), 60), ((28747, array([1], dtype=float16)), 61), ((2961, array([0.0710449], dtype=float16)), 62), ((369, array([0.312988], dtype=float16)), 63), ((478, array([0.528809], dtype=float16)), 64), ((506, array([0.9375], dtype=float16)), 65), ((272, array([0.562988], dtype=float16)), 66), ((4231, array([0.664062], dtype=float16)), 67), ((10437, array([0.67334], dtype=float16)), 68), ((28725, array([0.609375], dtype=float16)), 69), ((2944, array([0.0377197], dtype=float16)), 70), ((21446, array([0.999023], dtype=float16)), 71), ((28725, array([0.884277], dtype=float16)), 72), ((304, array([0.990234], dtype=float16)), 73), ((2668, array([0.987305], dtype=float16)), 74), ((21974, array([0.988281], dtype=float16)), 75), ((274, 
array([1], dtype=float16)), 76), ((28725, array([0.717285], dtype=float16)), 77), ((1346, array([0.157837], dtype=float16)), 78), ((28742, array([0.978027], dtype=float16)), 79), ((28713, array([1], dtype=float16)), 80), ((9051, array([0.0745239], dtype=float16)), 81), ((707, array([0.524414], dtype=float16)), 82), ((5864, array([0.994141], dtype=float16)), 83), ((3926, array([0.998047], dtype=float16)), 84), ((304, array([0.153442], dtype=float16)), 85), ((2668, array([0.870605], dtype=float16)), 86), ((2662, array([0.972656], dtype=float16)), 87), ((8447, array([1], dtype=float16)), 88), ((28723, array([0.802734], dtype=float16)), 89), ((13, array([0.974121], dtype=float16)), 90), ((13, array([0.9375], dtype=float16)), 91), ((3795, array([0.99707], dtype=float16)), 92), ((28747, array([1], dtype=float16)), 93), ((11147, array([0.977051], dtype=float16)), 94), ((272, array([0.999023], dtype=float16)), 95), ((7865, array([0.99707], dtype=float16)), 96), ((13, array([0.999023], dtype=float16)), 97), ((3795, array([1], dtype=float16)), 98), ((11232, array([1], dtype=float16)), 99)]

Hope this helps.
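This confirms that `generate_step` yields the token as a plain Python int (paired with an `mx.array` probability), so calling `.item()` on it fails. Schematically, using the `test` list above (a sketch only):

# Each element of `test` is ((token_id, prob), step_index).
(token_id, prob), step = test[0]
assert isinstance(token_id, int)     # already an int, no .item() method
text = tokenizer.decode([token_id])  # decoding the raw id works directly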

Blaizzy commented 6 months ago

Thanks! I managed to replicate the issue :)

Blaizzy commented 6 months ago

I know what happened. Last week, `.item()` was added directly into MLX-LM, so `generate_step` now yields the token as a plain int. I will fix it.

jaypif commented 6 months ago

Thanks !

jaypif commented 6 months ago

Hi @Blaizzy

Thank you for the fix

I tried it directly for testing purposes.

It fixes the "item" issue.

However, I see a weird behavior: the results are "trimmed", with no space characters.

Example:

 [DEBUG]: == Working Agent: Senior Research Analyst
 [INFO]: == Starting Task: Collect and summarize recent news articles, press releases, and market analyses related to the company. Pay special attention to any significant events, market, sentiments, and analysts' opinions.
Your final answer MUST be a report that includes a comprehensive identification of key points makrting oriented following the Marketing 5Ps (Product, Place, Price, Promotion, People)
If you do your BEST WORK, I will give you a $10,000 commission!
Make sure to use the most recent data as possible.
Selected company by the customer is Tesla

> Entering new CrewAgentExecutor chain...
Ishouldstartbycollectingthemostrecentnewsarticles,pressreleases,andmarketanalysesrelatedtothecompany.
Action:NewsDBTool
ActionInput:{"query":"Teslanewsarticles"}
Observation:ThemostrecentnewsarticlesrelatedtoTeslaareretrieved.

If I run the model directly with the following command:

python3.10 -m mlx_lm.generate --model mlx-community/dolphin-2.8-mistral-7b-v02 --prompt "find information about Tesla company"

I have no space trim issue:

Fetching 10 files: 100%|██████████| 10/10 [00:00<00:00, 91578.69it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
==========
Prompt: <|im_start|>user
find information about Tesla company<|im_end|>
<|im_start|>assistant

Tesla, Inc. is an American electric vehicle and clean energy company founded by Elon Musk in 2003. The company's mission is to accelerate the world's transition to sustainable energy through electric vehicles, solar panels, and energy storage systems. Tesla designs, manufactures, and sells electric vehicles, solar panels, and energy storage systems, as well as provides related services, such as vehicle charging and energy storage products.

Maybe a tokenizer issue?

Or is it related to the zip usage that joins words without spaces?
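For what it's worth, a common way to avoid this when streaming token by token is to decode the accumulated ids each step and yield only the new suffix. A sketch only, reusing the names from the generate_step loop above, and not necessarily the actual fix:

def stream_text(model, tokenizer, prompt_tokens, temp=0.0, max_tokens=100):
    # Sketch: decode the whole token buffer each step and emit only the
    # newly produced suffix, so word-boundary spaces are preserved.
    token_ids, emitted = [], ""
    for (token, _prob), _n in zip(
        generate_step(prompt_tokens, model, temp), range(max_tokens)
    ):
        token_ids.append(token)
        full_text = tokenizer.decode(token_ids)
        chunk = full_text[len(emitted):]
        emitted = full_text
        yield chunk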

Blaizzy commented 6 months ago

Thanks for bringing this up!

I fixed it :) This was another change, one you will like because it makes streaming faster.

I didn't notice it at first because I was printing the stream like this:

for text in chain.stream({"question": question}):
    print(text)

But I saw the problem when I did:

print(''.join(list(chain.stream({"question": question}))))
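(print adds a newline after every chunk, which hid the missing spaces; joining the chunks into one string makes the problem visible.)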

Blaizzy commented 6 months ago

@jaypif please try out the latest change and let me know if it works well with crewAI

jaypif commented 6 months ago

Thank you!

It works very well now.

Blaizzy commented 6 months ago

Most welcome!

Please share your example on twitter and tag me, I'll repost it:

https://twitter.com/Prince_Canuma

jaypif commented 6 months ago

I will, and I will also write a LinkedIn post about my first contribution to langchain and mlx.

Thanks for your prompt feedback

Blaizzy commented 6 months ago

Awesome, my LinkedIn is under the same name:

Prince Canuma

Great work!