zouy50 opened this issue 1 year ago
OK, I added these params to the LlamaCpp call like this:
```python
kwargs = {
    "model_path": model_path,
    "n_batch": 512,
    "n_ctx": max_ctx_size,
    # "max_tokens": max_ctx_size,
    # "echo": True,
    "verbose": True,
    "callback_manager": CallbackManager([StreamingStdOutCallbackHandler()]),
    "f16_kv": True,
}
```
The key param is this one: `"callback_manager": CallbackManager([StreamingStdOutCallbackHandler()])`.
The LangChain guide on local LLMs is here: https://python.langchain.com/docs/guides/local_llms
I found the problem: ingest.py has no param for using mps, so I will fix this.
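I haven't confirmed the exact code in ingest.py, but as a sketch, passing the device through to the embedding model would look something like this (assuming it uses HuggingFaceInstructEmbeddings; the model name is a placeholder):

```python
from langchain.embeddings import HuggingFaceInstructEmbeddings

device_type = "mps"  # "cpu", "cuda", or "mps" on Apple Silicon

embeddings = HuggingFaceInstructEmbeddings(
    model_name="hkunlp/instructor-large",  # placeholder embedding model
    model_kwargs={"device": device_type},
)
```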
The problem is: I can only see the output when it is computed by the CPU. If I switch to device_type mps it runs fine, but I cannot see the output, and when I run

```python
pprint(memory.load_memory_variables({}))
```

I get a lot of garbled characters (a minimal sketch of this memory inspection is after my environment details below).

My environment:
- CPU and GPU: Apple M1, 16 GB
- Python version: 3.11.4
- llama-cpp-python == 0.1.78
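For reference, a minimal sketch of that memory inspection, assuming a standard ConversationBufferMemory is attached to the chain:

```python
from pprint import pprint
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="history")
memory.save_context({"input": "hi"}, {"output": "hello"})
# Expected output: {'history': 'Human: hi\nAI: hello'}
pprint(memory.load_memory_variables({}))
```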
I have also tried running llama alone, without LangChain: mps runs fine and outputs text, and mps is faster than the CPU (a sketch of that standalone test is below). So I don't know whether the problem is in LangChain or somewhere else, and I don't know how to fix it. I really don't want to use the CPU; it's slow and makes my Mac very hot.
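The standalone test looked roughly like this (a sketch, assuming llama-cpp-python was installed with Metal support, e.g. `CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python`; the model path is a placeholder):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="path/to/your/model.bin",  # placeholder
    n_ctx=2048,
    n_gpu_layers=1,  # offload to the Apple GPU via Metal
)

# Stream tokens so the output shows up as it is generated.
for chunk in llm("Q: Name three colors. A:", max_tokens=64, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
print()
```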