heshengtao / comfyui_LLM_party

Dify in ComfyUI includes Omost, GPT-SoVITS, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes; access to Feishu and Discord; and adapters for all LLMs with OpenAI/Gemini-like interfaces, such as o1, ollama, grok, qwen, GLM, deepseek, moonshot, and doubao. Adapted to local LLMs, VLMs, and GGUF models such as llama-3.2; linkage with a neo4j KG; graphRAG / RAG / html-to-img.
GNU Affero General Public License v3.0

something went wrong: Response does not contain codes! #27

Open TorAllex opened 4 months ago

TorAllex commented 4 months ago
got prompt
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:44<00:00, 22.17s/it]
Some weights of the model checkpoint at /Users/alex/ComfyUI/models/omost were not used when initializing LlamaForCausalLM: ['model.layers.13.mlp.up_proj.weight.quant_map', 'model.layers.26.self_attn.q_proj.weight.nested_absmax', 'model.layers.10.self_attn.o_proj.weight.nested_quant_map',
- This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
'PreTrainedTokenizerFast' object has no attribute 'apply_chat_template'
!!! Exception during processing!!! Response does not contain codes!
Traceback (most recent call last):
  File "/Users/alex/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/alex/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/alex/ComfyUI/execution.py", line 65, in map_node_over_list
    results.append(getattr(obj, func)(**input_data_all))
  File "/Users/alex/ComfyUI/custom_nodes/comfyui_LLM_party/tools/omost.py", line 46, in notify
    canvas = omost_canvas.from_bot_response(text[0])
  File "/Users/alex/ComfyUI/custom_nodes/comfyui_LLM_party/lib_omost/canvas.py", line 134, in from_bot_response
    assert matched, 'Response does not contain codes!'
AssertionError: Response does not contain codes!
heshengtao commented 4 months ago

When you use Omost, this error is reported if the output program is truncated. You can increase the max length to prevent the LLM output from being cut off. If you don't use an Omost-specific model but substitute another model, it is very likely that the other model cannot output code in the expected format, and this error will also be reported. If your problem is not solved, please attach a full screenshot of your workflow.
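
For context, the assertion comes from lib_omost/canvas.py, which looks for a code block in the model's reply. A minimal sketch of that kind of check, assuming the parser searches for a ```python fence (the real from_bot_response may differ in detail):

    import re

    def extract_code(response: str) -> str:
        # A truncated reply loses its closing fence, so the search fails
        # and the assertion below fires with the error seen above.
        matched = re.search(r"```python\n(.*?)\n```", response, re.DOTALL)
        assert matched, "Response does not contain codes!"
        return matched.group(1)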

TorAllex commented 4 months ago

Yes, I've tried using Omost. I just loaded your "start_with_OMOST.json" workflow and clicked "Queue Prompt", then got the error. Of course, this could be a hardware problem, since I use a Mac (Intel) with an AMD graphics card. Can you please explain where the "increase the maximum length" option is? I'm a beginner and I'm not sure where to find it.

heshengtao commented 4 months ago

Screenshot 2024-06-27 171938: the LLM_local node has a max length parameter (2048 in the figure). Please check the show text node connected to the LLM_local node; if the output text is not the complete code, the output was truncated. The complete code will start with ```python and end with ```.
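
For illustration, the max length is the model's generation budget; in plain transformers terms it corresponds roughly to the sketch below (the model path is a placeholder and the parameter names are generic, not necessarily those of the LLM_local node):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("path/to/omost-model")  # placeholder path
    model = AutoModelForCausalLM.from_pretrained("path/to/omost-model")
    inputs = tokenizer("describe a sunset over the sea", return_tensors="pt")
    # A larger max_new_tokens leaves room for the full fenced code block,
    # so the reply is not cut off in the middle of the code.
    outputs = model.generate(**inputs, max_new_tokens=4096)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))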

TorAllex commented 4 months ago

I tried doubling the maximum length repeatedly, up to 32768, but it didn't fix the error. The show text node message is: 'PreTrainedTokenizerFast' object has no attribute 'apply_chat_template'.

heshengtao commented 4 months ago

path_in_launcher_configuration\python_embeded\python.exe -m pip install --upgrade transformers

Here, path_in_launcher_configuration\python_embeded\python.exe is the path of the Python interpreter that ComfyUI uses.
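
For reference, apply_chat_template was added to tokenizers in newer transformers releases (around v4.34), which is why upgrading fixes the attribute error. A hedged example of a typical call (the model path and message are illustrative):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("path/to/omost-model")  # placeholder path
    messages = [{"role": "user", "content": "generate an image of a cat"}]
    # On older transformers versions this line raises:
    # 'PreTrainedTokenizerFast' object has no attribute 'apply_chat_template'
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    print(prompt)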

TorAllex commented 4 months ago

I'll try upgrading transformers; I need to figure out how to do that on macOS.
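
One way to be sure the upgrade lands in the environment ComfyUI actually runs in is to print that environment's interpreter path from inside ComfyUI's Python and pass it to pip (a generic tip, not specific to this project):

    import sys

    # Prints the interpreter path; then upgrade with:
    #   <printed path> -m pip install --upgrade transformers
    print(sys.executable)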

TorAllex commented 4 months ago

OK, now the error looks different: "GPU not found". Does the MPS device not work with it?

heshengtao commented 4 months ago

If you choose auto, it will automatically look for CUDA; if there is no CUDA, it uses MPS, and if that is not found, CPU. The "GPU not found" error message does not appear in my code; it is probably the transformers code or the model itself that requires CUDA. Some int4 quantized models do have poor compatibility with MPS devices. I'm a little helpless about this problem of yours; AMD graphics cards are really not well suited to AI work.
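
A minimal sketch of the auto device selection described above, using standard torch availability checks (not the exact code from llm.py):

    import torch

    def pick_device(choice: str = "auto") -> str:
        # "auto" prefers CUDA, then Apple MPS, then falls back to CPU.
        if choice != "auto":
            return choice
        if torch.cuda.is_available():
            return "cuda"
        if torch.backends.mps.is_available():
            return "mps"
        return "cpu"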

TorAllex commented 4 months ago

Screenshot 2024-06-28 at 22:53:28

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
!!! Exception during processing!!! No GPU found. A GPU is needed for quantization.
Traceback (most recent call last):
  File "/Users/alex/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/alex/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/alex/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/alex/ComfyUI/custom_nodes/comfyui_LLM_party/llm.py", line 937, in chatbot
    self.model = AutoModelForCausalLM.from_pretrained(
  File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3279, in from_pretrained
    hf_quantizer.validate_environment(
  File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 62, in validate_environment
    raise RuntimeError("No GPU found. A GPU is needed for quantization.")
RuntimeError: No GPU found. A GPU is needed for quantization.

Prompt executed in 0.39 seconds
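
For context, the failing check lives in transformers' bitsandbytes 4-bit quantizer, which refuses to load quantized weights without a CUDA device. In spirit it amounts to the following (a paraphrase; the exact code varies by version):

    import torch

    # Loading a bitsandbytes-quantized checkpoint is rejected outright when
    # no CUDA GPU is visible, which is why MPS/CPU-only Macs hit this error.
    if not torch.cuda.is_available():
        raise RuntimeError("No GPU found. A GPU is needed for quantization.")
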
heshengtao commented 4 months ago

Try this: transformers>=4.41.1, bitsandbytes==0.43.1, accelerate==0.30.1.

TorAllex commented 4 months ago

I have Python 3.10.14 installed, and bitsandbytes 0.42.0 is the latest available for me. transformers and accelerate are OK. Should I upgrade Python?

heshengtao commented 4 months ago

Please match the versions of the three libraries I gave; there is a high probability that doing so will solve the problem. bitsandbytes 0.42.0 is not quite right; please adjust to bitsandbytes==0.43.1 and accelerate==0.30.1.
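
A quick, standard-library-only way to check the installed versions against the ones suggested above:

    import importlib.metadata as md

    # Compare against: transformers>=4.41.1, bitsandbytes==0.43.1, accelerate==0.30.1
    for pkg in ("transformers", "bitsandbytes", "accelerate"):
        try:
            print(pkg, md.version(pkg))
        except md.PackageNotFoundError:
            print(pkg, "not installed")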

TorAllex commented 4 months ago
alex@iMac ComfyUI % pip show bitsandbytes         
Name: bitsandbytes
Version: 0.42.0
-------------------------------------------------------
alex@iMac ComfyUI % pip show accelerate  
Name: accelerate
Version: 0.30.1
-------------------------------------------------------
alex@iMac ComfyUI % pip show transformers
Name: transformers
Version: 4.41.1
-------------------------------------------------------
alex@iMac ComfyUI % pip install bitsandbytes==0.43.1
ERROR: Could not find a version that satisfies the requirement bitsandbytes==0.43.1 (from versions: 0.31.8, 0.32.0, 0.32.1, 0.32.2, 0.32.3, 0.33.0, 0.33.1, 0.34.0, 0.35.0, 0.35.1, 0.35.2, 0.35.3, 0.35.4, 0.36.0, 0.36.0.post1, 0.36.0.post2, 0.37.0, 0.37.1, 0.37.2, 0.38.0, 0.38.0.post1, 0.38.0.post2, 0.38.1, 0.39.0, 0.39.1, 0.40.0, 0.40.0.post1, 0.40.0.post2, 0.40.0.post3, 0.40.0.post4, 0.40.1, 0.40.1.post1, 0.40.2, 0.41.0, 0.41.1, 0.41.2, 0.41.2.post1, 0.41.2.post2, 0.41.3, 0.41.3.post1, 0.41.3.post2, 0.42.0)
ERROR: No matching distribution found for bitsandbytes==0.43.1
heshengtao commented 4 months ago

You could try updating Python, but unfortunately bitsandbytes requires CUDA support. I doubt you can use this int4 model, because bitsandbytes is the foundation for running this quantized model.

TorAllex commented 4 months ago

Maybe it can work with a ROCm (Vulkan) device?

TorAllex commented 4 months ago

I started playing with Python versions and now I've broken everything. ERROR: llama_cpp_python-0.2.79-AVX2-macosx_13_0_x86_64.whl is not a valid wheel filename. In install.py the code does response = get("https://api.github.com/repos/abetlen/llama-cpp-python/releases/latest"), but there are no files with AVX2 or AVX in that release. I've installed it manually with pip install --upgrade --no-cache-dir llama-cpp-python, but it is ignored by install.py.

heshengtao commented 4 months ago

In my code, if you have correctly installed llama-cpp-python or llama_cpp, the routine that automatically installs llama-cpp-python is skipped. As for "ERROR: llama_cpp_python-0.2.79-AVX2-macosx_13_0_x86_64.whl is not a valid wheel filename": the installer failed to recognize your MPS device. Please check whether you really have llama-cpp-python in your environment.

    imported = package_is_installed("llama-cpp-python") or package_is_installed("llama_cpp")
    if imported:
        # If it is already installed, do nothing
        pass

You can see from this part of my code that if llama_cpp_python is already in the environment, the installation routine is not executed. As far as I know, bitsandbytes is a library that can only run on CUDA, so no matter how you configure the environment, you won't be able to use a model quantized by bitsandbytes.
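
For completeness, a plausible implementation of the package_is_installed helper used above (hypothetical; the real helper in install.py may differ):

    import importlib.metadata

    def package_is_installed(name: str) -> bool:
        # Treat a package as installed if its metadata can be resolved.
        try:
            importlib.metadata.version(name)
            return True
        except importlib.metadata.PackageNotFoundError:
            return False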

TorAllex commented 4 months ago

Ok, we just need to wait for support https://github.com/TimDettmers/bitsandbytes/issues/252#issuecomment-2012563160