TorAllex opened this issue 4 months ago
When you use Omost, this error is reported if the output program is truncated. You can increase the max length so the LLM output is not cut off. If you use some other model instead of an Omost-specific one, it is very likely that it will not be able to output the code in the expected format, and this error will also be reported. If your problem is still not solved, please attach a full screenshot of your workflow.
Yes, I am using Omost. I just loaded your "start_with_OMOST.json" workflow, clicked "Queue Prompt", and got the error. Of course, this could be a hardware problem, since I use a Mac (Intel) with an AMD graphics card. Can you please explain where the "increase the maximum length" option is? I'm a beginner and I'm not sure where to find it.
The LLM_local node has a max length parameter (2048 in the figure). Please check the show text node connected to the LLM_local node: if the output text is not the complete code, the reply is being truncated. The complete code will start with ```python and end with ```.
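Roughly speaking, the node can only use the reply if it contains a complete fenced code block. A sketch of such a check looks like the following (extract_omost_code is a hypothetical name, not the node's actual function):

```python
import re

def extract_omost_code(llm_output: str) -> str | None:
    # Hypothetical sketch: the reply is only usable when it contains a
    # complete fenced block that starts with "```python" and ends with "```".
    # A truncated reply has no closing fence, so nothing matches and the
    # Omost program cannot be parsed, which surfaces as this error.
    match = re.search(r"```python\s*(.*?)```", llm_output, re.DOTALL)
    return match.group(1) if match else None
```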
I tried raising the maximum length up to 32768, but it didn't fix the error. The text node message is: "PreTrainedTokenizerFast" object has no attribute "apply_chat_template".
path_in_launcher_configuration\python_embeded\python.exe -m pip install --upgrade transformers
Here, path_in_launcher_configuration\python_embeded\python.exe is the path of the Python interpreter that ComfyUI uses.
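If you are not on the Windows portable build (for example on macOS), the same idea applies: run the upgrade with whichever interpreter actually launches ComfyUI. A quick, generic check (not part of the node code) to confirm which interpreter and transformers version that is:

```python
# Run this from the same environment that launches ComfyUI to confirm
# which interpreter and transformers version the workflow actually uses.
import sys
import transformers

print(sys.executable)
print(transformers.__version__)
```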
I'll try to upgrade transformers; I need to see how to do it on macOS.
OK, now the error looks different: "gpu not found". Does the MPS device not work with it?
If you choose auto, it will automatically look for cuda. If there is no cuda, it will use mps. If that is not found either, it falls back to cpu. As for the "gpu not found" error, that message does not come from my code. Maybe the transformers code or the model itself requires cuda; some int4 quantized models do have poor compatibility with MPS devices. I'm afraid there is not much I can do about this problem. AMD graphics cards are really not well suited to AI workloads.
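The selection logic is roughly equivalent to this sketch (pick_device is a hypothetical name, not the node's actual function):

```python
import torch

def pick_device(choice: str = "auto") -> str:
    # Hypothetical sketch of the "auto" behaviour described above:
    # prefer CUDA, then Apple MPS, and fall back to CPU otherwise.
    if choice != "auto":
        return choice
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"
```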
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
!!! Exception during processing!!! No GPU found. A GPU is needed for quantization.
Traceback (most recent call last):
File "/Users/alex/ComfyUI/execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "/Users/alex/ComfyUI/execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "/Users/alex/ComfyUI/execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "/Users/alex/ComfyUI/custom_nodes/comfyui_LLM_party/llm.py", line 937, in chatbot
self.model = AutoModelForCausalLM.from_pretrained(
File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
return model_class.from_pretrained(
File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3279, in from_pretrained
hf_quantizer.validate_environment(
File "/Users/alex/miniconda3/envs/fooocus/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 62, in validate_environment
raise RuntimeError("No GPU found. A GPU is needed for quantization.")
RuntimeError: No GPU found. A GPU is needed for quantization.
Prompt executed in 0.39 seconds
Try this: transformers>=4.41.1, bitsandbytes==0.43.1, accelerate==0.30.1
I have Python 3.10.14 installed, and bitsandbytes 0.42.0 is the latest available for me. transformers and accelerate are OK. Should I upgrade Python?
Please match the versions of the three libraries I gave; there is a high probability that doing so will solve the problem. bitsandbytes 0.42.0 is not quite right, please adjust to bitsandbytes==0.43.1 and accelerate==0.30.1.
alex@iMac ComfyUI % pip show bitsandbytes
Name: bitsandbytes
Version: 0.42.0
-------------------------------------------------------
alex@iMac ComfyUI % pip show accelerate
Name: accelerate
Version: 0.30.1
-------------------------------------------------------
alex@iMac ComfyUI % pip show transformers
Name: transformers
Version: 4.41.1
-------------------------------------------------------
alex@iMac ComfyUI % pip install bitsandbytes==0.43.1
ERROR: Could not find a version that satisfies the requirement bitsandbytes==0.43.1 (from versions: 0.31.8, 0.32.0, 0.32.1, 0.32.2, 0.32.3, 0.33.0, 0.33.1, 0.34.0, 0.35.0, 0.35.1, 0.35.2, 0.35.3, 0.35.4, 0.36.0, 0.36.0.post1, 0.36.0.post2, 0.37.0, 0.37.1, 0.37.2, 0.38.0, 0.38.0.post1, 0.38.0.post2, 0.38.1, 0.39.0, 0.39.1, 0.40.0, 0.40.0.post1, 0.40.0.post2, 0.40.0.post3, 0.40.0.post4, 0.40.1, 0.40.1.post1, 0.40.2, 0.41.0, 0.41.1, 0.41.2, 0.41.2.post1, 0.41.2.post2, 0.41.3, 0.41.3.post1, 0.41.3.post2, 0.42.0)
ERROR: No matching distribution found for bitsandbytes==0.43.1
You could try updating Python, but unfortunately bitsandbytes requires CUDA support. I doubt you can use this int4 model, because bitsandbytes is the foundation for loading this quantized model.
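For reference, the failing path can be reproduced in isolation with a sketch like this (the model path is a placeholder, not the project's actual code): passing a 4-bit BitsAndBytesConfig makes transformers validate that a CUDA GPU is present before loading.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Placeholder model path; any checkpoint loaded with a 4-bit
# BitsAndBytesConfig triggers the same environment check.
quant_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/omost-int4-model",  # placeholder, not a real repo id
    quantization_config=quant_config,
    device_map="auto",
)
# On a machine without CUDA this raises:
# RuntimeError: No GPU found. A GPU is needed for quantization.
```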
Maybe it can work with a ROCm (Vulkan) device?
I started playing with python versions and now I've broken everything.
ERROR: llama_cpp_python-0.2.79-AVX2-macosx_13_0_x86_64.whl is not a valid wheel filename.
In install.py the code calls
response = get("https://api.github.com/repos/abetlen/llama-cpp-python/releases/latest")
but there are no files with AVX2 or AVX in that release.
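For reference, the wheels attached to that release can be listed with a quick sketch that reuses the same get call:

```python
from requests import get

# List the prebuilt wheels attached to the latest llama-cpp-python release;
# on this release there were no AVX/AVX2 macOS wheels to download.
release = get(
    "https://api.github.com/repos/abetlen/llama-cpp-python/releases/latest"
).json()
asset_names = [asset["name"] for asset in release.get("assets", [])]
print([name for name in asset_names if "macosx" in name])
```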
I've installed it manually with pip install --upgrade --no-cache-dir llama-cpp-python, but it is ignored by install.py.
In my code, if llama-cpp-python or llama_cpp is correctly installed, the program that automatically installs llama-cpp-python is skipped. As for "ERROR: llama_cpp_python-0.2.79-AVX2-macosx_13_0_x86_64.whl is not a valid wheel filename.", the installer failed to recognize your mps device. Please check whether you really have llama-cpp-python in your environment.
imported = package_is_installed("llama-cpp-python") or package_is_installed("llama_cpp")
if imported:
    # If it is already installed, do nothing
    pass
You can see from this part of my code that if llama_cpp_python is already in the environment, the installation program is not executed. As far as I know, bitsandbytes is a library that can only run on CUDA, so no matter how you configure the environment, you won't be able to use a model quantized with bitsandbytes.
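For context, package_is_installed can be thought of as a check like the following sketch (not necessarily the exact implementation in install.py):

```python
import importlib.metadata

def package_is_installed(name: str) -> bool:
    # Returns True if a distribution with this name exists in the current
    # environment, which is why a manual pip install of llama-cpp-python
    # makes the automatic installer skip its download step.
    try:
        importlib.metadata.version(name)
        return True
    except importlib.metadata.PackageNotFoundError:
        return False
```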
Ok, we just need to wait for support https://github.com/TimDettmers/bitsandbytes/issues/252#issuecomment-2012563160