Closed Hassaan68 closed 5 months ago
Only multimodal LLMs accept visual inputs; Llama3 does not. You can use LLaVA instead.
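A quick way to see the difference (a minimal sketch; the model IDs are just examples of a text-only vs. a multimodal checkpoint):

```python
from transformers import AutoProcessor, AutoTokenizer

# A text-only model such as Llama3 ships only a tokenizer, so there is no
# image_processor attribute anywhere to fetch.
tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(hasattr(tok, "image_processor"))   # False

# A multimodal model such as LLaVA ships a composite processor that bundles
# a tokenizer together with an image processor.
proc = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")
print(hasattr(proc, "image_processor"))  # True
```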
I met the same error. I ran examples/train_lora/qwen2vl_lora_sft.yaml and it shows this error:
```
Traceback (most recent call last):
  File "/home/sevnce/anaconda3/envs/llama_factory/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/sevnce/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 678, in _write_generator_to_queue
    for i, result in enumerate(func(**kwargs)):
  File "/home/sevnce/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3558, in _map_single
    batch = apply_function_on_filtered_inputs(
  File "/home/sevnce/anaconda3/envs/llama_factory/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3427, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/sevnce/大模型/LLaMA-Factory-main/src/llamafactory/data/processors/supervised.py", line 104, in preprocess_supervised_dataset
    prompt = template.mm_plugin.process_messages(examples["prompt"][i], examples["images"][i], processor)
  File "/sevnce/大模型/LLaMA-Factory-main/src/llamafactory/data/mm_plugin.py", line 231, in process_messages
    image_processor: "BaseImageProcessor" = getattr(processor, "image_processor")
AttributeError: 'Qwen2TokenizerFast' object has no attribute 'image_processor'
```
Looking at the line `image_processor: "BaseImageProcessor" = getattr(processor, "image_processor")`, the processor object has no "image_processor" attribute.
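For what it's worth, guarding that lookup would turn the crash into an actionable message (a sketch of mine, not the actual mm_plugin.py code):

```python
# Hypothetical guard around the failing line: getattr with a default lets us
# detect a plain tokenizer and explain the fix instead of raising a bare
# AttributeError.
def get_image_processor(processor):
    image_processor = getattr(processor, "image_processor", None)
    if image_processor is None:
        raise ValueError(
            f"{type(processor).__name__} has no 'image_processor'. "
            "Load the checkpoint through AutoProcessor (a multimodal model) "
            "instead of a plain tokenizer."
        )
    return image_processor
```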
Hi @827346462. You can try setting visual_inputs to True if you get this error while using a model that supports visual inputs.
It is already set to true in qwen2vl_lora_sft.yaml.
I think this is a bug. I printed the processor type at `image_processor: "BaseImageProcessor" = getattr(processor, "image_processor")`: the processor is `transformers.models.qwen2.tokenization_qwen2_fast.Qwen2TokenizerFast`, but this line expects an image processor. Shouldn't it use the `Qwen2VLImageProcessor` from `transformers/models/qwen2_vl/image_processing_qwen2_vl.py` instead?
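With a new enough transformers, loading the checkpoint through AutoProcessor does return exactly that class (a quick check; assumes the public Qwen/Qwen2-VL-7B-Instruct checkpoint):

```python
from transformers import AutoProcessor

# On transformers >= 4.45 this resolves to a Qwen2VLProcessor whose
# image_processor attribute is the Qwen2VLImageProcessor the plugin expects;
# older versions fall back to the plain Qwen2TokenizerFast from the traceback.
proc = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
print(type(proc).__name__)                   # Qwen2VLProcessor
print(type(proc.image_processor).__name__)   # Qwen2VLImageProcessor
```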
Installing the latest transformers==4.45.dev0 solves this problem.
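After upgrading, a quick sanity check that the environment actually picked up the new version (a minimal sketch):

```python
import transformers
from packaging import version  # packaging ships as a transformers dependency

# Qwen2-VL support (Qwen2VLProcessor / Qwen2VLImageProcessor) landed in
# transformers 4.45, so anything older reproduces the AttributeError above.
assert version.parse(transformers.__version__) >= version.parse("4.45.0.dev0")
print("transformers", transformers.__version__, "OK")
```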
Reminder
System Info
I am trying to train Llama3-8B with an image dataset and followed all the steps, but I get the error below when I start training. I am using the recommended version of the transformers library.
The above exception was the direct cause of the following exception:
```
Traceback (most recent call last):
  File "/opt/conda/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/home/sagemaker-user/LLaMA-Factory/src/llamafactory/cli.py", line 96, in main
    run_exp()
  File "/home/sagemaker-user/LLaMA-Factory/src/llamafactory/train/tuner.py", line 33, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/home/sagemaker-user/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 35, in run_sft
    dataset = get_dataset(model_args, data_args, training_args, stage="sft", **tokenizer_module)
  File "/home/sagemaker-user/LLaMA-Factory/src/llamafactory/data/loader.py", line 176, in get_dataset
    dataset = dataset.map(preprocess_func, batched=True, remove_columns=column_names, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 602, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 567, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3248, in map
    for rank, done, content in iflatmap_unordered(
  File "/opt/conda/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 718, in iflatmap_unordered
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/opt/conda/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 718, in <listcomp>
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/opt/conda/lib/python3.10/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value
  File "/opt/conda/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 678, in _write_generator_to_queue
    for i, result in enumerate(func(**kwargs)):
  File "/opt/conda/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3547, in _map_single
    batch = apply_function_on_filtered_inputs(
  File "/opt/conda/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3416, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/home/sagemaker-user/LLaMA-Factory/src/llamafactory/data/processors/supervised.py", line 78, in preprocess_supervised_dataset
    model_inputs["pixel_values"].append(get_pixel_values(examples["images"][i], processor))
  File "/home/sagemaker-user/LLaMA-Factory/src/llamafactory/data/processors/mm_utils.py", line 19, in get_pixel_values
    image_processor: "BaseImageProcessor" = getattr(processor, "image_processor")
AttributeError: 'PreTrainedTokenizerFast' object has no attribute 'image_processor'
```
Reproduction
Train Llama3-8B with an image dataset. Tick "visual inputs" under Advanced configuration before training and select flash-attn as the booster.
Expected behavior
The model should train successfully.
Others
No response