dvlab-research / LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Apache License 2.0

train project #35

Open xiaotuoluo11 opened 1 year ago

xiaotuoluo11 commented 1 year ago

Command: deepspeed --master_port=24999 train_ds.py --version='./llava_path' --dataset_dir='./dataset' --vision_pretrained='sam_weights' --dataset='ade20k' --sample_rates="9,3,3,1" --exp_name="lisa-7b" --load_in_8bit

xiaotuoluo11 commented 1 year ago

Command: deepspeed --master_port=24999 train_ds.py --version='./llava_path' --dataset_dir='./dataset' --vision_pretrained='sam_weights' --dataset='ade20k' --sample_rates="9,3,3,1" --exp_name="lisa-7b" --load_in_8bit

These are the files under llava_path:

config.json, generation_config.json, mm_projector.bin, pytorch_model-00001-of-00003.bin, pytorch_model-00002-of-00003.bin, pytorch_model-00003-of-00003.bin, pytorch_model.bin.index.json, special_tokens_map.json, tokenizer_config.json, tokenizer.model

Running the command above with these weight files raises the following error:

You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.
You are using a model of type llama to instantiate a model of type llava. This is not supported for all configurations of models and can yield errors.
Traceback (most recent call last):
  File "/home/user/project/detgpt/LISA/train_ds.py", line 461, in <module>
    main(sys.argv[1:])
  File "/home/user/project/detgpt/LISA/train_ds.py", line 122, in main
    model = LISA(
  File "/home/user/project/detgpt/LISA/model/LISA.py", line 104, in __init__
    self.lm = LlavaLlamaForCausalLM.from_pretrained(
  File "/home/user/miniconda3/envs/python_3.9/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2474, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./llava_path.

What is going wrong here, and what other files are needed?
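Note that pytorch_model.bin.index.json does appear in the listing above, and in reasonably recent transformers versions from_pretrained loads a sharded checkpoint through exactly that index file, so it is worth confirming which of the candidate files the process can actually see. A minimal diagnostic sketch (not part of the repo; the candidate names are taken from the error message above):

import os

# Check which of the checkpoint files from_pretrained looks for are present.
# "./llava_path" is relative, so run this from the same working directory
# that train_ds.py is launched from.
path = "./llava_path"
candidates = [
    "config.json",
    "pytorch_model.bin",             # single-file PyTorch checkpoint
    "pytorch_model.bin.index.json",  # index of a sharded PyTorch checkpoint
    "tf_model.h5",                   # TensorFlow checkpoint
    "model.ckpt.index",              # TensorFlow 1.x checkpoint index
    "flax_model.msgpack",            # Flax checkpoint
]
for name in candidates:
    status = "found" if os.path.isfile(os.path.join(path, name)) else "missing"
    print(f"{name}: {status}")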

X-Lai commented 1 year ago

It looks like LLaVA was not downloaded or converted successfully; I suggest checking that first. Also, training cannot be run with the --load_in_8bit flag.
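For reference, the same invocation with that flag removed (all paths as in the report above):

deepspeed --master_port=24999 train_ds.py --version='./llava_path' --dataset_dir='./dataset' --vision_pretrained='sam_weights' --dataset='ade20k' --sample_rates="9,3,3,1" --exp_name="lisa-7b"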

Also, we have refactored the code into a new version. You are welcome to use it!

xiaotuoluo11 commented 1 year ago

The llava weights were downloaded exactly as described in the instructions, and the files are listed above. Could you take a look at which files are missing?

AmrinKareem commented 11 months ago

Hi @X-Lai @xiaotuoluo11, did you manage to solve this issue? I have been trying to train the model on the cloud, and I get this error:

Traceback (most recent call last):
  File "/opt/ml/code/train_ds.py", line 593, in <module>
    main(sys.argv[1:])
  File "/opt/ml/code/train_ds.py", line 124, in main
    tokenizer = transformers.AutoTokenizer.from_pretrained(
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 652, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 496, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/tmp/LISA_ft/downloads'. Use repo_type argument if needed.

My train script is this:

deepspeed --master_port=24999 /opt/ml/code/train_ds.py \
  --version="/tmp/LISA_ft/downloads" \
  --dataset_dir='/tmp/LISA_ft/dataset' \
  --vision_pretrained="/opt/ml/code/sam_vit_h_4b8939.pth" \
  --dataset="sem_seg||refer_seg||vqa||reason_seg" \
  --sample_rates="9,3,3,1" \
  --exp_name="lisa-13b" \
  --batch_size="2" \

I tried setting TRANSFORMERS_CACHE to the /tmp/LISA_ft directory as well. However, that gave a different error: OSError: /tmp/LISA_ft/downloads does not appear to have a file named config.json. Checkout 'https://huggingface.co//tmp/LISA_ft/downloads/None' for available files.
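The two errors point at two different states of the path: when /tmp/LISA_ft/downloads does not exist (or is not visible to the process), transformers falls back to treating the string as a Hugging Face Hub repo id, and an absolute path is not a valid repo id, hence the HFValidationError; when the directory exists but contains no config.json, the second OSError is raised instead. A minimal pre-flight check along those lines (a sketch, not part of the repo; the path is taken from the train script above):

import os

# from_pretrained only resolves a local directory if it actually exists;
# otherwise the string is forwarded to huggingface_hub as a repo id.
path = "/tmp/LISA_ft/downloads"
print("directory exists:", os.path.isdir(path))
if os.path.isdir(path):
    print("contents:", sorted(os.listdir(path)))
    print("config.json present:", os.path.isfile(os.path.join(path, "config.json")))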

Can you please help identify the issue?