WangZY1111 commented 6 months ago

Traceback (most recent call last):
  File "finetune_visualglm.py", line 194, in <module>
    training_main(args, model_cls=model, forward_step_function=forward_step, create_dataset_function=create_dataset_function, collate_fn=data_collator)
  File "/home/wangzy/anaconda3/envs/LVM/lib/python3.8/site-packages/sat/training/deepspeed_training.py", line 67, in training_main
    train_data, val_data, test_data = make_loaders(args, hooks['create_dataset_function'], collate_fn=collate_fn)
  File "/home/wangzy/anaconda3/envs/LVM/lib/python3.8/site-packages/sat/data_utils/configure_data.py", line 200, in make_loaders
    train = make_dataset(**data_set_args, args=args, dataset_weights=args.train_data_weights, is_train_data=True)
  File "/home/wangzy/anaconda3/envs/LVM/lib/python3.8/site-packages/sat/data_utils/configure_data.py", line 126, in make_dataset_full
    d = create_dataset_function(p, args)
  File "finetune_visualglm.py", line 160, in create_dataset_function
    dataset = FewShotDataset(path, image_processor, tokenizer, args)
  File "finetune_visualglm.py", line 118, in __init__
    input0 = tokenizer.encode("", add_special_tokens=False)
AttributeError: 'FakeTokenizer' object has no attribute 'encode'

How can I solve this?

Tom98714 commented 6 months ago

Hi, has this been solved? I am running into the same problem.

xiongxiaochu commented 5 months ago

+1, I am hitting this too. Has it been solved?

Tom98714 commented 5 months ago

I found the cause, but I have not fully fixed it yet.

Tom98714 commented 5 months ago

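Please check that every model path and tokenizer path is a local path. The default is THUDM/Visualglm-6b; if you are running locally, you need to change the paths to the corresponding local locations. If your machine can reach Hugging Face directly, this problem should not occur.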
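As a quick sanity check before launching finetuning, something like the following can confirm that each configured path really is an existing local directory. This is only a minimal sketch: `find_missing_dirs` is a hypothetical helper, not part of sat or VisualGLM, and the paths in the usage comment are placeholders.

```python
from pathlib import Path


def find_missing_dirs(paths):
    """Return the entries of `paths` that are not existing local directories."""
    return [p for p in paths if not Path(p).expanduser().is_dir()]


# Example usage (placeholder paths, replace with your own):
#   missing = find_missing_dirs(["/data/visualglm-6b", "/data/chatglm-6b"])
#   if missing:
#       print("Not found locally:", missing)
```

A hub-style name that was never replaced with a local directory would normally show up in the returned list, which is the situation this thread is about.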

xiongxiaochu commented 5 months ago

I changed the visual-glm path to a local path, but I could not find where to change the chatglm one.

Tom98714 commented 5 months ago

> All the paths have been changed to local paths; our machines cannot access Hugging Face.

You need to download all the required models and tokenizers to your local machine beforehand, and then update the paths to point to them. Here is the download link for the SAT model: https://www.wisemodel.cn/models/ZhipuAI/VisualGLM-6B-SAT/file

Tom98714 commented 5 months ago

Here is a mirror site for Hugging Face; please read it carefully: https://hf-mirror.com/

xiongxiaochu commented 5 months ago

I downloaded it through cli-demo. Is it correct if it looks like this?

Tom98714 commented 5 months ago

Yes.

xiongxiaochu commented 5 months ago

Does the tokenizer also need to be placed in the visualglm-6b folder?

Tom98714 commented 5 months ago

Yes, everything is needed. All of it has to be prepared in advance, and the paths in the relevant config files and code have to be updated.

xiongxiaochu commented 5 months ago

Got it. Changing tokenizer_type in model_config.json to the local chatglm path was enough~
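The edit described above can also be scripted. A minimal sketch, assuming model_config.json is a plain JSON file with a top-level tokenizer_type key (as the comment implies); the helper name `set_tokenizer_path` and the paths in the usage comment are hypothetical.

```python
import json
from pathlib import Path


def set_tokenizer_path(config_path, local_tokenizer_dir):
    """Point tokenizer_type in a model_config.json at a local directory.

    Returns the previous value so the change can be reverted if needed.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    previous = config.get("tokenizer_type")
    config["tokenizer_type"] = str(local_tokenizer_dir)
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return previous


# Example usage (placeholder paths, replace with your own):
#   set_tokenizer_path("/data/visualglm-6b/model_config.json", "/data/chatglm-6b")
```

All other keys in the file are left as they were; only tokenizer_type is overwritten.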

BUZZ0328 commented 1 month ago

Should the downloaded model and tokenizer be laid out as shown in the picture, and are any files still missing? I changed tokenizer_type in model_config.json to /home/root1/data/glm/VisualGLM-6B/THUDM/visualglm-6b (the path shown in my picture). Should name_or_path in tokenizer_config.json be changed in the same way?

THUDM / VisualGLM-6B

AttributeError: 'FakeTokenizer' object has no attribute 'encode' #335
