janleemark commented 1 year ago

I ran the following command:

    python finetune.py --datasets_path /Fengshenbang-LM/fengshen/examples/finetune_taiyi_stable_diffusion/demo_dataset --datasets_type txt --model_path models/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1 --train_batchsize 1 --max_epochs 3 --accelerator gpu

The model has already been downloaded locally. The run fails with this error:

    Traceback (most recent call last)

    /home/lijian/data/workspace/nlp_task/multimode_generation/Fengshenbang-LM/fengshen/examples/finetune_taiyi_stable_diffusion/finetune.py:188 in <module>

      185                                             lr_monitor,
      186                                             checkpoint_callback])
      187
    ❱ 188     model = StableDiffusion(args)
      189     tokenizer = model.tokenizer
      190     datasets = load_data(args, global_rank=trainer.global_rank)
      191     collate_fn = Collator(args, tokenizer)

    /home/lijian/data/workspace/nlp_task/multimode_generation/Fengshenbang-LM/fengshen/examples/finetune_taiyi_stable_diffusion/finetune.py:77 in __init__

       74
       75     def __init__(self, args):
       76         super().__init__()
    ❱  77         self.tokenizer = BertTokenizer.from_pretrained(
       78             args.model_path, subfolder="tokenizer")
       79         self.text_encoder = BertModel.from_pretrained(
       80             args.model_path, subfolder="text_encoder")  # load from taiyi_finetune-v0

    /home/lijian/anaconda3/envs/ldm_env/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:1784 in from_pretrained

      1781             else:
      1782                 logger.info(f"loading file {file_path} from cache at {resolved_vocab_fil
      1783
    ❱ 1784         return cls._from_pretrained(
      1785             resolved_vocab_files,
      1786             pretrained_model_name_or_path,
      1787             init_configuration,

    /home/lijian/anaconda3/envs/ldm_env/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:1929 in _from_pretrained

      1926
      1927         # Instantiate tokenizer.
      1928         try:
    ❱ 1929             tokenizer = cls(*init_inputs, **init_kwargs)
      1930         except OSError:
      1931             raise OSError(
      1932                 "Unable to load vocabulary from file. "

    /home/lijian/anaconda3/envs/ldm_env/lib/python3.8/site-packages/transformers/models/bert/tokenization_bert.py:193 in __init__

      190             **kwargs,
      191         )
      192
    ❱ 193         if not os.path.isfile(vocab_file):
      194             raise ValueError(
      195                 f"Can't find a vocabulary file at path '{vocab_file}'. To load the vocab
      196                 "model use `tokenizer = BertTokenizer.from_pretrained(PRETRAINED_MODEL_N

    /home/lijian/anaconda3/envs/ldm_env/lib/python3.8/genericpath.py:30 in isfile

      27 def isfile(path):
      28     """Test whether a path is a regular file"""
      29     try:
    ❱ 30         st = os.stat(path)
      31     except (OSError, ValueError):
      32         return False
      33     return stat.S_ISREG(st.st_mode)

    TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
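For readers following the traceback: its last frame shows `os.stat` receiving `None`, which suggests the `vocab_file` value that reached `BertTokenizer.__init__` was `None` rather than a real path. The sketch below reproduces that `TypeError` in isolation and adds a small pre-check for a local model directory. It is only an illustration: `stat_error_for` and `check_tokenizer_dir` are hypothetical helpers, and the expected layout (`tokenizer/vocab.txt` under the model path) is an assumption based on the `subfolder="tokenizer"` argument in the traceback and the usual `BertTokenizer` vocabulary file name.

```python
import os
import tempfile


def stat_error_for(path):
    """Return the TypeError message os.stat raises for `path`, or None if it succeeds."""
    try:
        os.stat(path)
    except TypeError as exc:
        return str(exc)
    return None


def check_tokenizer_dir(model_path, subfolder="tokenizer", vocab_name="vocab.txt"):
    """Hypothetical pre-check: is <model_path>/<subfolder>/<vocab_name> a regular file?

    The layout is an assumption, not something confirmed by the thread.
    """
    vocab_file = os.path.join(model_path, subfolder, vocab_name)
    return os.path.isfile(vocab_file), vocab_file


# Passing None instead of a path reproduces the error type from the traceback.
none_message = stat_error_for(None)
print(none_message)

# Demo on a throwaway directory: first without, then with a vocabulary file.
with tempfile.TemporaryDirectory() as tmp:
    found_before, expected = check_tokenizer_dir(tmp)
    os.makedirs(os.path.join(tmp, "tokenizer"))
    with open(expected, "w", encoding="utf-8") as fh:
        fh.write("[PAD]\n")
    found_after, _ = check_tokenizer_dir(tmp)

print(found_before, found_after)
```

Running a check like this against the real `--model_path` before launching `finetune.py` may help tell a missing file apart from a wrong relative path.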

carlfu127 commented 1 year ago

Same here. Did you manage to solve it?

tangdong1994 commented 1 year ago

Probably some path is not set; I ran into this too. In my case the '--default_root_dir' argument was not set, that is, with the default args this parameter is never passed in. For example:

    message = [
        '--datasets_path', your_dataset_path,
        '--datasets_type', 'txt',
        '--model_path', your_model_path,
        '--train_batchsize', str(train_batch_size),  # argparse expects strings
        '--accelerator', 'gpu',
        '--strategy', 'deepspeed',
        '--precision', '16',
        '--max_epochs', '2',
        '--dataloader_workers', str(os.cpu_count()),
        '--default_root_dir', your_model_path,
    ]

    args = args_parser.parse_args(args=message)

With the line '--default_root_dir', your_model_path I set my own path. So I suggest checking which step fails in your case; once you have located it, fill in your own path and it should work.
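One way to act on the suggestion above is to list which path-like options are still `None` after parsing. The following is a minimal sketch, not the project's code: `build_parser` is a stand-in for the real `args_parser` in `finetune.py`, `unset_paths` is a hypothetical helper, and the `None` defaults are an assumption made for illustration.

```python
import argparse


def build_parser():
    """Stand-in parser; the real finetune.py defines many more options."""
    parser = argparse.ArgumentParser()
    for name in ("--datasets_path", "--model_path", "--default_root_dir"):
        # Assumed default: None when the flag is not given.
        parser.add_argument(name, type=str, default=None)
    parser.add_argument("--train_batchsize", type=int, default=1)
    return parser


def unset_paths(args, names=("datasets_path", "model_path", "default_root_dir")):
    """Return the path-like options that are still None after parsing."""
    return [name for name in names if getattr(args, name) is None]


parser = build_parser()
model_dir = "models/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1"

# Flags as in the original command: --default_root_dir is never passed.
original = parser.parse_args([
    "--datasets_path", "demo_dataset",
    "--model_path", model_dir,
    "--train_batchsize", "1",
])
missing_before = unset_paths(original)

# Same flags plus an explicit --default_root_dir, as suggested in the comment.
fixed = parser.parse_args([
    "--datasets_path", "demo_dataset",
    "--model_path", model_dir,
    "--train_batchsize", "1",
    "--default_root_dir", model_dir,
])
missing_after = unset_paths(fixed)

print(missing_before, missing_after)
```

Printing the unset names right after `parse_args` narrows down which argument to fill in before any model or tokenizer loading starts.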

sunxiaoyu12 commented 1 year ago

Do you have a config.json file?

IDEA-CCNL / Fengshenbang-LM

After downloading the IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1 model locally and running finetune.py from the command line, an error occurs: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType #261
