PaddlePaddle / PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please be patient]
https://www.paddlepaddle.org.cn/hub
Apache License 2.0
12.67k stars, 2.08k forks

Problem with PaddleHub / demo / sequence_labeling / train.py (tensor shape mismatch) #1474

Open mock1ddd opened 3 years ago

mock1ddd commented 3 years ago

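Hello, I am using PaddleHub / demo / sequence_labeling / train.py with a custom dataset that has 17 labels. For the sequence labeling task there does not seem to be anywhere to specify the number of classes, so I keep getting an error saying the tensor shape is wrong.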

    # Select the model and get its tokenizer.
    # `args` and `MyDataset` are assumed to be defined elsewhere in the script.
    import paddle
    import paddlehub as hub

    model = hub.Module(name='ernie_tiny', task='token-cls', label_map=MyDataset.label_map)
    tokenizer = model.get_tokenizer()

    # Instantiate the training and test datasets
    train_dataset = MyDataset(tokenizer)
    test_dataset = MyDataset(tokenizer, mode='test')

    optimizer = paddle.optimizer.AdamW(learning_rate=args.learning_rate, parameters=model.parameters())
    trainer = hub.Trainer(model, optimizer, checkpoint_dir=args.checkpoint_dir, use_gpu=args.use_gpu)
    trainer.train(
        train_dataset,
        epochs=args.num_epoch,
        batch_size=args.batch_size,
        eval_dataset=test_dataset,
        save_interval=args.save_interval,
    )

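Here, MyDataset.label_map contains 17 classes. Training always fails with an error saying the tensor shape is wrong. Where should I change this? Shouldn't it be adjusted automatically?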

      File "hub_train.py", line 90, in <module>
        save_interval=args.save_interval,
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlehub/finetune/trainer.py", line 199, in train
        self.optimizer_step(self.current_epoch, batch_idx, self.optimizer, loss)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlehub/finetune/trainer.py", line 378, in optimizer_step
        self.optimizer.step()
      File "<decorator-gen-198>", line 2, in step
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 260, in __impl__
        return func(*args, **kwargs)
      File "<decorator-gen-196>", line 2, in step
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
        return wrapped_func(*args, **kwargs)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 225, in __impl__
        return func(*args, **kwargs)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 367, in step
        loss=None, startup_program=None, params_grads=params_grads)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 775, in _apply_optimize
        optimize_ops = self._create_optimization_pass(params_grads)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adamw.py", line 203, in _create_optimization_pass
        AdamW, self)._create_optimization_pass(parameters_and_grads)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 597, in _create_optimization_pass
        [p[0] for p in parameters_and_grads if p[0].trainable])
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 249, in _create_accumulators
        self._add_moments_pows(p)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 216, in _add_moments_pows
        self._add_accumulator(self._moment1_acc_str, p, dtype=acc_dtype)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 516, in _add_accumulator
        var.set_value(self._accumulators_holder[var_name])
      File "<decorator-gen-113>", line 2, in set_value
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
        return wrapped_func(*args, **kwargs)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 225, in __impl__
        return func(*args, **kwargs)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/varbase_patch_methods.py", line 125, in set_value
        self.name, self_tensor_np.shape, value_np.shape)
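The last frames of the traceback show `_add_accumulator` copying a stored value into a newly created optimizer accumulator through `set_value`, which then reports two shapes. The following is a minimal pure-Python sketch of that kind of check, not Paddle's actual implementation. The function `set_value_checked`, the variable name, the hidden size of 8 and the 7-class stored shape are all hypothetical, chosen only to show how a 17-label head could collide with a value sized for a different label count.

```python
def set_value_checked(name, current_shape, new_shape):
    """Hypothetical stand-in for a shape-checked assignment.

    Raises ValueError when the incoming value's shape differs from the
    shape of the variable it is being copied into.
    """
    if tuple(current_shape) != tuple(new_shape):
        raise ValueError(
            f"{name}: expected shape {tuple(current_shape)}, got {tuple(new_shape)}"
        )
    return tuple(new_shape)


# Assumed shapes: a classifier weight for 17 labels vs. a stored value for 7.
head_shape = (8, 17)
stored_shape = (8, 7)

try:
    set_value_checked("classifier.weight_moment1", head_shape, stored_shape)
    mismatch = None
except ValueError as err:
    mismatch = str(err)
```

Whether the stored value in this issue came from an earlier run with a different label count cannot be determined from the traceback alone; the sketch only illustrates the shape comparison itself.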

KPatr1ck commented 3 years ago

Hello, regarding the number of classes for a custom dataset, please refer to how label_list and label_map are defined in the sequence labeling demo: https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.1/demo/sequence_labeling#%E4%BB%A3%E7%A0%81%E6%AD%A5%E9%AA%A4
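As a rough sketch of the kind of definition the reply points to, `label_map` can be built by enumerating `label_list`, so the number of classes follows from the length of the list. This assumes that `label_map` maps integer class indices to label strings. The 17 label names below are hypothetical placeholders (a BIO scheme over eight made-up entity types plus `O`), not the poster's actual labels.

```python
# Hypothetical entity types; replace with the tags used in your dataset.
entity_types = ["PER", "ORG", "LOC", "TIME", "MONEY", "PRODUCT", "EVENT", "TITLE"]

# BIO tagging: one B- and one I- tag per entity type, plus the outside tag "O".
label_list = [f"{prefix}-{etype}" for etype in entity_types for prefix in ("B", "I")]
label_list.append("O")

# Map each integer class index to its label string.
label_map = {idx: label for idx, label in enumerate(label_list)}

# The class count is derived from the mapping rather than set separately.
num_classes = len(label_map)
```

A dictionary like this would then be passed as `label_map=` to `hub.Module(name='ernie_tiny', task='token-cls', ...)`, as in the original post.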