Closed: toutoutout closed this issue 6 months ago
Hi,
If you are training from scratch under the Joint paradigm, the configuration file used for training is joint_config.py. Please check whether your settings (tagger_classes, line 40) are consistent with those in the repository.
-->MAX_GENERATE is set to 6 in run_stg_joint.sh
-->model_args.add_arg('max_generate', int, 6, 'Number of Max Token Generation') in evaluate_joint_config.py
-->model_args.add_arg('max_generate', int, 6, 'Number of Max Token Generation') in joint_config.py
-->model_args.add_arg('max_generate', int, 6, 'Number of Max Token Generation') in generator_config.py
-->model_args.add_arg('max_generate', int, 6, 'Number of Max Token Generation') in evaluate_indep_config.py
-->model_args.add_arg('max_generate', int, 6, 'Number of Max Token Generation') in tagger_config.py
-->self.max_token = args.max_generate + 1 (line 16 in tagger_model)

(If instead I set self.max_token = args.max_generate, the error is:
size mismatch for tagger._hidden2tag.linear.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([6, 768]).
size mismatch for tagger._hidden2tag.linear.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([6]).
size mismatch for tagger._hidden2t.linear.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([8, 768]).
size mismatch for tagger._hidden2t.linear.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([8]).)
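For reference, the `+ 1` on line 16 is what makes the tagger head line up with the checkpoint: the checkpoint was saved with 7 output classes (max_generate + 1). A plain-Python sketch (no torch; layer names and shapes are copied from the error message above) of the shape check that load_state_dict performs:

```python
# Minimal sketch of load_state_dict-style shape checking.
# Shapes and parameter names are taken from the error quoted above.

max_generate = 6                       # value used across the configs
max_token = max_generate + 1           # tagger_model, line 16

checkpoint = {                         # shapes saved at training time
    "tagger._hidden2tag.linear.weight": (7, 768),
    "tagger._hidden2tag.linear.bias": (7,),
}
current = {                            # shapes rebuilt at load time
    "tagger._hidden2tag.linear.weight": (max_token, 768),
    "tagger._hidden2tag.linear.bias": (max_token,),
}

mismatches = {k: (checkpoint[k], current[k])
              for k in checkpoint if checkpoint[k] != current[k]}
print(mismatches or "all shapes match")   # -> all shapes match
```

With `max_token = max_generate` (no `+ 1`) the rebuilt sizes become 6 and every `_hidden2tag` entry mismatches, which is exactly the first pair of errors above.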
For the error message:
size mismatch for tagger._hidden2tag.linear.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([6]).
You need to check the information in the conversation above (the joint_config.py settings, tagger_classes at line 40).
The tagger_classes parameter you trained with also seems to be set to 7 during training. You may need to check or retrain it.
It's a bit odd; if max_generate is set to 6 in evaluate_joint_config.py, the current model's shape should not be 8, it should be 7. You may need to track the value of max_token in TaggerModel.
If the max_generate parameter in evaluate_joint_config.py is set to 7, the error is:
size mismatch for tagger._hidden2tag.linear.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([6, 768]).
size mismatch for tagger._hidden2tag.linear.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([6]).
size mismatch for tagger._hidden2t.linear.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([8, 768]).
size mismatch for tagger._hidden2t.linear.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([8]).
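That "checkpoint 7 vs current 8" pair is consistent with the `+ 1` on line 16 of tagger_model: evaluating with max_generate = 7 builds layers of size 8 against a checkpoint trained with max_generate = 6 (size 7). A quick sketch of the arithmetic:

```python
# With max_token = max_generate + 1 (tagger_model, line 16):
trained_max_generate = 6   # value used when the checkpoint was saved
eval_max_generate = 7      # value tried at evaluation time

checkpoint_size = trained_max_generate + 1   # size stored in checkpoint
model_size = eval_max_generate + 1           # size of the rebuilt model
print(checkpoint_size, model_size)           # -> 7 8
```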
I'm sure max_generate was set to 6 in joint_config.py (tagger_classes, line 40), since I set the same value, 6, in run_stg_joint.sh.
Yes, I need to track the value.
Try setting the parameters I just sent you in the email, and also print out the max_token of the TaggerModel for tracking.
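A hypothetical sketch of what that print could look like inside TaggerModel's __init__ (the class and attribute names follow this thread, not necessarily the actual repository code):

```python
from types import SimpleNamespace

class TaggerModel:
    """Stripped-down stand-in for the real TaggerModel."""
    def __init__(self, args):
        # line 16 in tagger_model: +1 on top of max_generate
        self.max_token = args.max_generate + 1
        # debug print for tracking the effective value
        print(f"[debug] max_generate={args.max_generate} "
              f"-> max_token={self.max_token}")

args = SimpleNamespace(max_generate=6)
model = TaggerModel(args)   # prints: [debug] max_generate=6 -> max_token=7
```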
Thanks a lot
Hello,
I trained the model with run_stg_joint.sh, but when I run demo_pipeline.py I receive an error. I paste the full output here:
[jupyter@jupyter-d134f2d8-ead8-4a47-92a0-8dcb5293de93-54b4cfd585-jh5dl STG-correction]$ python demo_pipeline.py
jieba are not installed, use default mode.
Some weights of the model checkpoint at ../pretrained-models/hflchinese-roberta-wwm-ext/ were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
Please give some hints on how to solve it. Thanks.
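One hint worth noting: the last line of that log is a standard transformers warning, not an error. BertForMaskedLM uses neither the pooler nor the next-sentence-prediction head, so those four checkpoint entries are simply skipped. A plain-Python illustration (the key names are copied from the log above; the prefix list is an illustration, not the transformers internals):

```python
# The four keys the warning reports as unused:
unused = [
    "bert.pooler.dense.bias", "bert.pooler.dense.weight",
    "cls.seq_relationship.bias", "cls.seq_relationship.weight",
]
# A masked-LM head needs neither the pooler nor the NSP head, so
# every reported key falls under one of these two prefixes:
skipped_prefixes = ("bert.pooler", "cls.seq_relationship")
assert all(k.startswith(skipped_prefixes) for k in unused)
print("warning is informational; the MLM weights themselves loaded")
```

If the run actually fails, the real error should appear further down in the output, after this warning.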