NormXU / ERNIE-Layout-Pytorch

An unofficial PyTorch implementation of ERNIE-Layout, originally released through PaddleNLP.
http://arxiv.org/abs/2210.06155
MIT License

not initialized warning for some weights #16

Closed DiddyC closed 1 year ago

DiddyC commented 1 year ago

Hi, I tried following your readme but I'm getting this error: Some weights of ErnieLayoutForTokenClassification were not initialized from the model checkpoint at /opt/omniai/work/instance1/jupyter/hugging_face_models/ernie-layout/torch-model/lockbox_torch/pytorch_model.bin and are newly initialized: ['ernie_layout.visual.backbone.batch_norm1.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.1.batch_norm2.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.2.batch_norm3.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.0.batch_norm1.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.2.batch_norm2.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.1.batch_norm3.num_batches_tracked', 'classifier.bias', 'ernie_layout.visual.backbone.resnet.layer0.1.batch_norm1.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.0.batch_norm3.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.0.batch_norm2.num_batches_tracked', 'classifier.weight', 'ernie_layout.visual.backbone.resnet.layer0.0.shortcut.1.num_batches_tracked', 'ernie_layout.visual.backbone.resnet.layer0.2.batch_norm1.num_batches_tracked'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

I used your weights to load the model, and I also tried it on my fine-tuned model. Any idea how to get around this? Thanks!

logan-markewich commented 1 year ago

@DiddyC I might be wrong, but I think the visual backbone is supposed to be newly initialized? I remember LayoutLMV2/3 gave a similar warning.

Since you fine-tuned it, it should be fine too.

DiddyC commented 1 year ago

@logan-markewich thanks! It doesn't sound right to me, though... Especially if I've already trained it using the Paddle logic, I'd want the model to be ready to go after conversion to Torch. Perhaps that's not possible, but if so, what's the point of using a pre-trained model?

logan-markewich commented 1 year ago

@DiddyC Yeah, I guess I'm not sure. Well, at least the transformer layers are trained 😅 that's the intensive part.

NormXU commented 1 year ago

@DiddyC I tried to find weights with names like `XXXX.num_batches_tracked`, but it seems these weights are missing from the Paddle model as well.

Older versions of PyTorch didn't have `num_batches_tracked`; it was only added in a later release.

According to the PyTorch code, there are two ways to calculate the running mean and variance: by exponential moving average, or by cumulative moving average using `num_batches_tracked`. `num_batches_tracked` (starting from zero) is incremented by 1 for each mini-batch, and it is used for the cumulative average when `self.momentum` is None.
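The behavior above can be checked with a minimal sketch (a standalone `BatchNorm2d`, not the ERNIE-Layout backbone):

```python
import torch
import torch.nn as nn

# With momentum=None, BatchNorm uses a cumulative moving average for
# its running stats. The num_batches_tracked buffer counts training
# mini-batches; it is incremented on every forward pass in train mode.
bn = nn.BatchNorm2d(3, momentum=None)
bn.train()

x = torch.randn(4, 3, 8, 8)
bn(x)
bn(x)

print(bn.num_batches_tracked.item())  # 2
```

So the buffer is just a counter that only matters during training with cumulative averaging; at inference time the stored running mean/variance are used directly.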

Paddle doesn't have such a `num_batches_tracked` buffer and only supports the exponential moving average, according to its docs.

So I think if you are using the model for fine-tuning on downstream tasks, it's fine; just ignore these warnings.
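One way to convince yourself the warning is harmless is to load the checkpoint with `strict=False` and inspect the missing keys. This sketch uses a tiny stand-in model rather than ErnieLayoutForTokenClassification, and simulates a converted checkpoint by dropping the counter buffer:

```python
import torch
import torch.nn as nn

# Hypothetical minimal model standing in for the visual backbone.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

# Simulate a checkpoint converted from Paddle (or saved by an old
# PyTorch): the num_batches_tracked buffer is absent.
state = model.state_dict()
state.pop("1.num_batches_tracked")

result = model.load_state_dict(state, strict=False)
print(result.missing_keys)  # ['1.num_batches_tracked']

# All real weights loaded; only the harmless counter buffer is missing.
real_missing = [k for k in result.missing_keys
                if not k.endswith("num_batches_tracked")]
print(real_missing)  # []
```

If the filtered list were non-empty (e.g. it contained `classifier.weight`), those layers genuinely start from random init, which is expected for a new task head before fine-tuning.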

DiddyC commented 1 year ago

@NormXU Thanks, it worked well.