Alibaba-NLP / ACE

[ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction

Loading pre-trained model with set vocab file #8

Closed. Jordy-VL closed this issue 3 years ago.

Jordy-VL commented 3 years ago

When loading a pre-trained model, I get stuck on loading the vocabulary file, whose location is stored as an absolute path inside the model checkpoint... How can I fix this without recreating that absolute path on my machine?

import sys
import torch

# make the ACE fork of flair importable so the pickled classes can be resolved
sys.path.append("/home/jordy/code/SP-calibration-NER/ACE")
import flair

path = "/home/jordy/code/SP-calibration-NER/ACE/resources/taggers/en-xlmr-tuned-first_elmo_bert-old-four_multi-bert-four_word-glove_word_origflair_mflair_char_30episode_150epoch_32batch_0.1lr_800hidden_en_monolingual_crf_fast_reinforce_freeze_norelearn_sentbatch_0.5discount_0.9momentum_5patience_nodev_newner5/best-model.pt"
# fails while unpickling: the checkpoint's XLM-R tokenizer reloads sentencepiece from an absolute path
model = torch.load(path, map_location=torch.device('cpu'))

ERROR (running loader.py):

Traceback (most recent call last):
  File "loader.py", line 11, in <module>
    model = torch.load(path, map_location=torch.device('cpu'))
  File "/home/jordy/.virtualenvs/ACE/lib/python3.6/site-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/jordy/.virtualenvs/ACE/lib/python3.6/site-packages/torch/serialization.py", line 613, in _load
    result = unpickler.load()
  File "/home/jordy/.virtualenvs/ACE/lib/python3.6/site-packages/transformers/tokenization_xlm_roberta.py", line 175, in __setstate__
    self.sp_model.Load(self.vocab_file)
  File "/home/jordy/.virtualenvs/ACE/lib/python3.6/site-packages/sentencepiece.py", line 118, in Load
    return _sentencepiece.SentencePieceProcessor_Load(self, filename)
OSError: Not found: "/home/yongjiang.jy/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english/sentencepiece.bpe.model": No such file or directory Error #2

wangxinyu0922 commented 3 years ago

Hi, the model cannot be loaded directly with torch.load; it must be built from the config file (conll_03_english.yaml here). Please load the model following the Pretrained Models and Parse Files sections of the guide.
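For reference, a minimal sketch of that workflow (assuming the pretrained checkpoint has been downloaded into resources/taggers/ and the matching config is in config/, as in the repository instructions):

 # build the tagger from the config and evaluate the pretrained checkpoint
 python train.py --config config/conll_03_english.yaml --test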

Jordy-VL commented 3 years ago

That is what I originally did, which is why I wrote this unit test.

I changed the config to use the correct paths (replacing the home directory); it is attached at the end.

Then I run: python train.py --config config/conll_03_english.yaml --test

At the lines marked #for test, it goes wrong (screenshot of those lines omitted).

I get the same error nevertheless...

Config:

Controller:
  model_structure: null
ReinforcementTrainer:
  controller_learning_rate: 0.1
  controller_optimizer: SGD
  distill_mode: false
  optimizer: SGD
  sentence_level_batch: true
embeddings:
  TransformerWordEmbeddings-1:
    model: bert-base-cased
    layers: -1,-2,-3,-4
    pooling_operation: mean
    embedding_name: /home/jordy/.cache/torch/transformers/bert-base-cased
  TransformerWordEmbeddings-2:
    model: bert-base-multilingual-cased
    layers: -1,-2,-3,-4
    pooling_operation: mean
  ELMoEmbeddings-0:
    model: original
    # options_file: elmo_2x4096_512_2048cnn_2xhighway_options.json
    # weight_file: elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5
  FastCharacterEmbeddings:
    char_embedding_dim: 25
    hidden_size_char: 25
  FastWordEmbeddings-0:
    embeddings: glove
    freeze: true
  FastWordEmbeddings-1:
    embeddings: en
    freeze: true
  FlairEmbeddings-0:
    model: en-forward
  FlairEmbeddings-1:
    model: en-backward
  FlairEmbeddings-2:
    model: multi-forward
  FlairEmbeddings-3:
    model: multi-backward
  TransformerWordEmbeddings-0: 
    layers: '-1'
    pooling_operation: first
    model: xlm-roberta-large-finetuned-conll03-english
    embedding_name: /home/jordy/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english
interpolation: 0.5
model:
  FastSequenceTagger:
    crf_attention: false
    dropout: 0.0
    hidden_size: 800
    sentence_loss: true
    use_crf: true
model_name: en-xlmr-tuned-first_elmo_bert-old-four_multi-bert-four_word-glove_word_origflair_mflair_char_30episode_150epoch_32batch_0.1lr_800hidden_en_monolingual_crf_fast_reinforce_freeze_norelearn_sentbatch_0.5discount_0.9momentum_5patience_nodev_newner5
ner:
  Corpus: CONLL_03
  tag_dictionary: resources/taggers/ner_tags.pkl
target_dir: resources/taggers/
targets: ner
train:
  controller_momentum: 0.9
  discount: 0.5
  learning_rate: 0.1
  max_episodes: 30
  max_epochs: 150
  max_epochs_without_improvement: 25
  mini_batch_size: 32
  monitor_test: false
  patience: 5
  save_final_model: false
  train_with_dev: false
  true_reshuffle: false
trainer: ReinforcementTrainer

ast:
  Corpus: SEMEVAL16-TR:SEMEVAL16-ES:SEMEVAL16-NL:SEMEVAL16-EN:SEMEVAL16-RU
atis:
  Corpus: ATIS-EN:ATIS-TR:ATIS-HI
chunk:
  Corpus: CONLL_03:CONLL_03_GERMAN
wangxinyu0922 commented 3 years ago

(Quotes Jordy-VL's previous comment and config in full.)

In fact, you do not need to modify embedding_name in the config, as the code reads the path from the model entry of each embedding. It is strange that the reported error is identical to the one from torch.load(). Can you show me the error when running the command?

 python train.py --config config/conll_03_english.yaml --test
wangxinyu0922 commented 3 years ago

I tried to run the model on another server and hit the same problem as you. It comes from the xlm-roberta-large-finetuned-conll03-english model; I will try to change the checkpoint. For now, I suggest creating the path /home/yongjiang.jy/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english/ (you may need sudo to create it) and linking your xlm-roberta-large-finetuned-conll03-english to that path.
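
A minimal sketch of that workaround (assuming your local copy of the model lives under ~/.flair/embeddings/; adjust the source path to wherever you downloaded it):

 # recreate the absolute path baked into the checkpoint and point it at the local model
 sudo mkdir -p /home/yongjiang.jy/.flair/embeddings
 sudo ln -s ~/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english \
      /home/yongjiang.jy/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english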

Jordy-VL commented 3 years ago

With the suggested symlink, the testing now continues. =)

However, in the rest of the output, I still get a lot of logging referring to your home directory.

2021-03-16 17:41:06,401 loading file resources/taggers/en-xlmr-tuned-first_elmo_bert-old-four_multi-bert-four_word-glove_word_origflair_mflair_char_30episode_150epoch_32batch_0.1lr_800hidden_en_monolingual_crf_fast_reinforce_freeze_norelearn_sentbatch_0.5discount_0.9momentum_5patience_nodev_newner5/best-model.pt
2021-03-16 17:41:17,560 Testing using best model ...
2021-03-16 17:41:17,726 Setting embedding mask to the best action: tensor([1., 0., 0., 0., 1., 1., 0., 1., 1., 1., 1.], device='cuda:0')
['/home/jordy/.cache/torch/transformers/bert-base-cased', '/home/jordy/.flair/embeddings/lm-jw300-backward-v0.1.pt', '/home/jordy/.flair/embeddings/lm-jw300-forward-v0.1.pt', '/home/jordy/.flair/embeddings/news-backward-0.4.1.pt', '/home/jordy/.flair/embeddings/news-forward-0.4.1.pt', '/home/jordy/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english', 'Char', 'Word: en', 'Word: glove', 'bert-base-multilingual-cased', 'elmo-original']
2021-03-16 17:41:20,927 /home/yongjiang.jy/.cache/torch/transformers/bert-base-cased 108310272
2021-03-16 17:46:00,954 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt 43087046
2021-03-16 17:46:00,962 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt is not selected, Skipping
2021-03-16 17:46:00,963 /home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt 43087046
2021-03-16 17:46:00,964 /home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt is not selected, Skipping
2021-03-16 17:46:00,964 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt 18257500
2021-03-16 17:46:00,965 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt is not selected, Skipping
2021-03-16 17:46:00,966 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt 18257500
2021-03-16 17:46:15,031 /home/yongjiang.jy/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english 559890432
2021-03-16 17:47:58,928 bert-base-multilingual-cased 177853440
Jordy-VL commented 3 years ago

It works! However, I get memory issues after 80 steps... I tested both on GPU (8GB) and CPU (32GB). How much memory do you estimate is needed for testing this model?

2021-03-16 18:19:22,365 Finished Embeddings Assignments
2021-03-16 18:19:24,697 10/109
2021-03-16 18:19:26,553 20/109
2021-03-16 18:19:29,251 30/109
2021-03-16 18:19:32,265 40/109
2021-03-16 18:19:35,052 50/109
2021-03-16 18:19:37,428 60/109
2021-03-16 18:19:40,866 70/109
2021-03-16 18:20:10,978 80/109
Killed
wangxinyu0922 commented 3 years ago

I run the code on a server with a 12GB GPU. You can set batch_size when testing:

 python train.py --config config/conll_03_english.yaml --test --batch_size 1
wangxinyu0922 commented 3 years ago

I have updated the model on OneDrive. The model now reads the xlm-roberta-large-finetuned-conll03-english model from resources/xlm-roberta-large-finetuned-conll03-english.
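
A minimal sketch of the layout the updated checkpoint expects (assuming you already have a local copy of the xlm-roberta-large-finetuned-conll03-english directory; copying instead of linking works just as well):

 # place or link the model under resources/ in the ACE working directory
 mkdir -p resources
 ln -s ~/.flair/embeddings/xlm-roberta-large-finetuned-conll03-english \
      resources/xlm-roberta-large-finetuned-conll03-english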

Jordy-VL commented 3 years ago

Thanks for the quick responses. It worked for me!

One final question to follow up on this:

The final output shows these stats:

2021-03-17 12:31:36,849 0.9698  0.9736  0.9717
2021-03-17 12:31:36,849 
MICRO_AVG: acc 0.945 - f1-score 0.9717
MACRO_AVG: acc 0.9348 - f1-score 0.9657499999999999
LOC        tp: 1807 - fp: 30 - fn: 30 - tn: 1807 - precision: 0.9837 - recall: 0.9837 - accuracy: 0.9679 - f1-score: 0.9837
MISC       tp: 859 - fp: 67 - fn: 63 - tn: 859 - precision: 0.9276 - recall: 0.9317 - accuracy: 0.8686 - f1-score: 0.9296
ORG        tp: 1295 - fp: 58 - fn: 46 - tn: 1295 - precision: 0.9571 - recall: 0.9657 - accuracy: 0.9257 - f1-score: 0.9614
PER        tp: 1824 - fp: 25 - fn: 18 - tn: 1824 - precision: 0.9865 - recall: 0.9902 - accuracy: 0.9770 - f1-score: 0.9883

Your paper mentions reaching 93.6 micro-F1 at sentence level, which is lower than the stats reported here. Is the evaluation here done on the dev set (testa) of CoNLL rather than the test set (testb)? How can I switch to the other set?

Thanks in advance!

wangxinyu0922 commented 3 years ago

My stats are:

2021-03-17 20:14:21,767 0.9309  0.9423  0.9366
2021-03-17 20:14:21,767
MICRO_AVG: acc 0.8807 - f1-score 0.9366
MACRO_AVG: acc 0.8635 - f1-score 0.9247500000000001
LOC        tp: 1580 - fp: 90 - fn: 88 - tn: 1580 - precision: 0.9461 - recall: 0.9472 - accuracy: 0.8987 - f1-score: 0.9466
MISC       tp: 606 - fp: 115 - fn: 96 - tn: 606 - precision: 0.8405 - recall: 0.8632 - accuracy: 0.7417 - f1-score: 0.8517
ORG        tp: 1561 - fp: 159 - fn: 100 - tn: 1561 - precision: 0.9076 - recall: 0.9398 - accuracy: 0.8577 - f1-score: 0.9234
PER        tp: 1575 - fp: 31 - fn: 42 - tn: 1575 - precision: 0.9807 - recall: 0.9740 - accuracy: 0.9557 - f1-score: 0.9773

Your reported F1 score is identical to the dev-set F1 in my experiment, so it seems that you evaluated the model on the dev set (testa).

AtharvanDogra commented 2 years ago

I have updated the model on the Onedrive. The model read the xlm-roberta-large-finetuned-conll03-english model at resources/xlm-roberta-large-finetuned-conll03-english now.

(screenshot of the error)

Where am I going wrong here? I haven't added xlm-roberta-large-finetuned-conll03-english anywhere, since the instructions say the models are downloaded automatically. I've just downloaded conll_en_ner_model.zip from OneDrive and extracted it into resources/taggers. @wangxinyu0922 @Jordy-VL