allenai / bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models
Apache License 2.0

How to fine tune #201

Closed pxk8001 closed 5 years ago

pxk8001 commented 5 years ago

I want to fine-tune an existing model (PubMed weights: elmo_2x4096_512_2048cnn_2xhighway_weights_PubMed_only.hdf5, options: elmo_2x4096_512_2048cnn_2xhighway_options.json) on our unlabeled data. The official instructions (https://github.com/allenai/bilm-tf#how-to-do-fine-tune-a-model-on-additional-unlabeled-data) say to first download the checkpoint files (https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_tf_checkpoint/checkpoint) and do the other steps, then finally use the script bin/restart.py to restart training from the existing checkpoint on the new dataset. I did as described above and got files like xx.index, xx.meta, etc. However, the PubMed weights and PubMed options are not used anywhere in this process. So I want to know:
①: What is the relationship among these files (PubMed options, PubMed weights, and the checkpoint)?
②: How do I use the generated files (xx.index, xx.meta, xx.data-00000-of-00001) to fine-tune, or what is the process that produces these files?

matt-peters commented 5 years ago

Each model (PubMed, original ELMo) has completely separate weights, options, and checkpoint files, so to fine-tune PubMed, switch all of the file paths to their PubMed equivalents.
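To make the relationship concrete: the xx.index / xx.meta / xx.data-* files are a TensorFlow training checkpoint, which is what bin/restart.py consumes and updates, while the .hdf5 weights file is an exported snapshot of a checkpoint that is paired with the options.json at inference time. A rough command sketch of the fine-tune-then-export loop follows; all paths are placeholders, and the flag names are taken from the bilm-tf README, so double-check them against your checkout before running:

```shell
# Sketch only: fine-tune from an existing TF checkpoint, then export
# hdf5 weights. Replace the placeholder paths with your own.

# 1. Restart training from the downloaded checkpoint files
#    (the xx.index / xx.meta / xx.data-* files live in $CHECKPOINT_DIR).
#    For the PubMed model, point these at the PubMed checkpoint and vocab.
python bin/restart.py \
    --save_dir "$CHECKPOINT_DIR" \
    --vocab_file /path/to/vocab.txt \
    --train_prefix '/path/to/new_unlabeled_data/*'

# 2. Convert the fine-tuned checkpoint back into an hdf5 weights file,
#    analogous to elmo_2x4096_512_2048cnn_2xhighway_weights_PubMed_only.hdf5,
#    for use with the matching options.json.
python bin/dump_weights.py \
    --save_dir "$CHECKPOINT_DIR" \
    --outfile /path/to/new_weights.hdf5
```

The exported hdf5 file plus the unchanged options.json is then what you load for downstream tasks, the same way you loaded the original PubMed pair.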