studio-ousia / luke

LUKE -- Language Understanding with Knowledge-based Embeddings
Apache License 2.0

luke_large_500k Compression #147

Closed taghreed34 closed 2 years ago

taghreed34 commented 2 years ago

Hi, I'm trying to fine-tune luke-large for NER on CoNLL-2003, but because of limited hardware I have to interrupt the training process and save pytorch_model.bin partway through. After saving the resulting file, I extract luke_large_500k, replace its pytorch_model.bin with the new one, and run the training script again.

The following error is raised:

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

The error occurs at this line:

File "/content/luke/luke/utils/model_utils.py", line 108, in _load
    state_dict = torch.load(os.path.join(path, model_file), map_location="cpu")

What mistake am I likely making that causes this error? Is it the compression method? I use these commands to extract and re-create the archive:

!tar xvzf /content/luke_large_500k.tar.gz
!tar -czvf luke_large_500k.tar.gz pytorch_model.bin metadata.json entity_vocab.tsv
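Since the traceback points at torch.load on the extracted pytorch_model.bin, one way to narrow this down is to inspect that file before it goes into the archive. The following is only a minimal sketch, assuming the checkpoint was written by torch.save in its default zip-based format (the default since PyTorch 1.6). The function name diagnose_checkpoint is hypothetical and not part of LUKE or PyTorch, and the check only looks at the zip container, not at the tensors inside it.

```python
import zipfile

# Local-file-header signature that every zip file starts with.
ZIP_MAGIC = b"PK\x03\x04"


def diagnose_checkpoint(path):
    """Classify a checkpoint file by its zip structure (hypothetical helper).

    Returns:
        "ok"            - a zip end-of-central-directory record was found
        "truncated_zip" - starts like a zip but has no central directory,
                          which is what a partially written file looks like
        "not_zip"       - does not look like a zip-based checkpoint at all
    """
    with open(path, "rb") as f:
        head = f.read(len(ZIP_MAGIC))

    # is_zipfile() looks for the end-of-central-directory record near the
    # end of the file, the same structure the error message complains about.
    if zipfile.is_zipfile(path):
        return "ok"
    if head == ZIP_MAGIC:
        return "truncated_zip"
    return "not_zip"
```

Calling diagnose_checkpoint("pytorch_model.bin") on the freshly saved file would be the first step. A result of "truncated_zip" would suggest the save itself was cut short (for example by the interrupted session), in which case no archiving method would help.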

ryokan0123 commented 2 years ago

How about trying the create_model_archive method? https://github.com/studio-ousia/luke/blob/24f213920dcb4a078ae64b544b8cde315c42ff81/luke/utils/model_utils.py#L37

I am not 100% sure, but maybe the create_model_archive method and the tar command produce archive files with different formats.
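To test whether the packing step changes the files at all, a round-trip check can help. This is a rough sketch using only Python's standard tarfile and hashlib modules. It does not reproduce create_model_archive, and the helper names sha256_of and pack_and_verify are made up for illustration. It packs the files into a .tar.gz and confirms that each archived member is byte-identical to the file on disk.

```python
import hashlib
import os
import tarfile


def sha256_of(fileobj, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a binary file object, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def pack_and_verify(file_paths, archive_path):
    """Pack files into a gzip-compressed tar and verify each member.

    Returns a dict mapping each member name to True if its archived bytes
    match the original file, and False otherwise.
    """
    # Store every file at the top level of the archive, as the tar command
    # in the question does.
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in file_paths:
            tar.add(path, arcname=os.path.basename(path))

    results = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for path in file_paths:
            name = os.path.basename(path)
            with open(path, "rb") as original:
                expected = sha256_of(original)
            member = tar.extractfile(name)
            results[name] = sha256_of(member) == expected
    return results
```

For the files named in this issue, the call would be pack_and_verify(["pytorch_model.bin", "metadata.json", "entity_vocab.tsv"], "luke_large_500k.tar.gz"). If every value is True, the archive holds exactly the bytes that were saved, so any remaining load failure would have to come from the checkpoint file itself.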