Closed · yuxinxu77 closed this issue 7 months ago
If `save_sentence_transformers_lib` isn't necessary for future work, then we can skip the following part. But just in case somebody is curious about the error, here is some info:
```
OSError: No such device (os error 19)

File /databricks/python/lib/python3.10/site-packages/transformers/trainer.py:2842, in Trainer.save_model(self, output_dir, _internal_call)
   2839     self.model_wrapped.save_checkpoint(output_dir)
   2841 elif self.args.should_save:
-> 2842     self._save(output_dir)
   2844 # Push to the Hub when `save_model` is called by the user.
   2845 if self.args.push_to_hub and not _internal_call:

File /Workspace/Users/user1/FlagEmbedding/finetune/trainer.py:37, in BiTrainer._save(self, output_dir, state_dict)
     35 # save the checkpoint for sentence-transformers library
     36 if self.is_world_process_zero():
---> 37     save_ckpt_for_sentence_transformers(output_dir,
     38                                         pooling_mode=self.args.sentence_pooling_method,
     39                                         normlized=self.args.normlized)

File /Workspace/Users/user1/FlagEmbedding/finetune/trainer.py:7, in save_ckpt_for_sentence_transformers(ckpt_dir, pooling_mode, normlized)
      5 def save_ckpt_for_sentence_transformers(ckpt_dir, pooling_mode: str = 'cls', normlized: bool = True):
      6     print(f"ckpt_dir: {ckpt_dir}")
----> 7     word_embedding_model = models.Transformer(ckpt_dir)
      8     pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode=pooling_mode)
      9     if normlized:

File /databricks/python/lib/python3.10/site-packages/sentence_transformers/models/Transformer.py:29, in Transformer.__init__(self, model_name_or_path, max_seq_length, model_args, cache_dir, tokenizer_args, do_lower_case, tokenizer_name_or_path)
     26 self.do_lower_case = do_lower_case
     28 config = AutoConfig.from_pretrained(model_name_or_path, **model_args, cache_dir=cache_dir)
---> 29 self._load_model(model_name_or_path, config, cache_dir)
     31 self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_name_or_path if tokenizer_name_or_path is not None else model_name_or_path, cache_dir=cache_dir, **tokenizer_args)
     33 # No max_seq_length set. Try to infer from model

File /databricks/python/lib/python3.10/site-packages/sentence_transformers/models/Transformer.py:49, in Transformer._load_model(self, model_name_or_path, config, cache_dir)
     47     self._load_t5_model(model_name_or_path, config, cache_dir)
     48 else:
---> 49     self.auto_model = AutoModel.from_pretrained(model_name_or_path, config=config, cache_dir=cache_dir)

File /databricks/python/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:566, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    564 elif type(config) in cls._model_mapping.keys():
    565     model_class = _get_model_class(config, cls._model_mapping)
--> 566     return model_class.from_pretrained(
    567         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    568     )
    569 raise ValueError(
    570     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    571     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    572 )

File /databricks/python_shell/dbruntime/huggingface_patches/transformers.py:21, in _create_patch_function.

File /databricks/python/lib/python3.10/site-packages/transformers/modeling_utils.py:3359, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3339 resolved_archive_file, sharded_metadata = get_checkpoint_shard_files(
   3340     pretrained_model_name_or_path,
   3341     resolved_archive_file,
    (...)
   3351     _commit_hash=commit_hash,
   3352 )
   3354 if (
   3355     is_safetensors_available()
   3356     and isinstance(resolved_archive_file, str)
   3357     and resolved_archive_file.endswith(".safetensors")
   3358 ):
-> 3359     with safe_open(resolved_archive_file, framework="pt") as f:
   3360         metadata = f.metadata()
   3362     if metadata.get("format") == "pt":
```
The error is quite interesting... Since the error occurs inside transformers' `AutoModel` class, I directly imported `transformers.AutoModel` and called the `from_pretrained` method, and it works flawlessly. Not sure why it's having a problem when called by the `sentence-transformers` package.
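For reference, the quick check described above looks roughly like this (a minimal sketch; `output_dir` stands for the checkpoint directory that `trainer.save_model()` wrote, and the path is purely illustrative):

```python
from transformers import AutoModel, AutoTokenizer

# Illustrative path: the directory containing the safetensors checkpoint
# written by trainer.save_model() (the same directory the traceback points at).
output_dir = "/Workspace/Users/user1/output/finetuned-model"

# Loading directly through transformers works fine...
model = AutoModel.from_pretrained(output_dir)
tokenizer = AutoTokenizer.from_pretrained(output_dir)
print(model.config.model_type)

# ...while the same directory fails when loaded via
# sentence_transformers.models.Transformer (see traceback above).
```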
@yuxinxu77, `save_ckpt_for_sentence_transformers` is used to convert the model into the format of sentence_transformers. In this way, users can load the fine-tuned model with sentence_transformers. If you don't use sentence_transformers, you can skip this step.
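In other words, once the conversion has run, the output directory can be loaded directly as a sentence_transformers model, roughly like this (a sketch; the directory path is illustrative):

```python
from sentence_transformers import SentenceTransformer

# Illustrative path: the fine-tuned output directory after the conversion
# step has added the sentence-transformers config files (modules.json, etc.).
model = SentenceTransformer("/path/to/finetuned-output-dir")

embeddings = model.encode(["a sample query", "a sample passage"])
print(embeddings.shape)
```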
We haven't encountered this error. You can try changing the version of sentence_transformers (we use version 2.6.0).
Thank you for the clarification! Now I can proceed without worry.
Can someone explain why we need to save the checkpoint for sentence transformers? Since we are already saving the model as a safetensors file inside the output directory, what's the point of saving the checkpoints in the sentence-transformers way?
Reason I asked: when I fine-tune the embedding model and try to save using `trainer.save_model()`, it is able to save the model into a safetensors file, but the `save_sentence_transformers_lib` function at the end of the `_save` method always throws an error. Thank you!
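For context, based on the frames visible in the traceback above, the conversion step called at the end of `_save` does roughly the following (a sketch reconstructed from the traceback; the tail of the function past line 9 is not shown there, so the final assembly and save are assumptions):

```python
from sentence_transformers import SentenceTransformer, models

def save_ckpt_for_sentence_transformers(ckpt_dir, pooling_mode: str = "cls", normlized: bool = True):
    # Wrap the Hugging Face checkpoint (the safetensors weights already written
    # by trainer.save_model) in a sentence-transformers Transformer module.
    word_embedding_model = models.Transformer(ckpt_dir)
    # Add the pooling layer matching the pooling used during fine-tuning.
    pooling_model = models.Pooling(
        word_embedding_model.get_word_embedding_dimension(),
        pooling_mode=pooling_mode,
    )
    modules = [word_embedding_model, pooling_model]
    if normlized:
        # Append L2 normalization if embeddings were normalized during training.
        modules.append(models.Normalize())
    # Assumed ending: assemble the modules and write the sentence-transformers
    # config files (modules.json, etc.) back into the same directory.
    SentenceTransformer(modules=modules).save(ckpt_dir)
```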