Closed briane412 closed 5 months ago
Hello,
Apologies for these errors; I do not always update the notebook when changes are made to the library.
The error occurs because the data collator is not providing labels to the model, so the model cannot compute a loss and the Trainer has nothing to optimize.
This can be fixed by changing the line that creates the collator to:

```python
collator = DataCollator(tokenizer["PAD_None"], copy_inputs_as_labels=True)
```
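Conceptually, `copy_inputs_as_labels` makes the collator duplicate the padded `input_ids` into a `labels` entry, which is what lets a causal LM return a loss. Here is a minimal pure-Python sketch of that behavior, not miditok's actual implementation (which works on tensors and typically pads labels with -100 so padding is ignored by the loss):

```python
# Toy sketch of a collator with copy_inputs_as_labels, using plain lists
# instead of tensors. pad_token_id stands in for tokenizer["PAD_None"].
def collate(batch, pad_token_id, copy_inputs_as_labels=True):
    max_len = max(len(seq) for seq in batch)
    input_ids = [seq + [pad_token_id] * (max_len - len(seq)) for seq in batch]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    out = {"input_ids": input_ids, "attention_mask": attention_mask}
    if copy_inputs_as_labels:
        # Without this, the model receives no labels, returns only logits,
        # and the Trainer raises the "did not return a loss" error.
        out["labels"] = [row[:] for row in input_ids]
    return out

batch = collate([[5, 6, 7], [8, 9]], pad_token_id=0)
```

With `copy_inputs_as_labels=False`, the returned dict contains only `input_ids` and `attention_mask`, exactly the two keys named in the error message below.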
I'll try to run the notebook to make sure everything is working.
Thank you for the suggested changes. Would you like to add them to the repo as a contribution? Otherwise, I'll add them shortly.
Ah! Thanks. That appears to be working now. I would be happy to submit a contribution with the changes.
Wonderful! And I would be happy to merge it!
I just ran everything, and indeed there are a few other changes to make. Here is a notebook version with everything working and a few comments updated: Example_HuggingFace_Mistral_Transformer.ipynb.zip. You can include all these changes in the PR 🙌
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
I am hoping to learn how to use miditok to process MIDI files for use with PyTorch, but I have not been able to get the Example_HuggingFace_Mistral_Transformer.ipynb notebook to complete successfully. I followed the notebook, making these changes:
```python
midi_paths = list(Path("Maestro").glob("**/*.mid")) + list(Path("Maestro").glob("**/*.midi"))
```
to
```python
midi_paths = list(Path("Maestro").resolve().glob("**/*.mid")) + list(Path("Maestro").resolve().glob("**/*.midi"))
```
and
```python
for files_paths, subset_name in {(midi_paths_train, "train"), (midi_paths_valid, "valid"), (midi_paths_test, "test")}:
```
to
```python
for files_paths, subset_name in [(midi_paths_train, "train"), (midi_paths_valid, "valid"), (midi_paths_test, "test")]:
```
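For context, the `midi_paths_train`, `midi_paths_valid`, and `midi_paths_test` variables iterated over above come from splitting `midi_paths` into subsets. A hedged sketch of such a split (the exact ratios and shuffling in the notebook may differ):

```python
import random
from pathlib import Path

def split_paths(midi_paths, valid_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle the file paths and split them into train/valid/test lists.

    Illustrative ratios only; the notebook's actual split may differ.
    """
    paths = list(midi_paths)
    random.Random(seed).shuffle(paths)  # deterministic shuffle for reproducibility
    n_valid = int(len(paths) * valid_frac)
    n_test = int(len(paths) * test_frac)
    midi_paths_valid = paths[:n_valid]
    midi_paths_test = paths[n_valid:n_valid + n_test]
    midi_paths_train = paths[n_valid + n_test:]
    return midi_paths_train, midi_paths_valid, midi_paths_test

midi_paths = [Path(f"Maestro/{i}.mid") for i in range(20)]
train, valid, test = split_paths(midi_paths)
```

Using a list of `(paths, name)` pairs rather than a set also keeps the iteration order deterministic, since Python sets iterate in arbitrary order.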
ValueError: The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_ids,attention_mask.
The full trace of the error is:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in <cell line: 82>()
     80
     81 # Training
---> 82 train_result = trainer.train()
     83 trainer.save_model()  # Saves the tokenizer too
     84 trainer.log_metrics("train", train_result.metrics)

3 frames
/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1857         hf_hub_utils.enable_progress_bars()
   1858     else:
-> 1859         return inner_training_loop(
   1860             args=args,
   1861             resume_from_checkpoint=resume_from_checkpoint,

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
   2201
   2202         with self.accelerator.accumulate(model):
-> 2203             tr_loss_step = self.training_step(model, inputs)
   2204
   2205             if (

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
   3136
   3137         with self.compute_loss_context_manager():
-> 3138             loss = self.compute_loss(model, inputs)
   3139
   3140         if self.args.n_gpu > 1:

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
   3177         else:
   3178             if isinstance(outputs, dict) and "loss" not in outputs:
-> 3179                 raise ValueError(
   3180                     "The model did not return a loss from the inputs, only the following keys: "
   3181                     f"{','.join(outputs.keys())}. For reference, the inputs it received are {','.join(inputs.keys())}."

ValueError: The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_ids,attention_mask.
```
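The check that raises this error is straightforward to reproduce in isolation. A simplified pure-Python sketch of the Trainer's behavior (not the actual transformers code): when the model's output dict has no `loss` key, there is nothing to backpropagate, so training aborts.

```python
def compute_loss(outputs, inputs):
    """Mimic the transformers Trainer check: dict outputs must contain a loss.

    `outputs` is what the model returned; `inputs` is what the collator produced.
    """
    if isinstance(outputs, dict) and "loss" not in outputs:
        raise ValueError(
            "The model did not return a loss from the inputs, only the following keys: "
            f"{','.join(outputs.keys())}. For reference, the inputs it received are "
            f"{','.join(inputs.keys())}."
        )
    return outputs["loss"]
```

A batch without `labels` reproduces the failure, while a batch whose collator copied inputs into labels lets the model return a loss and training proceed.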