turboderp opened this issue 1 year ago
I've had time to finetune a new adapter, and I can confirm that training with the EOS token works. Responses now seem to consistently end when they're supposed to.
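For anyone else running into this, here's a minimal sketch of what appending EOS during tokenization can look like (assuming a Hugging Face LlamaTokenizer; the function name, model path and cutoff length are placeholders of mine, not the actual train_data.py code):

```python
# A sketch, not the actual train_data.py code: tokenize_example, MODEL_PATH
# and CUTOFF_LEN are illustrative assumptions.
from transformers import LlamaTokenizer

MODEL_PATH = "path/to/llama-13b"   # placeholder model path
CUTOFF_LEN = 512                   # assumed max sequence length

tokenizer = LlamaTokenizer.from_pretrained(MODEL_PATH)

def tokenize_example(prompt: str) -> dict:
    # LlamaTokenizer adds BOS by default but not EOS, so append EOS manually,
    # reserving one position so truncation can't cut it off again.
    result = tokenizer(prompt, truncation=True, max_length=CUTOFF_LEN - 1)
    result["input_ids"].append(tokenizer.eos_token_id)
    result["attention_mask"].append(1)
    return result
```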
I did have another question regarding the training data. In the original paper, the training examples used two different formats:
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Response:
Or:
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:
The latter is used whenever the input field of an example is blank. I made this change to train_data.py, and it seems to work better this way, but I'm not sure how I'd benchmark it to say for sure.
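For illustration, the change amounts to something like the following (the function name and the instruction/input/output field names are my own guesses at the Alpaca format, not necessarily what train_data.py calls them):

```python
# Sketch of selecting between the two Alpaca templates based on whether
# the "input" field is blank. Names are illustrative assumptions.
def generate_prompt(example: dict) -> str:
    if example.get("input", "").strip():
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )
```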
Does anyone have any idea how much it's expected to matter? Or maybe some idea how to benchmark it?
I trained this on the 13B model and a cleaned Alpaca dataset over the weekend (17 hours on a 48 GB A6000, if anyone's interested).
Inference works well, and the model is surprisingly good at following directions, but it doesn't seem to know when to quit. Most of the time it doesn't seem to output an EOS token at the end of the response and just starts dreaming up more prompts, random YouTube links and stuff like that.
My guess is that the EOS token wasn't added to the training examples. I noticed there was a change to train_data.py today, but I'm not really sure whether adding the **kwargs addresses that issue or something else. It'd be good to know before committing to another round of training.
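A quick way to check that suspicion (a sketch, assuming a Hugging Face LlamaTokenizer and a placeholder model path):

```python
# By default LlamaTokenizer adds BOS but not EOS, so a tokenized example
# won't end with the EOS id unless the training script appends it explicitly.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-13b")  # placeholder path

ids = tokenizer("### Response:\nSome finished response.")["input_ids"]
print(tokenizer.eos_token_id in ids)              # expected: False with defaults
print(getattr(tokenizer, "add_eos_token", None))  # expected: False by default
```

If the examples really do end without EOS, the model never sees an end-of-sequence signal during training, which would explain the rambling at inference time.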