If your code gives an error something like
RuntimeError: stack expects each tensor to be equal size, but got [2612] at entry 0 and [2467] at entry 1
It's because of the data collator. The collator pads each example in the batch to equal size, so it should be fine.
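For reference, a minimal collator sketch along these lines (field names like `input_ids`, `attention_mask`, and `labels` are assumed from a typical HF-style dataset, not taken from this repo):

```python
import torch

class PaddingCollator:
    """Pads variable-length examples in a batch to the longest sequence.

    Assumes each example is a dict of 1-D tensors; labels are padded
    with -100 so the padded positions are ignored by the loss.
    """
    def __init__(self, pad_token_id):
        self.pad_token_id = pad_token_id

    def __call__(self, examples):
        input_ids = torch.nn.utils.rnn.pad_sequence(
            [e["input_ids"] for e in examples],
            batch_first=True, padding_value=self.pad_token_id)
        attention_mask = torch.nn.utils.rnn.pad_sequence(
            [e["attention_mask"] for e in examples],
            batch_first=True, padding_value=0)
        labels = torch.nn.utils.rnn.pad_sequence(
            [e["labels"] for e in examples],
            batch_first=True, padding_value=-100)
        return {"input_ids": input_ids,
                "attention_mask": attention_mask,
                "labels": labels}
```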
Thanks! The other problem, compared to the LLaVA implementation, is that Phi's processor doesn't work with batches, while the tokenizer alone does.
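(For anyone hitting the same thing: the tokenizer on its own does accept a list of strings and pads them in one call. A quick sketch, with the checkpoint name just as an example, swap in whatever you actually use:)

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

batch = tokenizer(
    ["first example prompt", "a second, much longer example prompt"],
    padding=True,            # pad to the longest sequence in the batch
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # (2, max_len_in_batch)
```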
@DavidePaglieri I've worked with the original Phi-3 processor, and it does work with batches using the dataset and collator I wrote in the code (except for the label part, because the original Phi-3 processor doesn't create labels). I don't know exactly what code you are using, but you could simply use the dataset and collator from my code and add the label creation inside the dataset.
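As a rough illustration of what "adding the label creation inside the dataset" could look like (a sketch only, not the exact code from the repo; the `prompt`/`answer` fields and the prompt-masking scheme are assumptions):

```python
import torch
from torch.utils.data import Dataset

class SupervisedDataset(Dataset):
    """Sketch of a dataset that builds `labels` itself, since the stock
    Phi-3 processor only returns inputs and no labels."""

    def __init__(self, samples, processor):
        self.samples = samples        # list of dicts with "prompt" / "answer"
        self.processor = processor

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        prompt_ids = self.processor.tokenizer(
            sample["prompt"], add_special_tokens=False)["input_ids"]
        answer_ids = self.processor.tokenizer(
            sample["answer"], add_special_tokens=False)["input_ids"]

        input_ids = torch.tensor(prompt_ids + answer_ids)
        # Mask the prompt tokens with -100 so only the answer contributes to the loss.
        labels = torch.tensor([-100] * len(prompt_ids) + answer_ids)
        attention_mask = torch.ones_like(input_ids)

        return {"input_ids": input_ids,
                "attention_mask": attention_mask,
                "labels": labels}
```

The variable-length outputs from `__getitem__` are then what the padding collator above stacks into equal-size batches.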
Hi, thanks for your great work!
Currently, using the standard code from transformers, I can train Phi-3, but only with a batch size of 1. Can I ask specifically what change was needed to make it work with larger batch sizes?