[BUG] Value Error noqa:E501

When I use dpo_llava to fine-tune on a custom DPO dataset, I run into the following error message after several steps:

ValueError: The input provided to the model are wrong. The number of image tokens is 3 while the number of image given to the model is 4. This prevents correct indexing and breaks batch generation.

How can this happen? I have checked that my dataset strictly follows the required format.
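The error compares two counts for a batch: the image placeholder tokens in the text and the images passed in. A minimal sketch of a check that runs the same comparison per sample on the raw data is shown here. The placeholder string `<image>`, the helper name `find_mismatches`, and the input layout (a list of prompt strings plus a list of per-sample image counts) are assumptions for illustration, not the repository's actual API.

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder string; adjust to the one your processor uses


def find_mismatches(texts, image_counts, image_token=IMAGE_TOKEN):
    """Return (index, n_placeholders, n_images) for every sample whose
    placeholder count in the text differs from its number of images."""
    mismatches = []
    for idx, (text, n_images) in enumerate(zip(texts, image_counts)):
        n_placeholders = text.count(image_token)
        if n_placeholders != n_images:
            mismatches.append((idx, n_placeholders, n_images))
    return mismatches


if __name__ == "__main__":
    # Hypothetical samples: the second prompt has no placeholder but one image.
    texts = ["<image>\nDescribe the picture.", "What is shown here?"]
    image_counts = [1, 1]
    for idx, n_placeholders, n_images in find_mismatches(texts, image_counts):
        print(f"sample {idx}: {n_placeholders} placeholder(s), {n_images} image(s)")
```

If every raw sample passes, it may be worth running the same count on the tokenized inputs, since a placeholder could be lost after preprocessing (for example, if a long prompt is truncated).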

TideDra / VL-RLHF

[BUG] Value Error noqa:E501 #13