huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0

"Step must be 1" in DPODataCollatorWithPadding #1304

Closed corbyrosset closed 8 months ago

corbyrosset commented 9 months ago

When I use the alignment handbook to run DPO (https://github.com/huggingface/alignment-handbook/blob/main/scripts/run_dpo.py) while loading a proprietary on-disk dataset stored in .jsonl format, like so:

from datasets import DatasetDict, load_dataset

raw_datasets = DatasetDict()
cache_dir = training_args.output_dir
if data_args.train_path is not None:
    # Note: set_format() mutates in place and returns None, so its result
    # must not be assigned; with_format() returns the formatted dataset.
    raw_datasets["train"] = load_dataset(
        "json", data_files=data_args.train_path, streaming=False, cache_dir=cache_dir
    )["train"].with_format("python")

instead of using its default `get_datasets()`:

raw_datasets = get_datasets(data_args, splits=data_args.dataset_splits)

I encounter an error in `DPODataCollatorWithPadding` (in `trl/trainer/utils.py`), which expects each feature to be a list rather than a tensor. The error is "step must be 1", which makes sense because PyTorch tensors don't support negative step values in slicing, unlike Python lists or NumPy arrays:

if "prompt" in k:
    to_pad = [torch.LongTensor(ex[k][::-1]) for ex in features]
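A minimal repro of the mismatch (with made-up token IDs, not data from the collator itself): the `[::-1]` slice reverses a Python list, but the same slice on a tensor raises.

```python
import torch

# Reversing a Python list with a negative-step slice is fine:
assert [101, 2023, 102][::-1] == [102, 2023, 101]

# The same slice on a tensor raises a ValueError, because PyTorch
# slicing does not support negative steps:
ids = torch.tensor([101, 2023, 102])
try:
    ids[::-1]
except ValueError as e:
    print(f"ValueError: {e}")
```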

I fixed this by using `torch.flip` to reverse the elements when they arrive as a tensor rather than a list:

to_pad = [
    torch.LongTensor(ex[k][::-1]) if isinstance(ex[k], list)
    else torch.LongTensor(torch.flip(ex[k], [0]))
    for ex in features
]
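The same idea can be factored into a small helper (a sketch for illustration only, not the patch that landed in trl; the function name is made up) that normalizes a feature to a reversed LongTensor whether it arrives as a list or a tensor:

```python
import torch

def reversed_long_tensor(x):
    """Return x reversed as a LongTensor, accepting a list or a tensor.

    Hypothetical helper for illustration; not part of trl itself.
    """
    if isinstance(x, torch.Tensor):
        # Tensors don't support negative-step slicing; torch.flip copies.
        return torch.flip(x, dims=[0]).long()
    return torch.LongTensor(x[::-1])

# Works for both input kinds:
print(reversed_long_tensor([1, 2, 3]))                # tensor([3, 2, 1])
print(reversed_long_tensor(torch.tensor([1, 2, 3])))  # tensor([3, 2, 1])
```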

I guess my question is: somewhere along the way, the features were getting converted to tensors. I don't know exactly where, or whether this affects anything else.

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.