ml-explore / mlx-examples

Examples in the MLX framework
MIT License

LoRA: Support HuggingFace dataset via data parameter #996

Closed madroidmaq closed 2 months ago

madroidmaq commented 2 months ago

I uploaded the example data from lora/data to mlx-community/wikisql, and specified the Hugging Face dataset ID via the data parameter.

The data field still supports local dataset paths; the Hugging Face dataset logic is used only when the local path does not exist.
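The fallback described above can be sketched as follows. This is a minimal illustration of the resolution order, not the PR's actual implementation, and `resolve_data_source` is a hypothetical helper name:

```python
from pathlib import Path

def resolve_data_source(data: str) -> str:
    # Hypothetical sketch of the fallback: prefer an existing local path,
    # otherwise treat the value as a Hugging Face dataset ID.
    if Path(data).exists():
        return "local"
    return "huggingface"
```

So a value like `mlx-community/wikisql` only reaches the Hugging Face loading path because no such local directory exists.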

[!note] The splitting rules for Hugging Face datasets follow the same logic as local folders: at least two splits, train and valid, are required.
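The split requirement from the note can be expressed as a small check. This is a sketch under the stated rule (train and valid required, test optional), and `validate_splits` is a hypothetical helper, not part of mlx-lm:

```python
def validate_splits(split_names):
    # Sketch of the rule from the note: the train and valid splits must
    # both be present; a test split is optional.
    missing = {"train", "valid"} - set(split_names)
    if missing:
        raise ValueError(f"missing required split(s): {sorted(missing)}")
    return True
```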

The example is as follows:

```shell
mlx_lm.lora --model mlx-community/Qwen2.5-0.5B-Instruct-4bit \
            --data mlx-community/wikisql \
            --train \
            --learning-rate 1e-5 \
            --lora-layers 12 \
            --batch-size 10 \
            --max-seq-length 16384 \
            --iters 1000
```

Logs:

```
Loading pretrained model
Fetching 9 files: 100%|██████████████████████| 9/9 [00:00<00:00, 119837.26it/s]
Loading datasets
Downloading data: 100%|████████████████████████████████████| 74.0k/74.0k [00:00<00:00, 75.2kB/s]
Downloading data: 100%|████████████████████████████████████| 10.2k/10.2k [00:00<00:00, 17.8kB/s]
Downloading data: 100%|█████████████████████████████████████| 10.3k/10.3k [00:00<00:00, 12.9kB/s]
Generating train split: 100%|████████████████████████| 1000/1000 [00:00<00:00, 55431.82 examples/s]
Generating valid split: 100%|█████████████████████████| 100/100 [00:00<00:00, 109169.81 examples/s]
Generating test split: 100%|██████████████████████████| 100/100 [00:00<00:00, 136979.23 examples/s]
Training
Trainable parameters: 0.055% (0.270M/494.005M)
Starting training..., iters: 1000
Iter 1: Val loss 2.936, Val took 1.428s
```
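As a sanity check, the 0.055% trainable-parameter figure in the log follows directly from the reported counts:

```python
trainable = 0.270e6   # trainable LoRA parameters, from the log
total = 494.005e6     # total model parameters, from the log
pct = 100 * trainable / total
print(f"{pct:.3f}%")  # prints 0.055%
```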