Closed madroidmaq closed 2 months ago
I uploaded the example data from lora/data to mlx-community/wikisql and specified the Hugging Face dataset id via the `data` parameter. The `data` field still supports local dataset paths; the Hugging Face dataset logic is used only when the local path does not exist.
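The fallback described above (prefer a local path, otherwise treat the value as a Hugging Face dataset id) can be sketched roughly as follows. This is an illustrative sketch, not the actual mlx_lm internals; the function and return values are hypothetical.

```python
from pathlib import Path

def resolve_data(data: str) -> tuple[str, str]:
    """Illustrative sketch of the described behavior: if `data` exists
    as a local path, use the local dataset logic; otherwise fall back
    to treating it as a Hugging Face dataset id."""
    if Path(data).exists():
        return ("local", data)
    return ("huggingface", data)
```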
> [!NOTE]
> The splitting rules for Hugging Face datasets are similar to the local-folder logic: at least two splits, `train` and `valid`, are required.
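The split requirement above could be validated with a small check like the one below. This is a hedged sketch of the rule as stated, not code from the PR; the function name is hypothetical.

```python
def check_splits(available: set[str]) -> None:
    """Sketch of the stated requirement: a dataset used for training
    must provide at least the `train` and `valid` splits."""
    required = {"train", "valid"}
    missing = required - available
    if missing:
        raise ValueError(f"dataset is missing required splits: {sorted(missing)}")
```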
The example is as follows:

```shell
mlx_lm.lora --model mlx-community/Qwen2.5-0.5B-Instruct-4bit \
    --data mlx-community/wikisql \
    --train \
    --learning-rate 1e-5 \
    --lora-layers 12 \
    --batch-size 10 \
    --max-seq-length 16384 \
    --iters 1000
```
Logs:

```
Loading pretrained model
Fetching 9 files: 100%|██████████████████████| 9/9 [00:00<00:00, 119837.26it/s]
Loading datasets
Downloading data: 100%|████████████████████████████████████| 74.0k/74.0k [00:00<00:00, 75.2kB/s]
Downloading data: 100%|████████████████████████████████████| 10.2k/10.2k [00:00<00:00, 17.8kB/s]
Downloading data: 100%|█████████████████████████████████████| 10.3k/10.3k [00:00<00:00, 12.9kB/s]
Generating train split: 100%|████████████████████████| 1000/1000 [00:00<00:00, 55431.82 examples/s]
Generating valid split: 100%|█████████████████████████| 100/100 [00:00<00:00, 109169.81 examples/s]
Generating test split: 100%|██████████████████████████| 100/100 [00:00<00:00, 136979.23 examples/s]
Training
Trainable parameters: 0.055% (0.270M/494.005M)
Starting training..., iters: 1000
Iter 1: Val loss 2.936, Val took 1.428s
```