ml-explore / mlx-examples

Examples in the MLX framework

[BUG] missing spaces in response with mlx-lm 0.19.1 and 0.19.2 #1073

hschaeufler closed this issue 1 month ago

hschaeufler commented 1 month ago

Describe the bug

I have fine-tuned the meta-llama/Meta-Llama-3.1-8B-Instruct model in different variants with mlx-lm and LoRA to generate Dart unit tests. The models were tuned with different versions of mlx-lm (0.19.0, 0.19.1, and 0.19.2), and I also used mlx-lm to fuse the adapters back into the LLM. I have now noticed that during inference/generation with 0.19.1 and 0.19.2, spaces are missing in the generated import statements wherever they are followed by a `'`, for example: `import'package:flutter/material.dart';` instead of `import 'package:flutter/material.dart';`. Regardless of which version the model was fine-tuned with, generation works correctly if I downgrade to 0.19.0. The imports in the training data are fine. Maybe a problem with tokenization?

For example:

Generation response with 0.19.0:

Here are the widget tests for the `ListTile`:

```dart
import 'package:flutter/material.dart';
import 'package:flutter_bloc/flutter_bloc.dart';
import 'package:flutter_test/flutter_test.dart';
```

Generation response with 0.19.1:

Here are the tests for the `ListTile` class:

```dart
import'package:flutter/material.dart';
import'package:flutter_bloc/flutter_bloc.dart';
import'package:flutter_test/flutter_test.dart';
```

Another generation response with 0.19.1:

```dart
body:'body',
timestamp: DateTime.now(),
```

To Reproduce

Code snippet used to generate the tests:

```python
import logging
import os

from mlx_lm import load
from transformers import GenerationConfig

model_path = "results/llama3_1_8B_instruct_lora/tuning_09/lora_fused_model"
model, tokenizer = load(model_path)

generation_config = GenerationConfig.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
generation_args = {
    "temp": generation_config.temperature,
    "top_p": generation_config.top_p,
}

prompt_template = "Generate {test_type}s in Dart for the following {code_type}.\n" \
                  "### Path: {code_file_path}\n" \
                  "### Code:\n" \
                  "{code}\n" \
                  "### Test:\n"

def build_prompt(
        code: str, 
        code_file_path: str, 
        test_type: str, 
        code_type: str
) -> str:
    prompt = prompt_template.format(
        test_type=test_type,
        code_type=code_type,
        code_file_path=code_file_path,
        code=code
    )
    messages = [{"role": "user", "content": prompt}]
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

def generate_test(code, code_file_path, code_type, test_type) -> tuple[str, str]:
    logging.info(f"Generating test for {os.path.basename(code_file_path)}")
    prompt = build_prompt(
        code=code, 
        code_file_path=code_file_path, 
        code_type=code_type, 
        test_type=test_type
    )
    print(prompt)
```
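The snippet appears truncated: `generate_test` declares a `tuple[str, str]` return but ends at `print(prompt)`. Presumably it continues with a call roughly like the sketch below; the `max_tokens` value and the exact call are assumptions based on the mlx-lm 0.19.x `generate` API, not the reporter's code:

```python
from mlx_lm import generate

# Hypothetical continuation (not the reporter's exact code): in the 0.19.x
# API, sampling kwargs such as "temp" and "top_p" were accepted by generate().
response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,  # assumed limit
    **generation_args,
)
print(response)
```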

Expected behavior

No missing spaces before `'`.


awni commented 1 month ago

That's a bug in the detokenization. Fix here: https://github.com/ml-explore/mlx-examples/pull/1072
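For context, a minimal illustration of how this class of bug arises (this is not the mlx-lm code, and the token split shown in the comments is an assumption about the tokenizer, not verified output):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
ids = tok.encode("import 'package:flutter/material.dart';", add_special_tokens=False)

# In BPE-style vocabularies a leading space is stored inside the token itself,
# rendered as "Ġ" in the Hugging Face token strings, e.g. "Ġ'" for " '".
print(tok.convert_ids_to_tokens(ids))

# A streaming detokenizer has to translate that "Ġ" marker back into a space.
# If its condition for inserting the space does not cover a following quote
# character, the output collapses to import'package:... as reported above.
```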

hschaeufler commented 1 month ago

Many thanks for the fix. That was quick. :)