aaronhaefner / applications-of-llm

Program suite and applications of LLMs

Add Testing with `pytest` #9

Open aaronhaefner opened 3 months ago

aaronhaefner commented 3 months ago

Add testing that can be run via CI/CD pipelines and `pytest`. This will ensure that updates to the existing generative pipeline do not change its behavior unexpectedly and that it continues to pass the tests it passed at baseline before the update.
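One possible shape for such a baseline regression test is sketched below. It assumes a hypothetical JSON file of prompt-to-expected-output pairs (`tests/baseline_outputs.json`) and a placeholder model name; neither exists in the repo yet, so paths, names, and generation settings would need to be adapted.

import json

import pytest
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

BASELINE_FILE = "tests/baseline_outputs.json"  # hypothetical: {"prompt": "expected output", ...}
MODEL_NAME = "your-model-name"  # placeholder for the pipeline's actual model


@pytest.fixture(scope="module")
def model_and_tokenizer():
    # Load the model once per test module to keep CI runs reasonably fast
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    return model, tokenizer


def test_outputs_match_baseline(model_and_tokenizer):
    model, tokenizer = model_and_tokenizer
    with open(BASELINE_FILE) as f:
        baseline = json.load(f)

    for prompt, expected in baseline.items():
        inputs = tokenizer(prompt, return_tensors="pt")
        # Greedy decoding (do_sample=False) keeps generation deterministic across runs
        outputs = model.generate(**inputs, do_sample=False, max_new_tokens=64)
        decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
        assert decoded == expected, f"Output drifted for prompt {prompt!r}: got {decoded!r}"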

aaronhaefner commented 2 months ago

To start, consider boilerplate testing fixtures: for example, initialize a model, then generate an output whose answer is already known. The model shouldn't deviate from that answer, so the test should pass every time.

import pytest
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM


@pytest.fixture
def model_and_tokenizer():
    # "your-model-name" is a placeholder for the pipeline's actual seq2seq model
    model_name = "your-model-name"
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer


def test_model_output(model_and_tokenizer):
    model, tokenizer = model_and_tokenizer
    input_text = "Translate English to French: How are you?"
    inputs = tokenizer.encode(input_text, return_tensors="pt")
    # Greedy decoding (the default) keeps the output deterministic from run to run
    outputs = model.generate(inputs)
    decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # The expected string depends on the chosen model; pin it once a baseline is established
    expected_output = "Comment ça va?"
    assert decoded_output == expected_output, f"Expected {expected_output}, but got {decoded_output}"
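These tests could then run locally or in the CI/CD pipeline with a plain `pytest` invocation (e.g. `pytest -q tests/`); the exact workflow configuration is left open here.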