
Python: Unable to pass custom parameters to model services #2594

Closed. Chris-hughes10 closed this issue 8 months ago.

Chris-hughes10 commented 1 year ago

Describe the bug Parameters are passed to model services using PromptTemplateConfig.CompletionConfig. However, these parameters are specific to OpenAI models. I am using a custom connector and would like to be able to pass a parameter that is not currently defined in that config.
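
For reference, the settings object a connector receives is roughly shaped like this (paraphrased sketch, not a verbatim copy of the source; exact fields and defaults may differ). Every field mirrors an OpenAI sampling parameter, and there is no top_k or generic pass-through:

from dataclasses import dataclass, field
from typing import List

# Paraphrased sketch of CompleteRequestSettings (field list and defaults
# are approximate). Note the absence of top_k or any extension point.
@dataclass
class CompleteRequestSettings:
    temperature: float = 0.0
    top_p: float = 1.0
    presence_penalty: float = 0.0
    frequency_penalty: float = 0.0
    max_tokens: int = 256
    number_of_responses: int = 1
    stop_sequences: List[str] = field(default_factory=list)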

To Reproduce

Here is a toy example to illustrate this:

from typing import List, Optional, Union

import torch
from curated_transformers.generation import (
    AutoGenerator,
    SampleGeneratorConfig,
)
import semantic_kernel as sk
from semantic_kernel.connectors.ai.ai_exception import AIException
from semantic_kernel.connectors.ai.complete_request_settings import (
    CompleteRequestSettings,
)
from semantic_kernel.connectors.ai.text_completion_client_base import (
    TextCompletionClientBase,
)

class CuratedTransformersCompletion(TextCompletionClientBase):
    def __init__(
        self,
        model_name: str,
        device: Optional[int] = -1,
    ) -> None:
        """
        Use a curated transformer model for text completion.

        Arguments:
            model_name {str}
            device_idx {Optional[int]} -- Device to run the model on, -1 for CPU, 0+ for GPU.

        Note that this model will be downloaded from the Hugging Face model hub.
        """
        self.model_name = model_name
        self.device = "cuda:" + str(device) if device >= 0 and torch.cuda.is_available() else "cpu"

        self.generator = AutoGenerator.from_hf_hub(
            name=model_name, device=torch.device(self.device)
        )

    async def complete_async(
        self, prompt: str, request_settings: CompleteRequestSettings
    ) -> Union[str, List[str]]:
        # top_k is required here, but CompleteRequestSettings has no such
        # field, so there is no way to receive it from the function config
        generator_config = SampleGeneratorConfig(
            temperature=request_settings.temperature,
            top_p=request_settings.top_p,
            top_k=request_settings.top_k,  # AttributeError: top_k does not exist
        )
        try:
            with torch.no_grad():
                result = self.generator([prompt], generator_config)

            return result[0]

        except Exception as e:
            raise AIException(
                AIException.ErrorCodes.ServiceError,
                "CuratedTransformer completion failed",
                e,
            )

    async def complete_stream_async(self, prompt: str, request_settings: CompleteRequestSettings):
        raise NotImplementedError("Streaming is not supported for CuratedTransformersCompletion.")

kernel = sk.Kernel()

kernel.add_text_completion_service(
    "falcon-7b", CuratedTransformersCompletion(model_name="tiiuae/falcon-7b", device=0)
)

config_dict = {
    "schema": 1,
    # The type of prompt
    "type": "completion",
    # A description of what the semantic function does
    "description": "Provides information about a capital city, which is given as an input",
    # Specifies which model service(s) to use
    "default_services": ["falcon-7b"],
    # The parameters that will be passed to the connector and model service
    "completion": {
        "temperature": 0.0,
        "top_p": 0.0,
        "top_k": 0.4,  # <-- this is not read
    },
    # Defines the variables that are used inside of the prompt
    "input": {
        "parameters": [
            {
                "name": "input",
                "description": "The name of the capital city",
                "defaultValue": "",
            }
        ]
    },
}

prompt_template_config = sk.PromptTemplateConfig.from_dict(config_dict)

prompt_template = sk.PromptTemplate(
    "{{$input}} is the capital city of", kernel.prompt_template_engine, prompt_template_config
)

function_config = sk.SemanticFunctionConfig(prompt_template_config, prompt_template)
falcon_complete = kernel.register_semantic_function(
    skill_name="Falcon7BComplete", function_name="falcon7b_complete", function_config=function_config
)
print(falcon_complete("Paris"))

Expected behavior I would expect any completion parameters defined in my config.json to be passed through to the model service. This would make things more flexible for non-OpenAI services.
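
In the meantime, a possible workaround (just a sketch, not part of the SK API) is to hold any non-standard generation parameters on the connector instance itself, so nothing extra has to flow through CompleteRequestSettings:

# Interim workaround sketch (not SK API): capture extra generation
# parameters at construction time instead of via the request settings.
class CuratedTransformersCompletionWithExtras(CuratedTransformersCompletion):
    def __init__(self, model_name, device=-1, **extra_generation_kwargs):
        super().__init__(model_name=model_name, device=device)
        self.extra_generation_kwargs = extra_generation_kwargs  # e.g. top_k=40

    async def complete_async(self, prompt, request_settings):
        generator_config = SampleGeneratorConfig(
            temperature=request_settings.temperature,
            top_p=request_settings.top_p,
            **self.extra_generation_kwargs,  # top_k arrives here instead
        )
        with torch.no_grad():
            result = self.generator([prompt], generator_config)
        return result[0]

kernel.add_text_completion_service(
    "falcon-7b",
    CuratedTransformersCompletionWithExtras(
        model_name="tiiuae/falcon-7b", device=0, top_k=40
    ),
)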

Platform

Additional context This is also an issue when using the Hugging Face connector; it is not possible to pass additional parameters to control the generation settings.

nacharya1 commented 1 year ago

@Chris-hughes10 thanks for providing this feedback; we will take a look into this. cc: @lemillermicrosoft

matthewbolanos commented 11 months ago

This is related to #3312

eavanvalkenburg commented 8 months ago

This is now much easier and has been done for all current connectors.
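
For anyone finding this later, a rough sketch of the newer pattern (class and module names taken from later semantic-kernel Python releases and assumed here; details may have shifted between versions): each connector now defines its own subclass of PromptExecutionSettings, so custom fields such as top_k are first-class, and unknown keys are kept in extension_data rather than dropped.

# Sketch based on later semantic-kernel Python releases (names assumed;
# details may differ across versions). Execution settings are pydantic
# models, so a custom connector can declare whatever fields it needs.
from semantic_kernel.connectors.ai.prompt_execution_settings import (
    PromptExecutionSettings,
)

class CuratedTransformersExecutionSettings(PromptExecutionSettings):
    temperature: float = 1.0
    top_p: float = 1.0
    top_k: int = 40  # the custom parameter this issue asked for

settings = CuratedTransformersExecutionSettings(top_k=10)

# Even on the generic base class, unrecognized keys are preserved in
# extension_data instead of being silently discarded:
generic = PromptExecutionSettings(top_k=10)
print(generic.extension_data["top_k"])  # 10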