Querent-ai / querent-python-research

Querent
https://querent.xyz

[LLM] Base class design for LLMs #2

Closed saraswatpuneet closed 9 months ago

debanjanamex commented 1 year ago

Something like this

from abc import ABC, abstractmethod
from typing import Optional


class LLM(ABC):
    """Base class to implement a new LLM."""

    last_prompt: Optional[str] = None

    @property
    def type(self) -> str:
        """
        Return type of LLM.

        Raises:
            MethodNotImplementedError: Type has not been implemented

        Returns:
            str: Type of LLM as a string
        """
        raise MethodNotImplementedError("Type has not been implemented")

    @abstractmethod
    def call(self, instruction: Prompt, value: str, suffix: str = "") -> str:
        """
        Execute the LLM with the given prompt.

        Args:
            instruction (Prompt): Prompt
            value (str): Value
            suffix (str, optional): Suffix. Defaults to "".

        Raises:
            MethodNotImplementedError: Call method has not been implemented
        """
        raise MethodNotImplementedError("Call method has not been implemented")

    def generate_code(self, instruction: Prompt, prompt: str) -> str:
        """
        Generate code based on the instruction and the given prompt.

        Returns:
            str: Code
        """
        return self._extract_code(self.call(instruction, prompt, suffix="\n\nCode:\n"))
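To show how the contract above would be exercised, here is a minimal sketch of a concrete subclass. `EchoLLM` is hypothetical, and the `MethodNotImplementedError` and `Prompt` stubs stand in for the project's own definitions, which are not shown in this thread:

```python
from abc import ABC, abstractmethod
from typing import Optional


# Stub types standing in for the project's own definitions (assumptions).
class MethodNotImplementedError(Exception):
    pass


class Prompt(str):
    """Placeholder for the project's Prompt type."""


class LLM(ABC):
    """Condensed version of the base class proposed above."""

    last_prompt: Optional[str] = None

    @abstractmethod
    def call(self, instruction: Prompt, value: str, suffix: str = "") -> str:
        raise MethodNotImplementedError("Call method has not been implemented")


class EchoLLM(LLM):
    """Hypothetical subclass that just echoes the prompt, to test the contract."""

    @property
    def type(self) -> str:
        return "echo"

    def call(self, instruction: Prompt, value: str, suffix: str = "") -> str:
        # Record the last prompt, as the base class's attribute suggests.
        self.last_prompt = str(instruction) + value + suffix
        return self.last_prompt


llm = EchoLLM()
print(llm.call(Prompt("Summarize: "), "hello"))  # Summarize: hello
```

A real subclass would replace the echo with an API call or local inference; the point is only that `call` is the single abstract method concrete LLMs must provide.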
debanjanamex commented 1 year ago

@saraswatpuneet this is a starting point, and we build on these lines. The classes that inherit from this base class will do the meat of the processing.

saraswatpuneet commented 1 year ago

Yep, similar: have some basic abstract methods defined, and we can add more as the need arises. At a minimum, the LLM should be able to generate entities and relationships, support chat, and more if we find anything else useful.

I like that everyone is on board :)
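The minimum surface described above could be sketched as a set of abstract methods. All names here (`QuerentLLM`, `extract_entities`, `extract_relationships`, `chat`, `DummyLLM`) are illustrative, not from the codebase:

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class QuerentLLM(ABC):
    """Illustrative minimum interface: entities, relationships, chat."""

    @abstractmethod
    def extract_entities(self, text: str) -> List[str]:
        """Return the entities found in the input text."""

    @abstractmethod
    def extract_relationships(self, text: str) -> List[Tuple[str, str, str]]:
        """Return (subject, predicate, object) triples found in the text."""

    @abstractmethod
    def chat(self, message: str) -> str:
        """Answer a free-form user message."""


class DummyLLM(QuerentLLM):
    """Trivial stand-in so the interface can be exercised without a model."""

    def extract_entities(self, text: str) -> List[str]:
        # Naive heuristic: treat capitalized words as entities.
        return [w for w in text.split() if w.istitle()]

    def extract_relationships(self, text: str) -> List[Tuple[str, str, str]]:
        return []

    def chat(self, message: str) -> str:
        return f"echo: {message}"
```

New capabilities would be added as further abstract methods, so every concrete LLM is forced to either implement or explicitly decline them.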

saraswatpuneet commented 1 year ago

Designing the base for the LLM, a few things to keep in mind about what faculties it will be used for:

Input data: ideally we can stream the data from ingestors into it. Do we need tokens, or can we do a streaming one-pass training on the LLM as data is getting ingested? Debanjan Datta and Nishant Gupta would know better, as this is more on the research side for me.

Query interface: simple chat? Some functions that can be used to communicate with their data through a chat interface.

Recommendations for entities and relationships.

Can we leverage vector space as a lens into the LLM that can adjust its weights accordingly? (I might be sounding crazy here.)

So from a querent perspective our llms package must have such functionalities and possibly some good techniques we can find
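The "stream data from ingestors" idea above could be expressed as a generator-based interface, so token batches flow to the LLM one pass at a time without buffering the whole corpus. This is a sketch under assumptions: `token_stream` is a hypothetical name, and whitespace splitting stands in for a real tokenizer:

```python
from typing import Iterable, Iterator, List


def token_stream(documents: Iterable[str], max_tokens: int = 512) -> Iterator[List[str]]:
    """Hypothetical one-pass streaming tokenizer: yield fixed-size token
    batches as documents arrive from the ingestors, without holding the
    whole corpus in memory."""
    batch: List[str] = []
    for doc in documents:
        for token in doc.split():  # stand-in for a real tokenizer
            batch.append(token)
            if len(batch) == max_tokens:
                yield batch
                batch = []
    if batch:
        yield batch  # flush the remainder
```

A training loop would then consume batches lazily, e.g. `for batch in token_stream(ingested_docs): ...`, which is what makes single-pass training on data that is still being ingested possible.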

debanjanamex commented 1 year ago

Collectors -> Ingestors -> Tokenizer -> LLM:

We can have one ingestor class for all kinds of collector classes. The tokenizer is best kept separate for single- and multi-modal encoders. What do you think?
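The pipeline above can be modeled as a lazy chain, with the tokenizer as its own abstraction so single- and multi-modal implementations live side by side rather than in one catch-all class. Class and function names here (`Tokenizer`, `TextTokenizer`, `pipeline`) are illustrative:

```python
from abc import ABC, abstractmethod
from typing import Iterable, Iterator, List


class Tokenizer(ABC):
    """Kept separate, as suggested: one implementation per modality."""

    @abstractmethod
    def tokenize(self, item: str) -> List[str]:
        ...


class TextTokenizer(Tokenizer):
    """Single-modal (text) tokenizer; a multimodal sibling would be a
    separate subclass alongside it, not folded into it."""

    def tokenize(self, item: str) -> List[str]:
        return item.lower().split()


def pipeline(collected: Iterable[str], tokenizer: Tokenizer) -> Iterator[List[str]]:
    """Collectors -> Ingestors -> Tokenizer -> LLM, as a lazy chain.
    Here `collected` stands in for the output of the single ingestor
    class that serves every collector type."""
    for raw in collected:
        yield tokenizer.tokenize(raw)
```

Because each stage only consumes an iterable from the previous one, stages can be swapped (e.g. a different tokenizer per modality) without touching the rest of the chain.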

saraswatpuneet commented 1 year ago

I like it. I can imagine the flow of data through them in sequence, which is awesome. Yes, I agree we should keep a separate implementation for each 👍

saraswatpuneet commented 1 year ago

Cool, yeah, propose a PR. Ideally add some README(s); that could be a good starting point while the DB part etc. comes up (in progress).