iryna-kondr / scikit-llm

Seamlessly integrate LLMs into scikit-learn.
https://beastbyte.ai/
MIT License
3.38k stars · 275 forks

What is the `fit` method actually doing? #11

Closed kb-open closed 1 year ago

kb-open commented 1 year ago

Hi, great work! I have three questions:

1) Referring to your README example: as part of the `fit` method in ZeroShotGPTClassifier with gpt-3.5-turbo as the model, are you essentially freezing the ada-02 embeddings and then adding some layer on top for the classification task? I'm asking because the OpenAI API supports fine-tuning only up to GPT-3.
2) Or are you simply using it as a zero-shot classifier, with no real training happening? That is, does the `fit` method only map to some prompts that are relevant for a classification task?
3) How can scikit-llm be used for fine-tuning (on private data) for tasks such as summarization or question answering?

Thanks!

OKUA1 commented 1 year ago

Hi @kb-open,

  1. In ZeroShotGPTClassifier no actual training is done; we just use zero-shot prompts and extract the output. The main purpose of fit is to "memorize" the labels seen in the training set so they can be used for prompting and output validation.

  2. Using GPTVectorizer you can embed the text using ada-02 and add anything on top.

  3. For now, it is possible neither to fine-tune the GPT models nor to perform the tasks you mentioned. We are planning to add GPTSummarizer as a preprocessor (similar to GPTVectorizer) later this week, with fine-tuning options in the future (no fixed timeline for that, though).

  4. We did not evaluate the possibility of supporting the question-answering task yet, but we will.
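To make point 1 concrete, the idea can be sketched with a toy class (illustrative only, not scikit-llm's actual implementation; the class name and methods here are hypothetical): `fit` merely records the label set, and those labels are later injected into a zero-shot prompt whose answer is validated against them.

```python
from typing import List

class ToyZeroShotClassifier:
    """Sketch of the idea behind ZeroShotGPTClassifier.fit; not the real code."""

    def fit(self, X: List[str], y: List[str]) -> "ToyZeroShotClassifier":
        # "Training" only memorizes the label set seen in y; no weights are updated.
        self.labels_ = sorted(set(y))
        return self

    def build_prompt(self, text: str) -> str:
        # At predict time, the memorized labels are placed into a zero-shot prompt,
        # and the model's answer would be validated against self.labels_.
        options = ", ".join(self.labels_)
        return f"Classify the text into one of [{options}].\nText: {text}\nLabel:"

clf = ToyZeroShotClassifier().fit(
    ["great acting", "terrible plot"], ["positive", "negative"]
)
print(clf.labels_)              # the only thing fit "learned"
print(clf.build_prompt("loved it"))
```

In the real library the prompt goes to the chat model and the returned label is checked against the memorized set; no gradients or added layers are involved.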