Hi @kb-open,

In `ZeroShotGPTClassifier` no actual training is being done; we just use zero-shot prompts and extract the output. The main purpose of `fit` is to "memorize" the labels seen in the training set so they can be used for prompting and output validation.
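Concretely, this means `fit` is cheap and the model weights are untouched. A minimal sketch along the lines of the README example (the API key is a placeholder):

```python
from skllm.config import SKLLMConfig
from skllm import ZeroShotGPTClassifier

SKLLMConfig.set_openai_key("<YOUR_OPENAI_KEY>")  # placeholder

X = ["The movie was great!", "The plot made no sense at all."]
y = ["positive", "negative"]

clf = ZeroShotGPTClassifier(openai_model="gpt-3.5-turbo")
clf.fit(X, y)  # no weight updates -- only records the candidate label set
print(clf.predict(["I loved every minute of it."]))  # e.g. ['positive']
```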
Using `GPTVectorizer` you can embed the text with `text-embedding-ada-002` and add anything on top.
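For illustration, here is a sketch of that pattern with a scikit-learn pipeline; the logistic regression is just one example of the "anything on top" part, and any downstream estimator would do:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from skllm.config import SKLLMConfig
from skllm.preprocessing import GPTVectorizer

SKLLMConfig.set_openai_key("<YOUR_OPENAI_KEY>")  # placeholder

X = ["I enjoyed this product.", "Terrible customer service."]
y = ["positive", "negative"]

pipe = Pipeline([
    ("embed", GPTVectorizer()),     # text -> ada-002 embeddings
    ("clf", LogisticRegression()),  # any downstream model on top
])
pipe.fit(X, y)  # here the downstream classifier *is* actually trained
print(pipe.predict(["Would buy again."]))
```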
For now it is possible neither to fine-tune the GPT models nor to perform the tasks you mentioned. We are planning to add `GPTSummarizer` as a preprocessor (similar to `GPTVectorizer`) later this week, and fine-tuning options in the future (no fixed timeline for that, though).
We have not evaluated the possibility of supporting the question-answering task yet, but we will.
Hi, great work! I have 3 questions:

1) Referring to your example in the README: as part of the `fit` method in `ZeroShotGPTClassifier` with `gpt-3.5-turbo` as the model, are you basically freezing the `ada-002` embeddings and then adding some layer on top for the classification task? I'm asking because the OpenAI APIs support fine-tuning only up to GPT-3.
2) Or are you simply using it as a zero-shot classifier, with no real training happening? That is, does the `fit` method only map to some prompts relevant for a classification task?
3) How can scikit-llm be used for fine-tuning (on private data) on tasks such as summarization or question-answering?

Thanks!