OpenAdaptAI / OpenAdapt

Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large Language Models (LLMs), Large Action Models (LAMs), Large Multimodal Models (LMMs), and Visual Language Models (VLMs)
https://www.OpenAdapt.AI
MIT License

Design: Fine-Tuning #69

Open abrichr opened 1 year ago

abrichr commented 1 year ago

Feature request

We would like to implement fine-tuning.

This task involves considering the tradeoffs between various approaches to improving action completions and outcome evaluation via fine-tuning.

More generally, this also involves:

  1. Creating a training set
  2. Fine-tuning on that training set
  3. Comparing the results
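The three steps above could be sketched as a minimal pipeline. All names here (the `Example` class, the function names, the event schema) are hypothetical illustrations, not existing OpenAdapt code:

```python
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str       # serialized window/action state
    completion: str   # the action the model should emit

def create_training_set(recordings):
    """Step 1: flatten recorded events into prompt/completion pairs."""
    return [
        Example(prompt=str(event["state"]), completion=str(event["action"]))
        for recording in recordings
        for event in recording
    ]

def fine_tune(examples, model_name):
    """Step 2: placeholder for a provider-specific fine-tuning call."""
    # a real implementation would upload examples and start a fine-tuning job
    return f"{model_name}-finetuned"

def compare(base_model, tuned_model, held_out):
    """Step 3: evaluate both models on held-out examples and compare."""
    # stubbed; a real implementation would score action completions
    return {"base": base_model, "tuned": tuned_model, "n_eval": len(held_out)}

recordings = [[{"state": {"title": "Notepad"}, "action": {"type": "click"}}]]
examples = create_training_set(recordings)
tuned = fine_tune(examples, "davinci")
report = compare("davinci", tuned, examples)
print(report)
```

The value of stubbing it this way is that the dataset format and the evaluation harness can be built and tested before committing to any particular fine-tuning provider.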

Motivation

https://arxiv.org/abs/2406.03679

Autonomous agents that control computer interfaces to accomplish human tasks are emerging. Leveraging LLMs to power such agents has been of special interest, but unless fine-tuned on human-collected task demonstrations, performance is still relatively low.

Related

https://github.com/MLDSAI/OpenAdapt/issues/70 https://github.com/MLDSAI/OpenAdapt/issues/72 https://github.com/OpenAdaptAI/OpenAdapt/issues/415 https://github.com/OpenAdaptAI/OpenAdapt/issues/748

Bounty

A paid bounty is available. Please suggest a price range 🙏

FFFiend commented 1 year ago

Currently iterating on this issue through #327 by testing various event sequences to identify failure cases. To that end, current action items include:

1) Researching fine-tuning of LLMs in general.
2) Writing a fine-tuning pipeline for GPT-4 for Events.
3) Generalizing the pipeline to arbitrary LLMs, with only the model-specific API calls (Hugging Face, etc.) differing.
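Item 3 — a pipeline that is generic except for the model-specific API calls — might be structured as an abstract base class. This is a sketch under that assumption; none of these classes exist in OpenAdapt:

```python
from abc import ABC, abstractmethod

class FineTunePipeline(ABC):
    """Generic fine-tuning pipeline; only submit() is model-specific."""

    def run(self, events):
        dataset = self.prepare(events)
        return self.submit(dataset)

    def prepare(self, events):
        # shared, model-agnostic preprocessing of recorded events
        return [
            {"prompt": str(e["input"]), "completion": str(e["output"])}
            for e in events
        ]

    @abstractmethod
    def submit(self, dataset):
        """Provider-specific API call (OpenAI, Hugging Face, ...)."""

class DummyPipeline(FineTunePipeline):
    """Stand-in provider used to exercise the shared code path."""
    def submit(self, dataset):
        return f"submitted {len(dataset)} examples"

result = DummyPipeline().run([{"input": "a", "output": "b"}])
print(result)
```

Each real provider would then subclass `FineTunePipeline` and implement only `submit()`, keeping dataset preparation identical across models.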

FFFiend commented 1 year ago

https://medium.com/@jeremyarancio/fine-tune-an-llm-on-your-personal-data-create-a-the-lord-of-the-rings-storyteller-6826dd614fa9

Useful article; it covers training as well as techniques like quantization and LoRA. Pretty educational for getting an idea of what fine-tuning an LLM looks like.
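For reference, the core idea of LoRA (mentioned in the article) is adding a trainable low-rank update to a frozen weight matrix, so only a small fraction of parameters are trained. A minimal numpy illustration of the math, independent of any fine-tuning library:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 8, 8, 2   # weight shape d x k, adapter rank r << min(d, k)
alpha = 4           # LoRA scaling hyperparameter

W = rng.normal(size=(d, k))          # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                 # trainable, zero init => update starts at 0

def forward(x):
    # y = W x + (alpha / r) * B A x ; only A and B receive gradients
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=k)
# with B = 0 the adapted model initially matches the base model exactly
assert np.allclose(forward(x), W @ x)
print("trainable params:", A.size + B.size, "vs full:", W.size)
```

Even in this toy case the adapter trains 32 parameters instead of 64; at transformer scale the ratio is what makes fine-tuning feasible on modest hardware.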

Some immediate action items may include:

1) Working more closely with Mind2Web's codebase once they release the fine-tuning code. The reason I say this is that the training described in the article above seems black-boxed to me, i.e. it's not clear how/where the LLM is shown the right answer to a given input when generating a completion.

2) Building a dataset of Window and Action events. We can distill these from our recordings and pool them into a dataset comprising these event dicts for training, validation, and testing.
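Item 2 could start as simply as serializing each (window event, action event) pair to JSONL and splitting. A sketch assuming an illustrative event schema (not OpenAdapt's actual one):

```python
import json
import random

def events_to_jsonl(pairs):
    """Serialize (window_event, action_event) dict pairs to JSONL lines."""
    return [
        json.dumps({"prompt": json.dumps(window), "completion": json.dumps(action)})
        for window, action in pairs
    ]

def split(lines, train=0.8, val=0.1, seed=0):
    """Shuffle deterministically and split into train/val/test lists."""
    lines = list(lines)
    random.Random(seed).shuffle(lines)
    n = len(lines)
    n_train, n_val = int(n * train), int(n * val)
    return lines[:n_train], lines[n_train:n_train + n_val], lines[n_train + n_val:]

# toy events standing in for recorded window/action dicts
pairs = [({"title": f"win{i}"}, {"name": "click", "x": i}) for i in range(10)]
lines = events_to_jsonl(pairs)
train_set, val_set, test_set = split(lines)
print(len(train_set), len(val_set), len(test_set))
```

The prompt/completion JSONL shape matches what most hosted fine-tuning APIs expect, so the same files could feed either an OpenAI-style endpoint or a local trainer.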

abrichr commented 1 year ago

I think we want something like:

python -m openadapt.finetune --recording_id <recording_id> --model <model_name>
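A minimal CLI skeleton for that interface (hypothetical; an `openadapt.finetune` module does not exist yet):

```python
import argparse

def make_parser():
    """Parser for the proposed `python -m openadapt.finetune` entry point."""
    parser = argparse.ArgumentParser(
        prog="openadapt.finetune",
        description="Fine-tune a model on a recorded session.",
    )
    parser.add_argument(
        "--recording_id", type=int, required=True,
        help="ID of the recording to build the training set from",
    )
    parser.add_argument(
        "--model", default="davinci",
        help="base model to fine-tune",
    )
    return parser

# simulate the invocation suggested above
args = make_parser().parse_args(["--recording_id", "1", "--model", "davinci"])
print(args.recording_id, args.model)
```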
FFFiend commented 1 year ago

https://platform.openai.com/docs/guides/fine-tuning — if you scroll down a little you can see that neither GPT-4 nor GPT-3.5-turbo is available for fine-tuning at the moment 😞 We could use the davinci base model, although I'm now curious which model Mind2Web does their fine-tuning on 🤔

abrichr commented 3 months ago

@bi-loop any interest? 🙏