OpenAI CLI Tools for Chat Fine-Tuning

henriqueln7 commented 1 year ago

Describe the feature or improvement you're requesting

Hello everyone,

When using legacy fine-tuning, I find the OpenAI CLI extremely helpful due to its numerous tools. For instance, the Prepare Data Helper and the Create Fine-Tuning are particularly useful.

However, these tools only apply to legacy models, which consist of JSON with prompt and completion keys.

I propose the addition of operations to the existing CLI that can perform the same functions for the new chat fine-tuning.

My Proposal

For the sake of backwards compatibility, we could create a new subcommand called chat_fine_tunes.
- This subcommand would inherit all operations that fine_tunes can perform, such as assisting with data preparation, etc. We can simply replicate the existing operations with minor modifications to suit the new format.

Additional context

I am open to working on this feature if it is approved.

mina6765 commented 1 year ago

hello

rattrayalex commented 11 months ago

Hi @henriqueln7 , do you remain interested in working on this? What interface would you propose?

henriqueln7 commented 11 months ago

Hey, @rattrayalex. I indeed remain interested in working on this :)

I propose the creation of a new subcommand called chat_fine_tunes. It would function as follows:


# This subcommand would assist with `.json, .jsonl` files. The formats `.csv, .txt, .tsv, .xlsx` seem incompatible with this new format (I am open to suggestions here).
# The new subcommand will perform the same operations that already exist:
# - Checking for potential improvements (removing duplicates, verifying the presence of system messages)
# - Generating a `file_prepared.jsonl` file suitable for fine-tuning
openai tools chat_fine_tunes.prepare_data -f <LOCAL_FILE>

# Create a fine_tune job
openai api chat_fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

# List existing fine-tunings
openai api chat_fine_tunes.list

# Retrieve the status of a fine-tuning job. The output includes
# the job status (which can be pending, running, succeeded, or failed),
# among other details.
openai api chat_fine_tunes.get -i <YOUR_FINE_TUNE_JOB_ID>

# Cancel a fine-tuning job
openai api chat_fine_tunes.cancel -i <YOUR_FINE_TUNE_JOB_ID>

Questions

When I initially proposed this change, version 1.0 of the CLI had not been introduced. I noticed that all openai api fine_tunes commands were removed (although they are still mentioned in the documentation). Are there plans to also phase out the existing support for data preparation in the legacy manner? If that's the case, maybe it would be better for me to adapt the existing command rather than creating a new one.

rattrayalex commented 11 months ago

Thanks @henriqueln7 ! We'd be open to PR's for this. @jhallard can help with questions.

aanaseer commented 7 months ago

Hi, I see this issue has been pending for a while. I have developed a solution and would like to contribute by submitting a PR. Would that be alright with everyone involved @rattrayalex?

rattrayalex commented 7 months ago

Please do! PRs are always welcome.

openai / openai-python