Fine Tune gpt-3.5-turbo-1106 for Tone Analysis

Task Description

Fine Tune gpt-3.5-turbo-1106 for Tone Analysis with Turkish Tweets 5 Label Tone Analysis Dataset. GPT should be able to correctly classify Turkish Tweets into one of the 5 classes given as : ["Kızgın", "Korku", "Mutlu", "Sürpriz", "Üzgün"]

Implementation Details

Prepare samples in JSONL format for Fine Tune operation as specified in the OpenAI API The example prompt samples used for training should include system messages telling the system to classify user input and the class value that the model should give in return. Save samples as JSONL and create a fine-tune job in OpenAI API for the gpt-3.5-turbo-1106 model.

Design and Tasks

Develop a script that converts tweet-tone data received in CSV format via Kaggle to the Fine Tune format given in the OpenAI API. This format should include a system message to the model instructing it to classify the given sentence, the content of the tweet as user input, and the output the model should provide as an assistant message. Split the dataset as Train, test. Develop a Python script that tests the base gpt-3.5-turbo-1106 model with test data to get the performance of the base model. Then start a Fine-tune job on the API with the training data. After training test the new fine-tuned model and compare it with the base model

Acceptance Criteria

The Fine-tuned model should result in significant improvement in the 5-label classification task.

FacVain / dil-asistanim