X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Apache License 2.0

Finetuning TinyChart #72

Open ViCtOr-dev13 opened 1 month ago

ViCtOr-dev13 commented 1 month ago

Hello, I would like to fine-tune or train TinyChart to improve its summarization skills. My impression is that it doesn't capture all the data during summarization, whereas it does when converting chart data into tables. I can't find fine-tuning template code specifically for TinyChart. Could you please help me?

zhangliang-04 commented 1 month ago

Hi @ViCtOr-dev13, you are right. The chart-to-table task requires the model to extract all the data presented in the chart, but the summarization task doesn't. This is related to the text style of the summaries in the training data. We plan to release the training code in early June. If your need is urgent, you can refer to the training code of TinyLLaVA, which is the repo our code is based on.
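
To get a head start from TinyLLaVA, something like the following should work; note that the repo URL and the scripts/ layout are assumptions about TinyLLaVA's public repo, not paths confirmed in this thread.

# A minimal sketch for locating TinyLLaVA's training entry points
# (repo URL and directory layout assumed, not confirmed here)
git clone https://github.com/DLCV-BUAA/TinyLLaVABench.git
cd TinyLLaVABench
ls scripts/   # look for the pretraining/fine-tuning launch scripts here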

zhangliang-04 commented 1 month ago

Hi @ViCtOr-dev13, We just released the training code in this repo. Please try to pull the latest code. Have fun!
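
If you already have a clone, pulling is enough to pick up the new code; the TinyChart/ subdirectory below is where the training code is assumed to live in this repo.

# Update an existing clone to get the newly released training code
cd mPLUG-DocOwl
git pull
ls TinyChart/   # train.sh and the training scripts are assumed to land here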

ViCtOr-dev13 commented 4 weeks ago

Hello @zhangliang-04, I saw you released the training code, thank you. I've tried it but I'm running into some issues. I would like to fine-tune your model, so I uncommented this part in train.sh:

#!/bin/bash
TRAIN_DATA=../data_test/train_finetune_captions.json
TEST_DATA=../data_test/test_finetune_captions.json

#LLM_PATH=bczhou/TinyLLaVA-3.1B
#VIT_PATH=pretrained_models/TinyLLaVA-3.1B-SigLIP

# If you want to fine-tune TinyChart-3B-768:
LLM_PATH=mPLUG/TinyChart-3B-768
VIT_PATH=mPLUG/TinyChart-3B-768-siglip

First, is it possible to fine-tune like this? Also, is a T4 GPU with 15 GB of VRAM enough? Thank you for your time.
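
For reference, here is the rough estimate behind my doubt about the T4; the numbers are my own back-of-the-envelope assumptions for full fine-tuning of a ~3B-parameter model with Adam in mixed precision, ignoring gradients and activations:

# fp16 weights + fp32 master weights + two fp32 Adam moments (a lower bound)
PARAMS=3000000000
GIB=$((1024 * 1024 * 1024))
echo "fp16 weights:        $((PARAMS * 2 / GIB)) GiB"   # ~5 GiB
echo "fp32 master copy:    $((PARAMS * 4 / GIB)) GiB"   # ~11 GiB
echo "Adam moments (fp32): $((PARAMS * 8 / GIB)) GiB"   # ~22 GiB
# Roughly 38 GiB before activations, so full fine-tuning won't fit in 15 GB;
# LoRA or freezing most of the model would likely be needed on a T4.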