yukiarimo opened 12 hours ago
Hey! Oh hmm, for now you need to have 1 image paired with the text during finetuning. I'm working on allowing mixed (text only) + (text + image) finetuning, but for now that'll require a custom data collator.
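The rough idea of such a collator would be something like this (a very rough, untested sketch, not our actual implementation; it assumes each batch is pre-grouped so it's either all text+image or all text-only, and that `processor` is a vision model's `AutoProcessor`):

```python
class MixedDataCollator:
    """Sketch: batch text-only and text+image examples with one processor.
    Assumes batches are pre-grouped by modality (illustrative only)."""

    def __init__(self, processor):
        self.processor = processor  # e.g. AutoProcessor for a vision LLM

    def __call__(self, examples):
        texts  = [ex["text"] for ex in examples]
        images = [ex.get("image") for ex in examples]
        if all(img is not None for img in images):
            # text + image batch: the processor also builds the pixel tensors
            batch = self.processor(text=texts, images=images,
                                   padding=True, return_tensors="pt")
        else:
            # text-only batch: plain tokenization, no vision tensors
            batch = self.processor(text=texts, padding=True,
                                   return_tensors="pt")
        # naive labels; real code would mask padding/image tokens with -100
        batch["labels"] = batch["input_ids"].clone()
        return batch
```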
I see. My data collator is:
```python
from datasets import load_dataset

def formatting_prompts_func(examples):
    # Pass the raw "text" column through unchanged
    texts = examples["text"]
    return {"text": texts}

dataset = load_dataset("json", data_files="/content/drive/MyDrive/datasets/all.jsonl")
```
I would like to know how to put the image (image tokens) inside the text, so I can maybe hard-code it and drop it into the raw dataset as I did before. Also, I would like to not use an image at the beginning and maybe do multiple images later. Any suggestions?
I saw you used something like this:
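If I remember correctly, it was a conversation-format converter roughly like this (my reconstruction; the `instruction` and `caption` field names are just my assumption):

```python
# Reconstructed sketch of the vision conversation format
# ("instruction" and "caption" field names are illustrative).
def convert_to_conversation(sample):
    conversation = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": sample["instruction"]},
                {"type": "image", "image": sample["image"]},
            ],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": sample["caption"]}],
        },
    ]
    return {"messages": conversation}

converted_dataset = [convert_to_conversation(sample) for sample in dataset["train"]]
```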
But for (non-vision) LLaMA 3.1 8B I used something like this:
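Roughly (reconstructed from my notebook; the hyperparameters are illustrative, and this uses the older TRL `SFTTrainer` signature where `dataset_text_field` and `max_seq_length` are passed directly):

```python
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,                      # the loaded LLaMA 3.1 8B model
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    dataset_text_field="text",        # train directly on the raw "text" column
    max_seq_length=2048,
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```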
So, can I do the same here, and what do these new options mean?
Also, my (raw text only) dataset looks like this:
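Each JSONL line is a single `text` field built from my custom tokens, roughly like this (the contents here are made-up placeholders, just to show the shape):

```json
{"text": "<kanojo>You are Yuna, talking to Yuki.</kanojo><dialog><yuki>Hi, how are you today?</yuki><yuna>I'm doing great, thanks for asking!</yuna>"}
{"text": "<kanojo>You are Yuna, talking to Yuki.</kanojo><dialog><yuki>What did you do today?</yuki><yuna>I read a book and went for a walk.</yuna>"}
```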
So, how do I do that for the image? Can I make something like this:
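Something that keeps the raw-text format but drops in the model's image placeholder plus an image path column (illustrative; I'm assuming `<|image|>` as the placeholder, since that's the Llama 3.2 Vision image token, and the `image` field name is made up):

```json
{"text": "<kanojo>You are Yuna, talking to Yuki.</kanojo><dialog><yuki><|image|> What do you see in this picture?</yuki><yuna>That looks like a sunset over the ocean!</yuna>", "image": "images/0001.png"}
```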
Note: `<yuki>`, `</yuki>`, `<yuna>`, `</yuna>`, `<data>`, `</data>`, `<kanojo>`, `</kanojo>`, and `<dialog>` are custom special tokens that I added to the vocabulary!
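For reference, a sketch of how I registered them (the standard Hugging Face pattern):

```python
# Register the custom special tokens, then resize the embeddings to match
# (sketch of the usual add_special_tokens + resize_token_embeddings flow).
special_tokens = [
    "<yuki>", "</yuki>", "<yuna>", "</yuna>",
    "<data>", "</data>", "<kanojo>", "</kanojo>", "<dialog>",
]
tokenizer.add_special_tokens({"additional_special_tokens": special_tokens})
model.resize_token_embeddings(len(tokenizer))
```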