yuvalkirstain / PickScore

MIT License
How much data would be needed to fine-tune your model? #4

Closed (scarbain closed 8 months ago)

scarbain commented 1 year ago

Hi, thanks for this great model, can't wait to try it out! It was exactly what I needed for multiple projects!

Not exactly an issue, more of a question: how much data do you think would be needed to fine-tune the model?

Starting from scratch would require a lot (I'm seeing 968,965 rankings in your paper), but starting from your checkpoint, could the model learn a preference from a few hundred or a few thousand examples?

Thanks :)

yuvalkirstain commented 1 year ago

I think that it would. Perhaps try to use a small learning rate. Please update here how it goes, I am curious :)

scarbain commented 1 year ago

I will! I'll run some tests later today! Can I train on 24 GB of VRAM, or do I need to rent a cloud GPU?

yuvalkirstain commented 1 year ago

Yeah, sure, you might need to use gradient accumulation and a smaller batch size than I used.
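For readers unfamiliar with gradient accumulation, here is a minimal sketch of the idea in plain Python. It is not taken from the PickScore training code: the toy linear model, the data, the learning rate, and the function names are all assumptions made for illustration. It shows that averaging gradients over several small micro-batches gives the same update as one large batch, which is why the technique helps when the large batch does not fit in memory.

```python
# Toy 1-D linear model y = w * x trained with a mean-squared-error loss.
# Everything here is illustrative; it is not the PickScore training loop.

def grad_mse(w, batch):
    """Gradient of mean((w * x - y)^2) over the batch with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average gradient over `data`, computed one micro-batch at a time."""
    assert len(data) % micro_batch_size == 0, "equal-sized micro-batches assumed"
    num_steps = len(data) // micro_batch_size
    total = 0.0
    for i in range(num_steps):
        micro = data[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch gradient so the sum equals the full-batch mean.
        total += grad_mse(w, micro) / num_steps
    return total


data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]  # made-up (x, y) pairs
w = 0.5
learning_rate = 1e-2  # hypothetical value

full = grad_mse(w, data)                               # one batch of 4
accum = accumulated_grad(w, data, micro_batch_size=2)  # two micro-batches of 2

# A single optimizer step is taken only after all micro-batches are processed.
w_new = w - learning_rate * accum
```

In PyTorch the same pattern is usually written by calling `backward()` on the loss divided by the number of accumulation steps for each micro-batch, and calling `optimizer.step()` and `optimizer.zero_grad()` only once per group of micro-batches.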

scarbain commented 1 year ago

Just a quick note: the line "import torch" is missing from the inference script in the README.

Other than that, the inference script is working great, and inference is really fast even with 8 images! If fine-tuning on a small dataset works, it'd be great to implement a simple scoring extension plus filtering on large batches in auto1111!
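As a rough illustration of the "scoring plus filtering" idea, the sketch below ranks a batch of generated images by score and keeps the best ones. The file names and score values are made up, and `top_k` and `softmax` are hypothetical helpers, not part of PickScore or auto1111. In practice the scores would come from the inference script for a single prompt.

```python
import math


def softmax(scores):
    """Turn raw scores for one prompt into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(paths, scores, k):
    """Return the k (path, score) pairs with the highest scores, best first."""
    ranked = sorted(zip(paths, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up example values, standing in for the model's output on one prompt.
paths = ["gen_0.png", "gen_1.png", "gen_2.png", "gen_3.png"]
scores = [20.1, 22.7, 19.4, 21.9]

kept = top_k(paths, scores, k=2)  # images to keep after filtering
probs = softmax(scores)           # relative preference among the candidates
```

Sorting by raw score is enough for filtering within one prompt; the softmax is only useful when a relative probability across the candidates is wanted.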

yuvalkirstain commented 1 year ago

Thanks! Fixed the README. Would you be able to create such a PR for auto1111? It would be great if you did.

scarbain commented 1 year ago

I've never implemented an auto1111 extension so I'll have to check if it's not too complicated. First, I'll verify that finetuning on a small dataset does work as intended!

yuvalkirstain commented 1 year ago

I think that having PickScore in auto1111 can be useful in itself, and having the ability to fine-tune on a small dataset is independently very useful :) In any case, keep me posted. I have some experience with few-shot learning, and there are small tricks that make a big difference.

scarbain commented 1 year ago

Hello!

Sure, having PickScore in auto1111 would be useful on its own for sorting generations. There are already some extensions that use earlier methods to compute aesthetic scores, so maybe this extension should be built on top of one of them.
I found this one: https://github.com/tsngo/stable-diffusion-webui-aesthetic-image-scorer but it doesn't seem to be maintained anymore.

About fine-tuning, I'm having some trouble using a local dataset. Can you give me some hints on how the dataset should be formatted (in terms of files and their contents) and where to modify the code to include it?

Also, if you can share the small tricks that make the big difference, I'm all ears!!

That would be much appreciated :)

yuvalkirstain commented 8 months ago

Hey, we open-sourced the dataset that we used for training, feel free to check it out :) (I try to share everything I know, so most details should be in the paper.)