LoRA models for ImageReward?

Hey there! First off, I just want to say thank you for this amazing project you've put together. It's been very helpful while I've been working on an image generation subnet for Bittensor (https://github.com/unconst/ImageSubnet), in which we use your model.

In short, the validators on the network all use the same model. However, we want the validators to be able to express unique preferences for certain styles and qualities in the art that is generated on the network. Creating a full dataset and retraining the model seems a bit intensive for that.

I was wondering whether you think it would be possible to create some sort of LoRA for the ImageReward model: you would provide a handful of images (say 50-100) covering a variety of styles, with both high-ranked and low-ranked examples, and adapt the network just enough to shift its stylistic preferences?
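To make the idea a bit more concrete, here is a toy sketch of what I have in mind. It doesn't touch the real ImageReward code at all: the frozen linear head, the 3-dimensional "embeddings", and the name `LoRAScoreHead` are all made up for illustration. I'm assuming the image/prompt features could be precomputed with the frozen backbone, and that a pairwise "preferred vs. rejected" loss is a reasonable way to use the small set of ranked images.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class LoRAScoreHead:
    """Hypothetical frozen linear scoring head plus a trainable low-rank update."""

    def __init__(self, frozen_w, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        self.w = list(frozen_w)          # frozen base weights, never updated
        self.scale = alpha / rank        # usual LoRA scaling factor
        # A starts small and random, B starts at zero, so the adapted head
        # initially scores exactly like the frozen one.
        self.A = [[rng.gauss(0.0, 0.1) for _ in self.w] for _ in range(rank)]
        self.B = [0.0] * rank

    def score(self, x):
        low_rank = sum(b * dot(a, x) for a, b in zip(self.A, self.B))
        return dot(self.w, x) + self.scale * low_rank

    def train_pair(self, preferred, rejected, lr=0.1):
        """One SGD step on the pairwise loss -log(sigmoid(s_pref - s_rej))."""
        u = [p - r for p, r in zip(preferred, rejected)]
        margin = self.score(preferred) - self.score(rejected)
        g = -1.0 / (1.0 + math.exp(margin))   # d(loss) / d(margin)
        Au = [dot(a, u) for a in self.A]
        for k in range(len(self.B)):
            grad_b = g * self.scale * Au[k]
            grad_a = [g * self.scale * self.B[k] * uj for uj in u]
            self.B[k] -= lr * grad_b
            self.A[k] = [a - lr * ga for a, ga in zip(self.A[k], grad_a)]


# Toy data: the frozen head only looks at feature 0 ("generic quality"),
# while this validator prefers images that are high on feature 2 ("style").
frozen_w = [1.0, 0.0, 0.0]
head = LoRAScoreHead(frozen_w)
pairs = [  # (preferred embedding, rejected embedding)
    ([0.5, 0.2, 1.0], [0.5, 0.2, -1.0]),
    ([0.1, -0.3, 0.8], [0.3, 0.4, -0.6]),
    ([0.4, 0.0, 0.9], [0.6, 0.1, -0.7]),
]

before = [head.score(p) - head.score(r) for p, r in pairs]
for _ in range(200):
    for p, r in pairs:
        head.train_pair(p, r)
after = [head.score(p) - head.score(r) for p, r in pairs]

print("margins before:", before)
print("margins after: ", after)
```

The point is just that only `A` and `B` are trained, so each validator would only need to store a tiny set of extra weights on top of the shared model. Whether 50-100 images is enough to do this on the real network is exactly what I'm unsure about.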

Thanks again for your work!

THUDM / ImageReward

LoRA models for ImageReward? #63