mihirp1998 / AlignProp

AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample- and compute-efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
https://align-prop.github.io/
MIT License

Finetuning and Config used for HPS #17

Open · anonymous-atom opened this issue 2 weeks ago

anonymous-atom commented 2 weeks ago

@mihirp1998 I was trying to finetune Stable Diffusion 1.5 using your HPS reward function and the hps.sh training script. I used a batch size of 1, but training still finishes very quickly: 50 epochs took just 2-4 minutes.

import random

def from_file(path, low=None, high=None):
    # _load_lines (repo helper) reads the prompt file into a list of strings.
    prompts = _load_lines(path)[low:high]
    # Each call returns a single uniformly sampled prompt plus empty metadata.
    return random.choice(prompts), {}

And here it looks like you only sample batch_size prompts per step? I am using a batch_size of 2 on 1 A100 GPU to test the script.

    def _generate_samples(self, batch_size, with_grad=True, prompts=None):
        """
        Generate samples from the model

        Args:
            batch_size (int): Batch size to use for sampling
            with_grad (bool): Whether the generated RGBs should have gradients attached to it.

        Returns:
            prompt_image_pairs (Dict[Any])
        """
        prompt_image_pairs = {}

        sample_neg_prompt_embeds = self.neg_prompt_embed.repeat(batch_size, 1, 1)

        if prompts is None:
            prompts, prompt_metadata = zip(*[self.prompt_fn() for _ in range(batch_size)])
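
To check my understanding, here is a minimal standalone sketch of that zip(*[...]) pattern; the toy prompt_fn below is hypothetical, standing in for from_file:

    import random

    # Toy stand-in for from_file: returns one uniformly sampled prompt
    # plus an (empty) metadata dict, the same (prompt, metadata) shape
    # that _generate_samples expects.
    def prompt_fn(prompts=("a cat", "a dog", "a ship")):
        return random.choice(prompts), {}

    batch_size = 2
    # Each step draws only batch_size prompts; zip(*...) splits the
    # (prompt, metadata) pairs into two parallel tuples.
    prompts, prompt_metadata = zip(*[prompt_fn() for _ in range(batch_size)])
    print(prompts)          # e.g. ('a dog', 'a cat')
    print(prompt_metadata)  # ({}, {})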

Your help will mean a lot!

mihirp1998 commented 2 weeks ago

Can you check whether the training is actually being run? If not, why is it skipping the training loop?

anonymous-atom commented 2 weeks ago

It's working, but I think it's only using a few examples from the prompt file.

mihirp1998 commented 2 weeks ago

I think random.choice is uniform over all prompts, so I'm not sure what the bug is here. If you find it, let me know.
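
For intuition on coverage: with uniform sampling, the expected number of distinct prompts seen after n total draws from a pool of P is P * (1 - (1 - 1/P)**n), so all 750 prompts do get hit over enough steps. A quick sanity check (the draw counts below are just illustrative):

    # Expected distinct prompts after n uniform draws from a pool of P.
    P = 750  # prompts in hps_v2_all.txt
    for n in (32, 750, 5000):  # illustrative totals (batch_size * num steps)
        expected = P * (1 - (1 - 1 / P) ** n)
        print(f"after {n:>4} draws: ~{expected:.0f} distinct prompts")
    # after   32 draws: ~31 distinct prompts
    # after  750 draws: ~474 distinct prompts
    # after 5000 draws: ~749 distinct prompts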

anonymous-atom commented 2 weeks ago

Yeah, sure!

anonymous-atom commented 2 weeks ago

Also, while I was trying to train with a custom loss function, the model seems to collapse very early unless I adjust the learning rate. Is this the expected behaviour?

anonymous-atom commented 2 weeks ago

@mihirp1998 Sorry to tag you again, but can you let me know how much time it took per epoch on your 4 A100 GPUs?

Clear Skies!

mihirp1998 commented 2 weeks ago

On Aesthetics, it took 2-3 minutes per epoch.


anonymous-atom commented 2 weeks ago

I wanted to confirm: are you using all 750 prompts in the hps_v2_all.txt file for a single epoch?

anonymous-atom commented 1 week ago

Hi @mihirp1998, this is where I am confused:

So you used the step() function to do 1 training step per epoch?

    def train(self, epochs: Optional[int] = None):
        """
        Train the model for a given number of epochs
        """
        global_step = 0
        if epochs is None:
            epochs = self.config.num_epochs
        for epoch in range(self.first_epoch, epochs):
            global_step = self.step(epoch, global_step)

And here in the step() function, it only seems to finetune on num_gpus * train_batch_size * train_gradient_accumulation_steps images. Am I missing something? What if someone used just 1 GPU to train?


    def step(self, epoch: int, global_step: int):

        info = defaultdict(list)
        print(f"Epoch: {epoch}, Global Step: {global_step}")

        self.sd_pipeline.unet.train()

        for _ in range(self.config.train_gradient_accumulation_steps):
            with self.accelerator.accumulate(self.sd_pipeline.unet), self.autocast(), torch.enable_grad():
                prompt_image_pairs = self._generate_samples(
                    batch_size=self.config.train_batch_size,
                )
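
So if I read this right, one call to step() consumes num_gpus * train_batch_size * train_gradient_accumulation_steps images, e.g. (the numbers below are hypothetical, not the repo defaults):

    # Images consumed by one step() / "epoch"; all values hypothetical.
    num_gpus = 1                            # accelerator processes
    train_batch_size = 2                    # per-GPU batch size
    train_gradient_accumulation_steps = 4   # inner-loop iterations
    images_per_epoch = (num_gpus * train_batch_size
                        * train_gradient_accumulation_steps)
    print(images_per_epoch)  # 8 prompts sampled (with replacement) from 750
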
findalexli commented 1 week ago

Following this thread. This setup detail is critical; otherwise other folks cannot reproduce the results.

mihirp1998 commented 1 week ago

> So you used the step() function to do 1 training step per epoch? [...] What if someone used just 1 GPU to train?

Yes, in this codebase step and epoch are equivalent; it's difficult to define an epoch since there is no fixed dataset I'm training on.

If someone uses one GPU for training and wants to maintain the effective batch size I'm using, they should increase the accumulation steps, as I mention here (see the sketch after the link):

https://github.com/mihirp1998/AlignProp/blob/5e950b3f16ded622df15f4bea2eec93f88962f2b/hps.sh#L1
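
i.e. keep num_gpus * per_gpu_batch_size * accum_steps constant. A minimal sketch (the batch size and step counts here are hypothetical; the real values are in hps.sh):

    # Effective batch size = num_gpus * per_gpu_batch_size * accum_steps.
    # Dropping from 4 GPUs to 1 means scaling accum_steps up by 4x.
    old_gpus, new_gpus = 4, 1
    per_gpu_batch_size = 2   # hypothetical; see hps.sh for the real value
    old_accum_steps = 4      # hypothetical
    new_accum_steps = old_accum_steps * old_gpus // new_gpus
    assert (old_gpus * per_gpu_batch_size * old_accum_steps
            == new_gpus * per_gpu_batch_size * new_accum_steps)
    print(new_accum_steps)   # 16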