SHI-Labs / StyleNAT

New flexible and efficient image generation framework that sets new SOTA on FFHQ-256 with FID 2.05, 2022
MIT License

Untested Fix #17

Closed stevenwalton closed 4 months ago

stevenwalton commented 4 months ago

Having NATTEN compile issues, but I want to address issues #15/#16. I'll fix the NATTEN compile problem and then merge.

Fixed some Hydra resolvers, but also added a default conf, which should make the former fix unnecessary.

stevenwalton commented 4 months ago

helpers.rng_reproducibility(args, ckpt)

Oops, good catch. I knew there would be some dumb errors I wouldn't catch. Another dumb mistake: in the config, place

defaults:
    - default_conf
    - runs: ffhq_256
    - _self_

at the top. That's probably why you're not getting the override 🤦 (this was my first project with Hydra and I was still learning)
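
If you want to double-check what the merged config actually resolves to (without running the model at all), Hydra's compose API is handy. A quick sketch; the config_path and config_name here are just placeholders, so point them at the actual config layout:

# Sketch: print the fully merged config to see which values win.
# config_path/config_name are assumptions about the repo layout; on older
# Hydra versions drop the version_base argument.
from hydra import compose, initialize
from omegaconf import OmegaConf

with initialize(version_base=None, config_path="conf"):
    cfg = compose(config_name="config",
                  overrides=["type=inference", "distributed=False"])
    print(OmegaConf.to_yaml(cfg))  # check that distributed etc. resolve as expected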

You don't need to validate the arguments or check rng_reproducibility to run, especially for inference. They're just there as helpers to log all the parameters into the checkpoints and to make things as reproducible as possible (including the lottery). You can completely remove both functions and be fine.

For inference, you should just need the inference parameters. The default conf I added should have a default value for everything, but even that shouldn't be necessary if you remove the validation and rng calls.

It looks like the error you're running into is with distributed. You can set this to false; I'll make that the default to avoid future confusion. I left it on because distributed training is pretty common.

Hydra's error messages are quite unhelpful by default. If you run HYDRA_FULL_ERROR=1 python main.py type=inference distributed=false you'll get the full stack trace, which makes the actual problem much easier to find.

(I'll get another machine. Should have this resolved very quickly)

justin4ai commented 4 months ago

@stevenwalton Thanks for your help, but sadly there are still some errors (e.g. a maximum recursion error while looking for the key 'runs' in the .yaml, etc.). I cloned and pulled your stevenwalton:config_fix branch and ran python3 main.py type=inference, both without changing anything and with your corrections applied.

Also, I know I can turn off the distributed setting in every file, but the real issue seems to be that the code doesn't read the .yaml files properly (they seem to get confused or mixed with each other) when running main.py: it is already distributed: False in inference.yaml, yet printing the corresponding argument inside the code yields True, so inference fails.

If you get another machine and are able to get it completely fixed quickly, I can wait for your next commit (or push)! I really appreciate your quick help once again :)

Best, Junyeong Ahn

stevenwalton commented 4 months ago

I'm on another machine right now and resolving this. Stay tuned. It looks like some of the NA functions have changed a bit too.

stevenwalton commented 4 months ago

I want to make sure you aren't held back. I made a push where I have confirmed that python main.py type=inference distributed=False works.

This will produce a lot of warnings about some deprecated natten functions; these can safely be ignored for now. You should be able to generate images. If you run this with no modifications, you should get 10 images in a newly created folder called sample_images.

justin4ai commented 4 months ago

@stevenwalton Thanks for your work!! It works but the output images are all the same as shown below for some reason.

[attached image of generated samples]

justin4ai commented 4 months ago

Oh wait, it's probably because I have no trained model? You don't provide checkpoints, right?

stevenwalton commented 4 months ago

Yeah, you need the trained model. Do this:

git pull
wget https://shi-labs.com/projects/stylenat/checkpoints/FFHQ256_940k_flip.pt
python main.py type=inference distributed=False +restart.ckpt=FFHQ256_940k_flip.pt

Here are the first 3 images I get: [attached images 0, 1, 2]

justin4ai commented 4 months ago

Oh my lord, it completely works!! I really appreciate your help even though you must be busy. Now at least I can run inference and learn about your Hydra-NA module in depth! Getting access to such a great project is a joy. Many thanks!!!

I also hope you complete the training code someday (definitely the sooner the better, especially for students like me lool). I'm sure it will be a big help to many learners and AI enthusiasts.

stevenwalton commented 4 months ago

Great to hear! There may be some minor issues still but I'll get these fixed soon. I spent time trying to make the code as reproducible as possible but I forgot to make it backwards compatible with the FFHQ checkpoint. This was the first experiment so it missed some things. Sorry about that.

I think it'll be hard to make training faster, but it is a topic of research we are actively exploring. There are some limits, and even this project suffers from a lack of sufficient compute haha. It's unfortunate, and good research can often get overlooked due to a lack of compute. But don't let that stop you!

And when I have the full fix in, I'll merge this PR. I'll also add a throughput measurement script I have, in case you need it.
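
It'll be roughly along these lines; just a minimal sketch rather than the actual script, with the batch size, latent dim, and the generator's call signature as placeholders:

# Minimal throughput sketch (not the real script). Assumes a CUDA device and
# a generator that maps a latent vector to images; adjust sizes/signature.
import time
import torch

@torch.no_grad()
def measure_throughput(generator, batch_size=16, latent_dim=512, iters=50, warmup=10):
    device = next(generator.parameters()).device
    z = torch.randn(batch_size, latent_dim, device=device)
    for _ in range(warmup):            # warm up kernels/allocator before timing
        generator(z)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        generator(z)
    torch.cuda.synchronize()           # wait for all queued GPU work to finish
    return batch_size * iters / (time.time() - start)   # images per second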

Also, is this the project you're using Hydra for?

justin4ai commented 4 months ago

@stevenwalton Your work is awesome, so it's really natural to miss some things given how much well-written code there is!

About the PR, sure, thanks. I think I'll need everything from your work now and then, since I've become interested in the image generation task over the past few days of struggling 😅

And yes, it is. My main domain is face swapping, but this is my first attempt at working on image generation. At first it was just for a course project, but I'm going way further hehe. I'm supposed to come up with more creative ideas though. Anyway, it's my kick-start ◡̈

stevenwalton commented 4 months ago

Thanks, I appreciate the kind words.

Generation is a challenging thing to learn, especially when GPU-limited. That struggling is normal. But keep at it; it'll pay dividends in the long run. I'd suggest carefully paying attention to the images and not relying purely on benchmarks (also see the appendix of our paper).

FWIW, diffusion is more computationally heavy. That was a major motivation for demonstrating Hydra in a GAN instead of a diffusion model (this is being worked on, btw). So if you're compute-limited, try starting with GANs. They're much faster and easier to train, but tend to have more limited diversity in their outputs. If you're doing domain-specific things like face swapping/editing, this is probably an acceptable tradeoff.

Keep it up, and don't hesitate to reach out.

justin4ai commented 4 months ago

I appreciate you sharing such an informative paper!! I'll definitely read it.

Yeah, actually there's more background to my project. I've implemented and developed GAN-based face-swapping models at my company (I'm affiliated with both a university and a company), but in my experience they easily led to mode collapse or unstable training.

Thanks to a good opportunity, I have access to an H100 server until the end of this year, so I made up my mind to study diffusion models, which, as you mentioned, are computationally more expensive in most cases, to make full use of it.

And I finally realized that what I need to work on is keeping high fidelity while achieving lower cost at the same time. That's why I came up with the idea of adopting your Hydra-NA module into a discrete absorbing diffusion model. I expect Hydra-NA to give a significant runtime reduction compared to full-region attention while still preserving global attentional power. And yes, that's my assumption, so it might turn out to be wrong in the end haha. But I'm trying not to hesitate or be too judgemental, because I know I lack experience, unlike you, and can learn a lot from running around in confusion hehe.

It's so lucky that I decided to reach out for your help on GitHub. I really appreciate your kind help for a student on the other side of the world (South Korea)!! Georgia Tech is one of the universities I dream of going to, so please wait for me there ◡̈ 🔥

stevenwalton commented 4 months ago

That sounds great and good luck!

A note to be aware of: the kernel sizes and dilation are a bit trickier for diffusion. We've learned a lot about this since. If you intend to write a paper and wish to collaborate, you're welcome to email me. Or if you just need a few tips, the offer is open.

My partner is from South Korea too! Small world. And I think Humphrey is still recruiting students. You'll have to email him if you're interested. Showing some research will really help with getting into a good grad school. It is excessively competitive these days so don't let that get you down. You're on the right track.

justin4ai commented 4 months ago

@stevenwalton Sincerely thanks for your kind words. Those are really huge to me. I'll never let it get me down :)

By the way, I just realized you mentioned "but I forgot to make it backwards compatible with the FFHQ checkpoint" last night (it was 4 am at the time so my mind wasn't 100% clear, and it's already 10 am now since I was up all night fully focused on my project work). Does that mean I won't be able to adopt Hydra-NA into my training code as it is for now? At least if Hydra-NA doesn't support backpropagation, which would mean it's not learnable.

stevenwalton commented 4 months ago

You'll be fine, there is just a misunderstanding.

In models/stylenat.py you'll notice there are HydraNeighborhoodAttention and NeighborhoodAttentionSplitHead. The latter is the initial version we worked with and only splits into two head groups (like StyleSwin). Later we extended the idea so we could use an arbitrary number of splits. The older one was kept because we already had that checkpoint and felt this was a better option than modifying the checkpoint to be compatible with the newer format. This is why you'll get lines like

 UserWarning: Failed to make Hydra: attempting Legacy: kernel_size = [7] and dilation = [1]
  warnings.warn(f"Failed to make Hydra: attempting Legacy: "\

These are nothing to worry about. That's all I meant by "backwards compatible"

Use HydraNeighborhoodAttention
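
Conceptually, the Hydra version just splits the heads into groups, each with its own kernel size and dilation, and concatenates the results. Here's a rough illustration built on NATTEN's public NeighborhoodAttention2D module rather than the actual code in models/stylenat.py (the kernel sizes are only examples):

# Conceptual sketch of Hydra-style neighborhood attention: head groups with
# different kernel sizes/dilations, outputs concatenated. Not the module in
# models/stylenat.py, just an illustration of the idea.
import torch
import torch.nn as nn
from natten import NeighborhoodAttention2D

class HydraNASketch(nn.Module):
    def __init__(self, dim, num_heads, kernel_sizes=(7, 15), dilations=(1, 1)):
        super().__init__()
        groups = len(kernel_sizes)
        assert dim % groups == 0 and num_heads % groups == 0
        self.split = dim // groups
        self.branches = nn.ModuleList([
            NeighborhoodAttention2D(dim=self.split, num_heads=num_heads // groups,
                                    kernel_size=k, dilation=d)
            for k, d in zip(kernel_sizes, dilations)
        ])

    def forward(self, x):  # x: (B, H, W, C), channels-last as NATTEN expects
        chunks = x.split(self.split, dim=-1)
        return torch.cat([b(c) for b, c in zip(self.branches, chunks)], dim=-1)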

justin4ai commented 4 months ago

Ohh I see! I learned more about it thanks to your explanation. I hope I can find time to research such attention mechanisms in depth during the upcoming vacation :) Interesting!

stevenwalton commented 4 months ago

I think things are good now. Let me know if there are any further problems. It's easy to miss things lol

justin4ai commented 4 months ago

Sure! I'll be the most passionate tester of yours haha. Thanks for your hard work!!

justin4ai commented 4 months ago

Heyya! I'd just like to let you know I successfully implemented Hydra-NA (actually 1D, unlike your 2D module, since mine sits in the middle of a GPT block for codebook prediction in a discrete absorbing diffusion model). It was possible mostly because of your kind help, which allowed me to visually check the tensor shape changes, so you immediately came to mind right after I completed it haha. Many thanks again!!
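
Roughly, the structure looks like this; a simplified sketch rather than my actual module, with the dimensions, kernel sizes, and names as placeholders:

# Simplified sketch: a GPT-style block whose attention is NATTEN's 1D
# neighborhood attention with per-group kernel sizes (placeholder values).
import torch
import torch.nn as nn
from natten import NeighborhoodAttention1D

class Hydra1DBlock(nn.Module):
    def __init__(self, dim=512, num_heads=8, kernel_sizes=(7, 31)):
        super().__init__()
        groups = len(kernel_sizes)
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.ModuleList([
            NeighborhoodAttention1D(dim=dim // groups, num_heads=num_heads // groups,
                                    kernel_size=k)
            for k in kernel_sizes
        ])
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):                   # x: (B, L, C) token sequence
        h = self.norm1(x)
        chunks = h.chunk(len(self.attn), dim=-1)
        x = x + torch.cat([a(c) for a, c in zip(self.attn, chunks)], dim=-1)
        return x + self.mlp(self.norm2(x))  # pre-norm residual MLP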

One frustrating point is that the resulting module is at most 100 lines of code lol, even though I spent almost a week on it lool. But I'm still enjoying the fact that there's so much left for me to learn in the computer vision field.

I've just started the first training run of this module without sufficient investigation of the configurations in advance, but as time goes on I hope I'll figure out the proper settings and see the improvement.

The first semester ends within two weeks, so I'm planning to get more into facial image generation, mainly using the natten package, including your work. Just an update to say thank you!