Project-MONAI / GenerativeModels

MONAI Generative Models makes it easy to train, evaluate, and deploy generative models and related applications
Apache License 2.0

Add Transformer network #2

Closed: Warvito closed this issue 1 year ago

Warvito commented 2 years ago

Add the transformer network and the components needed to make it compatible with the VQ-VAE network. Create the components necessary to generate samples and to compute the likelihood of input data with the model. Add the relevant unit tests and documentation.
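
For context, the likelihood part could look roughly like the sketch below, i.e. the chain-rule factorisation of an autoregressive transformer over the VQ-VAE token indices. Function and argument names here are illustrative only, not the proposed API:

```python
# Illustrative sketch only (not the proposed API): per-sample log-likelihood of
# VQ-VAE token sequences under an autoregressive transformer, via the chain
# rule p(x) = prod_t p(x_t | x_<t).
import torch
import torch.nn.functional as F


@torch.no_grad()
def sequence_log_likelihood(transformer, tokens, sos_id):
    # tokens: (batch, seq_len) integer indices produced by the VQ-VAE encoder.
    batch = tokens.shape[0]
    sos = torch.full((batch, 1), sos_id, dtype=tokens.dtype, device=tokens.device)
    inputs = torch.cat([sos, tokens[:, :-1]], dim=1)   # shift right by one position
    logits = transformer(inputs)                       # (batch, seq_len, vocab_size)
    log_probs = F.log_softmax(logits, dim=-1)
    token_ll = log_probs.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
    return token_ll.sum(dim=1)                         # (batch,) log-likelihoods
```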

Warvito commented 1 year ago

As we discussed, the choice of architecture is still open. I think the main options are the implementation that we are currently using from lucidrains (https://github.com/lucidrains/x-transformers and https://github.com/lucidrains/performer-pytorch) and the current implementation based on MONAI core, with key components using xFormers (https://github.com/facebookresearch/xformers).

On the one hand, we are quite familiar with the lucidrains implementation, and in the end, it would be his transformer with a wrapper class. On the other hand, using the current blocks from MONAI + xFormers could be a more flexible solution.

I will try to investigate further what using xFormers would look like (one problem might be that xFormers requires PyTorch >= 1.12).

Warvito commented 1 year ago

Maybe something like this for a self-attention block: https://gist.github.com/Warvito/5c3363ddbf3941150c2511b27b75d701

I still need to check what it would look like with masked attention for the autoregressive model, and how it performs in terms of memory and speed.
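
For reference, a rough sketch of what such a block could look like with xFormers' memory-efficient attention, including the causal masking we would need for the autoregressive model. This is just an illustration of the idea, not the block in the gist above; class and argument names are placeholders, and it assumes xFormers is installed (hence PyTorch >= 1.12):

```python
# Sketch of a self-attention block on top of xFormers (illustrative only).
import torch
import torch.nn as nn
import xformers.ops as xops


class SelfAttentionBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int, causal: bool = False) -> None:
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # A lower-triangular bias gives the masked attention needed for the
        # autoregressive transformer.
        self.attn_bias = xops.LowerTriangularMask() if causal else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # memory_efficient_attention expects (batch, seq_len, heads, head_dim).
        q, k, v = (t.reshape(b, n, self.num_heads, self.head_dim) for t in (q, k, v))
        out = xops.memory_efficient_attention(q, k, v, attn_bias=self.attn_bias)
        return self.proj(out.reshape(b, n, -1))
```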

Warvito commented 1 year ago

@danieltudosiu Which features from x-transformers were most useful for your models?

danieltudosiu commented 1 year ago

> As we discussed, the choice of architecture is still open. I think the main options are the implementation that we are currently using from lucidrains (https://github.com/lucidrains/x-transformers and https://github.com/lucidrains/performer-pytorch) and the current implementation based on MONAI core, with key components using xFormers (https://github.com/facebookresearch/xformers).
>
> On the one hand, we are quite familiar with the lucidrains implementation, and in the end, it would be his transformer with a wrapper class. On the other hand, using the current blocks from MONAI + xFormers could be a more flexible solution.
>
> I will try to investigate further what using xFormers would look like (one problem might be that xFormers requires PyTorch >= 1.12).

I would argue, as per our last meeting, that we can go forward with a common interface based on the VQ-VAE + Transformer codebase from KCL for the time being, as those models have been used extensively in KCL's publications.

Later, we can create another PR that might have breaking changes, adding building blocks and/or other models for more flexibility.

This approach would provide the required building blocks for us to offer users the full pipelines, while allowing us to increase the flexibility of the package at a later date.

danieltudosiu commented 1 year ago

> @danieltudosiu Which features from x-transformers were most useful for your models?

The ones that I used from x-transformers are the following:

I would highly advise against reimplementing anything in the short term; it would only increase the amount of code we need to maintain.

danieltudosiu commented 1 year ago

@Warvito I was about to work on this issue today. How do you want me to go forward with it, or should I put it on hold for now?

Warvito commented 1 year ago

> As we discussed, the choice of architecture is still open. I think the main options are the implementation that we are currently using from lucidrains (https://github.com/lucidrains/x-transformers and https://github.com/lucidrains/performer-pytorch) and the current implementation based on MONAI core, with key components using xFormers (https://github.com/facebookresearch/xformers). On the one hand, we are quite familiar with the lucidrains implementation, and in the end, it would be his transformer with a wrapper class. On the other hand, using the current blocks from MONAI + xFormers could be a more flexible solution. I will try to investigate further what using xFormers would look like (one problem might be that xFormers requires PyTorch >= 1.12).
>
> I would argue, as per our last meeting, that we can go forward with a common interface based on the VQ-VAE + Transformer codebase from KCL for the time being, as those models have been used extensively in KCL's publications.
>
> Later, we can create another PR that might have breaking changes, adding building blocks and/or other models for more flexibility.
>
> This approach would provide the required building blocks for us to offer users the full pipelines, while allowing us to increase the flexibility of the package at a later date.

Sounds good. Let's start with the implementation we are familiar with (lucidrains) and then continue discussing a move to something more flexible.
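
Something along these lines, roughly; a minimal sketch of a thin wrapper around the lucidrains decoder-only transformer for the VQ-VAE token sequences (class name and defaults are illustrative, not the final package API):

```python
# Illustrative sketch: thin wrapper around the lucidrains x-transformers decoder.
import torch.nn as nn
from x_transformers import TransformerWrapper, Decoder


class VQTransformer(nn.Module):
    def __init__(self, num_tokens: int, max_seq_len: int, dim: int = 512,
                 depth: int = 8, heads: int = 8) -> None:
        super().__init__()
        self.model = TransformerWrapper(
            num_tokens=num_tokens,    # VQ-VAE codebook size (plus any special tokens)
            max_seq_len=max_seq_len,  # length of the flattened latent token grid
            attn_layers=Decoder(dim=dim, depth=depth, heads=heads),
        )

    def forward(self, tokens):
        # tokens: (batch, seq_len) indices from the VQ-VAE; returns logits over
        # the codebook at each position.
        return self.model(tokens)
```

Keeping the wrapper thin would let us lean on what x-transformers already provides (e.g. its AutoregressiveWrapper could handle the training loss and sampling loop), which is the main appeal of this option.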

danieltudosiu commented 1 year ago

Then should I go forward with the interface that we have already been using in the KCL codebase?

Warvito commented 1 year ago

Yes, I think it is a good starting point.