facebookresearch / vissl

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
https://vissl.ai
MIT License

3D support for VISSL #540

Open surajpaib opened 2 years ago

surajpaib commented 2 years ago

🚀 Feature

SSL has been shown to be quite successful in 3D medical imaging contexts (ref: https://arxiv.org/abs/2101.05224).

3D support within the VISSL framework would be a great feature to have if it aligns with what the team envisioned for it.

Motivation & Examples

Motivation: VISSL can be the choice of framework for training 3D SSL models in the medical imaging community, allowing simple and configurable pre-training for various use cases.

What is needed for it to be implemented? A preliminary list of additions to be made:

  1. Addition of 3D transforms to replace PIL / torchvision based transforms (from monai, for example: https://docs.monai.io/en/stable/transforms.html)
  2. Addition of 3D model heads (from monai, for example: https://docs.monai.io/en/stable/networks.html)
  3. 3D dataset examples
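
To make item 1 more concrete, here is a minimal sketch of what a 3D counterpart to the usual 2D crop-and-flip augmentation could look like. It is written in plain NumPy and does not use MONAI's or VISSL's actual APIs. The function names (`random_crop_3d`, `random_flip_3d`, `two_views`) and the channel-first `(C, D, H, W)` layout are assumptions made for illustration only.

```python
import numpy as np


def random_crop_3d(volume, crop_size, rng):
    """Crop a random (d, h, w) sub-volume from a (C, D, H, W) array.

    Hypothetical helper for illustration; not a MONAI or VISSL function.
    """
    _, depth, height, width = volume.shape
    cd, ch, cw = crop_size
    if cd > depth or ch > height or cw > width:
        raise ValueError("crop_size must fit inside the volume")
    # Pick the crop corner uniformly at random along each spatial axis.
    z = rng.integers(0, depth - cd + 1)
    y = rng.integers(0, height - ch + 1)
    x = rng.integers(0, width - cw + 1)
    return volume[:, z:z + cd, y:y + ch, x:x + cw]


def random_flip_3d(volume, rng, prob=0.5):
    """Flip each spatial axis independently with probability `prob`."""
    for axis in (1, 2, 3):  # axis 0 is the channel axis and is left alone
        if rng.random() < prob:
            volume = np.flip(volume, axis=axis)
    return volume


def two_views(volume, crop_size, seed=None):
    """Return two independently augmented views of one volume,
    as contrastive SSL methods typically expect."""
    rng = np.random.default_rng(seed)
    views = []
    for _ in range(2):
        view = random_crop_3d(volume, crop_size, rng)
        view = random_flip_3d(view, rng)
        # np.flip returns a view with negative strides; copy to a
        # contiguous array so downstream code can consume it safely.
        views.append(np.ascontiguousarray(view))
    return views
```

For example, `two_views(np.zeros((1, 64, 128, 128), dtype=np.float32), (32, 64, 64))` would return two arrays of shape `(1, 32, 64, 64)`. In practice, a library such as MONAI would supply these transforms along with intensity and spacing handling that this sketch omits.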

Progress

I've been working on implementing this functionality for a specific medical imaging use case and I've maintained my progress here: https://github.com/surajpaib/vissl

QuentinDuval commented 2 years ago

Hi @surajpaib,

Whoa, that's excellent! Thanks a lot for sharing this.

I had a quick look at your fork (https://github.com/surajpaib/vissl) and a lot of things are indeed quite interesting for VISSL. There are a few things I will need to grasp for a proper code review, but this looks super promising to me!

@prigoyal what do you think about support for 3D dataset / model / transforms in VISSL? On my side, I am all for it.

Thank you, Quentin

surajpaib commented 2 years ago

Hi @QuentinDuval

Glad to hear that you find it useful! Let me know if there's anything I can do at this point to make things easier.

Thanks, Suraj

QuentinDuval commented 2 years ago

Hi @surajpaib,

I was reading through the fork and actually had a lot of questions for you. Overall, I think what you have done will require a bunch of different Pull Requests to go through (that's a lot of work!).

For instance, we could split things like this:

What I am missing for the moment is an understanding of how the code is used and what the typical use cases are, mostly, I guess, because of my lack of experience in medical imaging:

So to move forward, what I think I am missing is some good references on 3D medical imaging (if you have any, I will be happy to read them), with examples of public 3D datasets, and a configuration / loss curve that would help me better understand the context.

Since I find your work enormously impactful and useful for VISSL, I am available to chat / exchange with you to better understand the context (it might be easier to convey things if we have access to a chat or something better than comments on an issue). Please let me know if this is something you think would be useful.

Thank you, Quentin

surajpaib commented 2 years ago

Hi @QuentinDuval, apologies for my late response, I was on holiday for a good part of the past two weeks.

I agree with splitting the work into separate PRs. That should make it much easier to contribute from my end as well.

It would definitely be helpful to have access to a chat. We could discuss the rest of the points you mentioned in detail on there. Let me know what you would prefer to have as a platform to chat. You can drop me an email at surajballambat@gmail.com for convenience.

Thanks, Suraj

surajpaib commented 2 years ago

Hi @QuentinDuval, just checking in on the discussion above! Also a short update: I've extended several of the MONAI (https://github.com/Project-MONAI/MONAI) 3D models to VISSL to do some ablation tests. So we might have a few more things to integrate into the 3D Models PR :)