EIDOSLAB / torchstain

Stain normalization tools for histological analysis and computational pathology
MIT License
111 stars 20 forks source link

Add unit tests? #7

Closed andreped closed 2 years ago

andreped commented 2 years ago

As we have support for different backends, it would probably be a good idea to have unit tests to verify that we achieve the same results using the three different backends: pytorch, numpy, and tensorflow.

This can easily be achieved using GitHub Actions, which I have lots of experience with (e.g., this project). I could make an attempt tomorrow and create a PR when I have something working. Shouldn't take that much time.

Also I observed that lots of people have taken interest in the project. It would therefore be a good idea to maybe create a blog post or similar to further reach the target audience, as this project is extremely valuable for researchers withing computational pathology. Perhaps @carloalbertobarbano is interested in doing that?

andreped commented 2 years ago

I've added a workflow now, fixed some minor bugs, and used the unit tests which were already implemented in the tests/ directory. However, there seems to be something wrong. Have you tested these unit tests in a while, with the most recent version of torchstain?

See here for prompt when running the tests on different setups.

See here for the branch I am working on.

andreped commented 2 years ago

A PR has now been made https://github.com/EIDOSLAB/torchstain/pull/8.

carloalbertobarbano commented 2 years ago

Thank for the great contributions! I will happily merge it. I think I hadn't test the unit tests in a bit, I am currently busy with other projects. For the blog post, that is a good idea, however I think also a good example notebook (maybe on kaggle also?) could work for now

carloalbertobarbano commented 2 years ago

Do you think it is ready for packaging on pypi?

andreped commented 2 years ago

I think I hadn't test the unit tests in a bit.

That's kind of ironic :P

Oh yeah, and an example notebook, published on kaggle, is probably a very good idea :) However, I believe lots of people look for inspiration on medium and towardsdatascience. Might be a good idea to post in one of these channels as well.

andreped commented 2 years ago

Do you think it is ready for packaging on pypi?

No, not yet. We have to add tests to assert which TF and PyTorch versions are compatible, if we wish to bundle both of them together. However, as we discussed previously, it might be a good idea to offer pip install pytorch[backend] with a specific backend, but I have not done that before. I could make an attempt, if you'd like. I believe it requires us to make two submodules within the module, to switch between.

This is only relevant for the pytorch/TF backends. Numpy can be in both, as numpy is lightweight and both deps on it.


EDIT: If you want me to do it, I could make a new issue and have a PR ready when I have something working. I could also expand upon the unit tests to assess which TF and pytorch versions are compatible. I did something similiar in this project.

carloalbertobarbano commented 2 years ago

Yes, I am still convinced that the idea to have the backend switch is better (maybe defaulting to pytorch). Actually it should just be a matter of making sure that the right imports are done dynamically (i.e. try catch import, should be enough), and having the different targets in setup.py

The test suite is nice and necessary I think, you can definetely work on that if you like. I can dig a bit into the backend switch

carloalbertobarbano commented 2 years ago

I think I hadn't test the unit tests in a bit.

That's kind of ironic :P

Right :laughing:

Oh yeah, and an example notebook, published on kaggle, is probably a very good idea :) However, I believe lots of people look for inspiration on medium and towardsdatascience. Might be a good idea to post in one of these channels as well.

Of course, if you have a use-case ready (maybe connected to a publication), it would be perfect

andreped commented 2 years ago

Of course, if you have a use-case ready (maybe connected to a publication), it would be perfect

Right now I don't, but I have been experimenting with improved designs. In the literature, GANs seem to be the state-of-the-art method, but in my experience, GAN-based stain normalization methods seem to overfit too much to the task and dataset used. But if we get something working, and we benchmark it on some datasets, we could publish it with the tool, if you are interested. Are you collaborating with any researchers interested in this task?

carloalbertobarbano commented 2 years ago

I think I (and my lab) would be interested in doing some research for (not only) GAN-based approaches for stain normalization. So yes, we could definitely work together. We have ongoing collaborations with local hospitals (which led to https://arxiv.org/abs/2101.09991)

andreped commented 2 years ago

I think I (and my lab) would be interested in doing some research for (not only) GAN-based approaches for stain normalization. So yes, we could definitely work together.

I think GANs are great, but I see limitations in generalization and would therefore prefer to have a good old-fashioned algorithm as I would trust it more, similar to Macenko just more robust. So yeah, lets keep in touch, and if you start working on a use case and have an idea I am willing to contribute. Not just to the TF implementation :P

We have ongoing collaborations with local hospitals (which led to https://arxiv.org/abs/2101.09991)

Oh, thats funny! We have been working on similar tasks and published an open, annotated dataset ourselves :P Paper: https://www.frontiersin.org/articles/10.3389/fmed.2021.816281/full Dataset: https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/TLA01U

Dataset is probably of relevant for your group as well :]

carloalbertobarbano commented 2 years ago

That's great, thanks for the links! They might be useful for us, especially for the segmentation task.

I think GANs are great, but I see limitations in generalization and would therefore prefer to have a good old-fashioned algorithm as I would trust it more, similar to Macenko just more robust. So yeah, lets keep in touch, and if you start working on a use case and have an idea I am willing to contribute. Not just to the TF implementation :P

Sure, let's keep in touch. For the torchstain side, I wanted to also implement other normalization algorithms such as Vahadane etc. And for the research side yes, I think there is room for lots of work and improvement :+1:

andreped commented 2 years ago

Oh yeah, Vahadane would be great! That way torchstain could be a replacement to the now deprecated and popular StainTools. We could also consider adding augmentation options in the future. Quite easy to do, especially as this was already done in StainTools for Macenko and Vahadane.