[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"
Hello! Thanks for your amazing work!
I'm wondering if there are instructions for reproducing the LViT-TW model. I've read the arXiv paper and found that it doesn't give detailed experiment settings for the LViT-TW experiments. I've also checked the code and found that in LViT the text and image information are tightly coupled, so I don't know how to run the model without providing text information.
Could you give me some detailed information on how to run LViT-TW? Would it be a workable solution to use the same text for every image while running LViT-T?
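To make the second question concrete, here is a minimal sketch of what I mean by "the same text for every image": a thin wrapper that attaches one fixed placeholder string to every sample before it reaches the model. The class name `ConstantTextWrapper`, the `(image, text, mask)` sample layout, and the placeholder string are all my assumptions for illustration, not the repo's actual dataset API.

```python
# Hypothetical sketch: supply an identical text annotation for every image
# when running LViT-T without real per-image text. The sample layout
# (image, text, mask) is an assumption, not LViT's actual interface.

PLACEHOLDER_TEXT = "no text annotation available"

class ConstantTextWrapper:
    """Wraps an image/mask dataset and attaches a fixed text string to each sample."""

    def __init__(self, base_dataset, text=PLACEHOLDER_TEXT):
        self.base = base_dataset
        self.text = text

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        image, mask = self.base[idx]
        return image, self.text, mask
```

Would feeding the model a constant text like this be a fair approximation of LViT-TW, or does the text branch need to be disabled entirely?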