Closed zzzyzh closed 8 months ago
Hi, thank you for your excellent work!
I have a small question about (a) in Figure 3: Is "Prompt: Class 4" a text, i.e. the name of the surgical instrument?
Another question is why you define dataloader in each epoch? I'm looking forward to your reply!
Hi,
Thanks for your interest in our work.
In SurgicalSAM, prompts are in the form of class IDs without any text content. These class IDs are represented by integer numbers, each corresponding to a specific class. You may refer to the code here to see the input of our model.
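To make the distinction concrete, here is a minimal sketch of what an integer class-ID prompt looks like compared with a text prompt. It is not the SurgicalSAM code: `NUM_CLASSES`, `EMBED_DIM`, `class_table`, and `encode_prompt` are hypothetical names, and the per-class vectors and the 1-based numbering are assumptions chosen only for illustration.

```python
from typing import Dict, List

NUM_CLASSES = 7  # assumption: number of instrument classes in the dataset
EMBED_DIM = 4    # assumption: tiny vector size, for illustration only

# Hypothetical lookup table holding one vector per class ID (1-based here).
class_table: Dict[int, List[float]] = {
    class_id: [float(class_id)] * EMBED_DIM
    for class_id in range(1, NUM_CLASSES + 1)
}


def encode_prompt(class_id: int) -> List[float]:
    """Map an integer class ID to its vector; no text is involved."""
    if isinstance(class_id, bool) or not isinstance(class_id, int):
        raise TypeError("the prompt must be an integer class ID, not text")
    if class_id not in class_table:
        raise ValueError(f"unknown class ID: {class_id}")
    return class_table[class_id]


# "Prompt: Class 4" corresponds to the integer 4, not to an instrument name.
prompt_vector = encode_prompt(4)
```

Passing a string such as `"Class 4"` raises a `TypeError` in this sketch, which mirrors the point that the prompt carries no text content.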
Another question is why you define dataloader in each epoch?
During training, we leverage pre-computed offline SAM image embeddings. To achieve data augmentation in an offline manner, we apply diverse transformations to augment original images, compute the SAM image embeddings of the augmented images, and save them into different versions (each version is an augmented copy of the whole training set). Each epoch utilises the training data of a specific version, and so we define a new dataloader in each epoch.
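The per-epoch dataloader can be sketched roughly as follows. This is a framework-free illustration, not the repository's training loop: `load_version`, `make_loader`, and `train` are hypothetical helpers, the embeddings are random stand-ins for files that would be read from disk, and cycling through versions with `epoch % num_versions` is an assumption about how a version is chosen. In PyTorch, `make_loader` would typically return a `torch.utils.data.DataLoader` built on a dataset that reads the chosen version.

```python
import random
from typing import Iterator, List, Tuple

Sample = Tuple[List[float], int]  # (pre-computed embedding, class ID)


def load_version(version: int, num_samples: int = 6) -> List[Sample]:
    """Stand-in for reading one augmented copy of the training set from disk."""
    rng = random.Random(version)  # seeded per version, so each copy differs
    return [([rng.random() for _ in range(4)], i % 3) for i in range(num_samples)]


def make_loader(version: int, batch_size: int, seed: int) -> Iterator[List[Sample]]:
    """Build a fresh shuffled, batched iterator over a single version."""
    samples = load_version(version)
    random.Random(seed).shuffle(samples)
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def train(num_epochs: int, num_versions: int, batch_size: int = 2) -> List[int]:
    """Run a dummy training loop and return the version used in each epoch."""
    used_versions = []
    for epoch in range(num_epochs):
        version = epoch % num_versions  # assumption: cycle through the versions
        loader = make_loader(version, batch_size, seed=epoch)  # new loader per epoch
        for batch in loader:
            pass  # the forward and backward passes would go here
        used_versions.append(version)
    return used_versions
```

Because every version is a separate set of saved embeddings, the loader has to be rebuilt whenever the version changes, which is why it is created inside the epoch loop rather than once before it.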
You could also perform data augmentation and compute SAM image embeddings online during training, which could potentially give better results due to more diverse augmentations.
Thank you for your reply!
One more small request: I have sent you an email asking for your preprocessed data. It would be great if you could share it, if that's convenient for you.