uncbiag / SimpleClick

SimpleClick: Interactive Image Segmentation with Simple Vision Transformers (ICCV 2023)
MIT License

Batching #6

Closed alexandrumeterez closed 1 year ago

alexandrumeterez commented 1 year ago

Hello! First of all, great work! I have one question that I couldn't answer from the code: is it possible, at inference time, to pass batches of images and clicks? For example, the input to the model would be a batch of 16 images, each with its own set of clicks (all sets having the same length).

Thanks!

qinliuliuqin commented 1 year ago

Hi, thanks for the question. In principle, batch inference is possible and would speed up automatic evaluation. However, it's not easy to modify the evaluation code to support it. The main issue is that different images may require different numbers of clicks, which is tricky to handle in the evaluation loop.
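One common way to deal with a varying number of clicks per image is to pad every click list to the length of the longest one and keep a mask that marks the real entries. The sketch below only illustrates that idea in plain Python. The `Click` layout `(y, x, is_positive)`, the `PAD` sentinel, and the `pad_clicks` helper are all hypothetical names chosen for this example, not part of the SimpleClick or RITM code.

```python
from typing import List, Tuple

# Hypothetical click layout for this sketch: (y, x, is_positive).
Click = Tuple[int, int, bool]

# Hypothetical sentinel marking a padded (non-existent) click.
PAD: Click = (-1, -1, False)


def pad_clicks(batch: List[List[Click]]) -> Tuple[List[List[Click]], List[List[bool]]]:
    """Pad each image's click list to the longest list in the batch.

    Returns the padded click lists and a parallel mask in which True
    marks a real click and False marks padding.
    """
    max_len = max((len(clicks) for clicks in batch), default=0)
    padded: List[List[Click]] = []
    mask: List[List[bool]] = []
    for clicks in batch:
        n_pad = max_len - len(clicks)
        padded.append(list(clicks) + [PAD] * n_pad)
        mask.append([True] * len(clicks) + [False] * n_pad)
    return padded, mask


# Two images: the first has one click, the second has three.
padded, mask = pad_clicks([
    [(10, 20, True)],
    [(5, 5, True), (7, 9, False), (1, 2, True)],
])
```

After padding, every image has the same number of click slots, so the clicks can be stacked along a batch dimension. The model-side code would still need to ignore the padded entries, which this sketch does not cover.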

alexandrumeterez commented 1 year ago

Hi @qinliuliuqin. Thanks for replying so fast.

I was thinking along the lines of the same number of clicks per image. Basically, the batch would have the shape (BS, C, H, W) for the images and (BS, N_clicks) for the clicks.

qinliuliuqin commented 1 year ago

@alexandrumeterez Thanks for the feedback. Yes, that's doable. Our code is based on RITM, which assumes BS=1 for inference. You may need to modify the code accordingly.
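Until the inference code itself is changed, a batch-style interface can be emulated by calling a BS=1 predictor once per image. This is a minimal sketch of that workaround and gives no speedup. `predict_one` is a hypothetical callable standing in for whatever single-image prediction function is used, and `predict_batch` is a name made up for this example.

```python
from typing import Callable, List, Sequence, TypeVar

ImageT = TypeVar("ImageT")
ClicksT = TypeVar("ClicksT")
MaskT = TypeVar("MaskT")


def predict_batch(
    predict_one: Callable[[ImageT, ClicksT], MaskT],
    images: Sequence[ImageT],
    clicks_per_image: Sequence[ClicksT],
) -> List[MaskT]:
    """Run a single-image predictor over a batch, one item at a time.

    `predict_one` is a hypothetical stand-in for a BS=1 predictor that
    takes one image and its clicks and returns one mask.
    """
    if len(images) != len(clicks_per_image):
        raise ValueError("images and clicks_per_image must have the same length")
    return [predict_one(image, clicks) for image, clicks in zip(images, clicks_per_image)]


# Dummy predictor for illustration only: returns the image name and click count.
def dummy_predict(image: str, clicks: List[tuple]) -> tuple:
    return (image, len(clicks))


results = predict_batch(dummy_predict, ["img_a", "img_b"], [[(1, 2)], [(3, 4), (5, 6)]])
```

Keeping the batch interface separate from the predictor means the loop can later be swapped for a true batched forward pass without changing the calling code.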