Closed songsong695 closed 1 year ago
Greetings! We appreciate your interest in our work.
(1) Our prompt methods include the box prompt applied to the entire image and point sampling across images using SAM prompts.
(2) To resize the image, you can utilize OpenCV.
Thanks for your reply. Regarding the second issue, I tried to adjust the input image size in the config file, but the training process failed and threw an error. Could you kindly inform me whether your network currently supports training with resolutions other than 1024? Thank you again for your kind assistance.
I have the same question, looking forward to your reply, thanks!
I have the same question. Because of the shapes of SAM's pre-trained weights (a 1024 input produces 64×64 feature maps), some layers' input sizes cannot be changed. I still don't know how to reduce the input image size (I need it < 1024, because I only have a 3090 GPU...)
@jiachen0212 @buriedms @syp66 Our approach uses an adapter-based method to sidestep time-consuming fine-tuning of large models. We reuse the pre-trained weights of the original SAM model, which are designed to process inputs at a resolution of 1024, so modifying the input size is challenging. We are still investigating memory-efficient SAM models. At the current stage, we recommend upscaling the input image to 1024 with PIL or OpenCV instead of tweaking the network input size. If you run into GPU memory constraints, you can try a smaller version of the SAM model (e.g. ViT-L) or switch to a different GPU.
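If you resize images yourself, note that SAM's own preprocessing scales the longest side to 1024 and zero-pads the rest to a 1024×1024 square. A minimal NumPy sketch of that idea (nearest-neighbour resize via index lookup; in practice you would use PIL/OpenCV interpolation):

```python
import numpy as np

TARGET = 1024  # resolution SAM's pre-trained weights expect

def pad_to_square(img: np.ndarray, target: int = TARGET) -> np.ndarray:
    """Scale the longest side to `target`, then zero-pad bottom/right
    to a target x target square, mirroring SAM's preprocessing."""
    h, w = img.shape[:2]
    scale = target / max(h, w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resize via index lookup (stand-in for cv2/PIL resize).
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = img[rows[:, None], cols]
    out = np.zeros((target, target) + img.shape[2:], dtype=img.dtype)
    out[:new_h, :new_w] = resized
    return out

img = np.ones((600, 800, 3), dtype=np.uint8)  # dummy 600x800 image
padded = pad_to_square(img)
print(padded.shape)  # (1024, 1024, 3)
```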
@jiachen0212 Have you solved this problem? I also only have one 3090 GPU...
After changing the input size, some parameters in the pre-trained ViT model no longer match, such as the relative position encodings. To address this, is it feasible to let the mismatched parameters be re-initialized and participate in training?
The ViT mismatch may be caused by an incorrect configuration. Please make sure you downloaded the right pre-trained ViT checkpoint (ViT-B and ViT-H are different).
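If you still want to experiment with a different input size, one common (unofficial) workaround is to load only the pre-trained tensors whose shapes still match, and let the mismatched ones (e.g. relative position encodings) train from their fresh initialization. A framework-agnostic sketch with plain dicts of arrays; in PyTorch, this filtering step is typically followed by `load_state_dict(..., strict=False)`:

```python
import numpy as np

def filter_matching(pretrained: dict, model: dict) -> dict:
    """Keep only pretrained tensors whose shape matches the current model;
    mismatched entries (e.g. position encodings at a new input size) are
    dropped and stay at their random initialization."""
    kept = {k: v for k, v in pretrained.items()
            if k in model and v.shape == model[k].shape}
    skipped = sorted(set(pretrained) - set(kept))
    print("skipped (will be trained from scratch):", skipped)
    return kept

# Toy state dicts: the rel_pos table changes shape at a new input size.
pretrained = {"patch_embed.w": np.zeros((64, 3)), "rel_pos_h": np.zeros((127, 8))}
model      = {"patch_embed.w": np.zeros((64, 3)), "rel_pos_h": np.zeros((63, 8))}

loaded = filter_matching(pretrained, model)
print(sorted(loaded))  # ['patch_embed.w']
```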
Thanks for your excellent work. May I ask you two questions? (1) Did you use prompts to obtain the original SAM results in your paper? (2) What is the quickest way to change the input image size?