YavorGIvanov / sam.cpp

MIT License

About unit test inference #17

Closed sanbuphy closed 9 months ago

sanbuphy commented 9 months ago

Thank you for the great work!

I would like to ask whether there is a unit test that lets me run inference on a single image (just the function). If not, can I contribute a PR?

Also, does sam.cpp support the vit_h model, or does it only support vit_b?
Thank you very much.

YavorGIvanov commented 9 months ago

Sam.cpp supports all checkpoints of SAM - Vit_B, Vit_L and Vit_H

I am not really sure what you mean by "unit test that can make me inference the single image (just the function)?". Can you explain what the exact input and output of such a unit test would be? The current example lets you change images by dropping new ones onto it, but if you don't change the initial image it segments only that one image, by choosing points or hovering.

sanbuphy commented 9 months ago

Regarding "Sam.cpp supports all checkpoints of SAM - Vit_B, Vit_L and Vit_H": thanks, that's cool!

Regarding "unit test that can make me inference the single image (just the function)?": the purpose of this unit test is to keep dependencies minimal, so that I can compile a single file by including only the appropriate header files. That file would use OpenCV to read an image and expose a function that takes click operations and outputs the result. A second file would simulate multiple click actions (e.g., two clicks in succession). The inputs needed for the first and second clicks are different: the second click also requires an intermediate variable called "logit". From the caller's side, I only need to pass the image and the points to generate the final mask. Afterward, the program can save the result as a local image using OpenCV. The result is a standalone, simplified executable.
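The two-click flow described above could be sketched roughly as follows. This is only an illustration of the state that has to be carried between clicks: `Click`, `PredictFn` and `ClickSession` are hypothetical names, and `PredictFn` is a stand-in for the model call, not sam.cpp's actual API. It assumes the predictor returns one logit per pixel and that a pixel belongs to the mask when its logit is above zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// One user click; foreground == false marks a background (exclude) point.
struct Click { float x; float y; bool foreground; };

// Stand-in for the model call (hypothetical, NOT sam.cpp's API).
// prev_logits is nullptr on the first click and the previous output afterwards.
using PredictFn = std::function<std::vector<float>(
    const std::vector<Click> & clicks, const std::vector<float> * prev_logits)>;

// Keeps the clicks and the last logits so later clicks can refine the mask.
struct ClickSession {
    std::vector<Click> clicks;
    std::vector<float> logits; // empty until the first prediction

    std::vector<uint8_t> add_click(const Click & c, const PredictFn & predict) {
        assert(predict && "a predictor callback must be provided");
        clicks.push_back(c);
        // First click: no previous logits. Later clicks: pass the stored ones.
        const std::vector<float> * prev = logits.empty() ? nullptr : &logits;
        logits = predict(clicks, prev);

        // Threshold the logits at zero to get a binary 0/255 mask.
        std::vector<uint8_t> mask(logits.size());
        for (std::size_t i = 0; i < logits.size(); ++i) {
            mask[i] = logits[i] > 0.0f ? 255 : 0;
        }
        return mask;
    }
};
```

With this shape, a test only has to construct a `ClickSession`, call `add_click` once per simulated click, and save or compare the returned mask.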

YavorGIvanov commented 9 months ago

So there are two possible options:

  1. Basically you are talking about a new, similar example without SDL, but using OpenCV? Currently we use stb_image to load the image and save it, and SDL + OpenGL to render the image on the screen. If you want to create an additional OpenGL example without SDL, that is fine, but I don't understand what introducing OpenCV would achieve here.

  2. Creating a simple unit test doesn't actually require OpenCV, SDL, or OpenGL. It can work with only stb_image, which is already in the repository. You can check out the ggml SAM example https://github.com/ggerganov/ggml/tree/master/examples/sam. There we don't have any GUI, just a hardcoded point and an input image taken from the command-line arguments. The example only uses stb_image.
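As a rough sketch of the pieces such a GUI-free test needs around the model call: one helper to read a click from a command-line argument and one to compare the produced mask against a stored reference. Both `parse_point` and `mask_iou` are hypothetical helpers written for this illustration; they are not taken from sam.cpp or the ggml example, and the "x,y" argument format is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: parse a click given as "x,y" (e.g. from argv).
// Returns false unless the text is exactly two comma-separated numbers.
bool parse_point(const std::string & text, float & x, float & y) {
    char trailing = 0;
    // The extra %c detects trailing garbage such as "10,20abc".
    return std::sscanf(text.c_str(), "%f,%f%c", &x, &y, &trailing) == 2;
}

// Hypothetical helper: intersection over union of two binary masks
// (any non-zero byte counts as "inside"). Two empty masks count as identical.
double mask_iou(const std::vector<uint8_t> & a, const std::vector<uint8_t> & b) {
    assert(a.size() == b.size());
    std::size_t inter = 0;
    std::size_t uni = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const bool in_a = a[i] != 0;
        const bool in_b = b[i] != 0;
        inter += (in_a && in_b) ? 1 : 0;
        uni += (in_a || in_b) ? 1 : 0;
    }
    return uni == 0 ? 1.0 : static_cast<double>(inter) / static_cast<double>(uni);
}
```

A test could then load the image and a reference mask with stb_image, run the model on the parsed point, and require `mask_iou` to stay above a chosen threshold rather than demanding a bit-exact match.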

sanbuphy commented 9 months ago

Yes! I actually need the second one. Thank you very much, I will give it a try.

But I have another question: how many milliseconds does a single click take on a home-use CPU? It seems that the initial loading of the code takes around 2 seconds. @YavorGIvanov

YavorGIvanov commented 9 months ago

Depending on the CPU and the number of threads used (make sure you tune them), it should be around 50-60 ms.
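One way to tune the thread count is to time the per-click call at a few candidate values and keep the fastest. The sketch below only shows the measuring side; `median_ms` and `thread_candidates` are hypothetical helpers, and how the thread count is actually passed to the model is not shown here because it depends on the example being used.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Run fn `runs` times and return the median wall-clock time in milliseconds.
// The median is less sensitive to a slow first (cold) run than the mean.
double median_ms(const std::function<void()> & fn, int runs) {
    assert(runs > 0);
    std::vector<double> times;
    times.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        times.push_back(std::chrono::duration<double, std::milli>(stop - start).count());
    }
    std::sort(times.begin(), times.end());
    return times[times.size() / 2];
}

// Candidate thread counts to try: 1, 2, 4, ... and finally the number of
// hardware threads reported by the system (treated as 1 if unknown).
std::vector<int> thread_candidates() {
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) {
        hw = 1; // hardware_concurrency() may return 0 when it cannot tell
    }
    std::vector<int> out;
    for (unsigned n = 1; n < hw; n *= 2) {
        out.push_back(static_cast<int>(n));
    }
    out.push_back(static_cast<int>(hw));
    return out;
}
```

For each value from `thread_candidates()`, configure the model with that thread count, pass the single-click call to `median_ms` as a lambda, and compare the results. More threads is not always faster, so it is worth measuring rather than defaulting to the maximum.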

sanbuphy commented 9 months ago

Wow, that's cool! Even for ViT-H, is it also just a few tens of milliseconds? That's great!