arcadelab / FastSAM3D_slicer

A 3D Slicer extension for FastSAM3D
https://arxiv.org/abs/2403.09827

About Inference #2

Closed jianjun0407 closed 6 months ago

jianjun0407 commented 6 months ago

In the published FastSAM3D code, I found that the test function is validation.py, which requires the label image of the test image in order to run. However, the test function called from 3D Slicer apparently supports user-defined prompts without the label image of the test image. Can the test function used in 3D Slicer be published?

skill-diver commented 6 months ago

What do you mean by the test function?

jianjun0407 commented 6 months ago

First of all, thank you for your quick reply! The test function here refers to validation.py in FastSAM3D (https://github.com/arcadelab/FastSAM3D). I found that even infer.sh calls validation.py, and while validation.py runs it needs the label image corresponding to the test data (see the attached screenshot). But in 3D Slicer, as I understand it, FastSAM3D_slicer reads a test image, the user clicks a few points on the region of interest, the algorithm is called, and the corresponding segmentation result is produced, much like the SAM demo (https://segment-anything.com/demo). In this scenario the label image corresponding to the test image should not be needed. I would like to ask whether the inference function for this scenario can be published, because we do not have label images for every image.

swedfr commented 6 months ago

validation.py needs label images because we use them to compute the Dice score between the label and the predicted mask. Since this extension is for visualization, it does not compute Dice, so you can use our model in 3D Slicer directly with prompts and do not need to provide a label.

jianjun0407 commented 6 months ago

Indeed, computing IoU and Dice metrics is a major reason the label images are needed. However, without the label image there is another problem: transforming the coordinates of the prompt. Specifically, when I was debugging the FastSAM3D source code (https://github.com/arcadelab/FastSAM3D) without a label image, the two variables batch_points and batch_labels shown in the attached screenshot were no longer sampled from the label image, so I entered their values manually.

The problem I ran into is that I display the test image in ITK-SNAP, pick the region of interest, record the coordinates of a few points, and assign them to batch_points and batch_labels. However, the coordinate system displayed by ITK-SNAP may differ from the coordinate system used by the algorithm, so the prompt ends up in the wrong place.

To summarize, I would like to ask how the variables batch_points and batch_labels are assigned when FastSAM3D is used inside 3D Slicer, because in the FastSAM3D source code these two variables are sampled from the label image rather than provided by the user. For reference, a minimal sketch of how I set them manually is below.
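(The tensor shapes and the include/exclude convention here are my assumptions based on SAM-style prompting, not taken from the FastSAM3D code; the coordinate values are placeholders.)

```python
import torch

# Hypothetical manual prompt tensors (shapes are an assumption):
#   batch_points -> (batch, num_points, 3) voxel coordinates
#   batch_labels -> (batch, num_points), 1 = include point, 0 = exclude point
batch_points = torch.tensor([[[64.0, 80.0, 32.0],
                              [70.0, 85.0, 30.0]]])  # two clicks, placeholder values
batch_labels = torch.tensor([[1, 1]])                # both clicks are foreground
```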

swedfr commented 6 months ago

For the transformation between RAS and XYZ (voxel) coordinates (I don't know which coordinates ITK-SNAP uses, but 3D Slicer uses RAS), you can use the affine matrix stored in the header of the NIfTI file, which is also what this 3D Slicer extension uses. Here is the code: `coords = np.round(np.linalg.inv(self.logic.affine) @ np.array(coords).T)`. You can use this to transform the point coordinates and run inference without labels. As for points_labels, it is just a binary value used to distinguish include points from exclude points. Also, if you use this 3D Slicer extension, you can place the points directly on the image without any transformation and get the mask.
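For example, a minimal sketch of reading that affine yourself (using nibabel is an assumption here; the extension itself already holds the matrix in self.logic.affine, and the file name is a placeholder):

```python
import numpy as np
import nibabel as nib

img = nib.load("test_image.nii.gz")   # placeholder path to the test image
affine = img.affine                   # 4x4 voxel-index -> world (RAS) matrix
inv_affine = np.linalg.inv(affine)    # inverse goes world (RAS) -> voxel index
```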

swedfr commented 6 months ago

Also, the affine matrix is a 4x4 matrix, so to use it you need to append a 1 to the end of the coords vector; the first three values of the result are the XYZ coordinates.

swedfr commented 6 months ago

Also, I don't know whether this applies to ITK-SNAP, but for 3D Slicer the coordinates are in reverse order, which means that after computing the XYZ coordinates you need to reverse them like this: `coords = np.array([coords[2], coords[1], coords[0]])`.
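Putting the three comments together, a minimal end-to-end sketch (the file name and the example click are placeholders, and whether your viewer reports RAS or LPS coordinates is something to verify):

```python
import numpy as np
import nibabel as nib

inv_affine = np.linalg.inv(nib.load("test_image.nii.gz").affine)  # placeholder path

ras_point = np.array([12.5, -30.0, 45.0])        # example world-space click (placeholder)

homogeneous = np.append(ras_point, 1.0)          # 1) append 1: the affine is 4x4
voxel = np.round(inv_affine @ homogeneous)[:3]   # 2) keep the first three values
coords = np.array([voxel[2], voxel[1], voxel[0]], dtype=int)  # 3) reverse the order
```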

jianjun0407 commented 6 months ago

Yes, these explanations address exactly my doubts. I will try the 3D Slicer extension you provided next. Thank you.