Can FiT3D be used for fine-tuning of SAM?

Hi @mappro6, thanks for your interest in our work. Yes, in principle FiT3D can be used to fine-tune part of SAM, namely its image encoder. In detail:

In the first stage, use the image features from the SAM encoder as targets to train the feature Gaussians. In the second stage, use the features rendered from the feature Gaussians trained in the first stage as supervision to fine-tune the SAM encoder.

Then replace the original SAM encoder with the fine-tuned encoder and keep the other modules (e.g. the mask decoder and the prompt encoder) unchanged.
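
The last step can be done at the checkpoint level. Below is a minimal sketch, not code from the FiT3D repository. It assumes that the full SAM state dict stores the encoder weights under the key prefix `image_encoder.` and that the fine-tuned encoder was saved on its own without that prefix. The helper name `swap_encoder_weights` is hypothetical, and it works on plain dicts so the idea can be checked without loading a real checkpoint.

```python
from typing import Any, Dict

# Assumption: encoder weights in the full SAM state dict use this key prefix.
ENCODER_PREFIX = "image_encoder."


def swap_encoder_weights(
    sam_state: Dict[str, Any],
    finetuned_encoder_state: Dict[str, Any],
    prefix: str = ENCODER_PREFIX,
) -> Dict[str, Any]:
    """Return a copy of sam_state in which only the encoder entries are replaced.

    Hypothetical helper: every key that does not start with `prefix`
    (e.g. prompt encoder and mask decoder weights) is kept as-is.
    """
    # Encoder parameter names the original checkpoint expects, prefix stripped.
    expected = {k[len(prefix):] for k in sam_state if k.startswith(prefix)}
    provided = set(finetuned_encoder_state)
    if expected != provided:
        missing = sorted(expected - provided)
        unexpected = sorted(provided - expected)
        raise KeyError(
            f"encoder key mismatch: missing={missing}, unexpected={unexpected}"
        )

    merged = dict(sam_state)  # shallow copy, the input dict is not modified
    for name, value in finetuned_encoder_state.items():
        merged[prefix + name] = value
    return merged


if __name__ == "__main__":
    # Toy stand-ins for real tensors, only to show which entries change.
    original = {
        "image_encoder.w": "original",
        "prompt_encoder.w": "original",
        "mask_decoder.w": "original",
    }
    print(swap_encoder_weights(original, {"w": "finetuned"}))
```

With real checkpoints, the two dicts would typically come from `torch.load(...)`, and the merged result would be passed to the SAM model's `load_state_dict(...)`. The key check is there so that an encoder of a different size (for example a different ViT variant) fails loudly instead of being partially loaded.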