Closed: ShreyasFadnavis closed this issue 4 months ago
Hi all, thanks for this fantastic work!
I wanted to ask what would be the best way to apply the pretrained model on a new classification dataset? https://github.com/med-air/Endo-FM/blob/main/scripts/test_finetune_polypdiag.sh
I see the script here, but am not sure whether it would work on a single video or a single image.
Is it possible to do this from Python rather than via the command line?
Thanks in advance :)
Hi, this model gives metrics as output and does not store any predicted images in an output directory, so how can we check the results on a video? If possible, could you share the code for that? It would be helpful for us too.
Hi, @ShreyasFadnavis Thanks for your interest!
I am assuming that you want to fine-tune the pretrained Endo-FM model on your own dataset? You can refer to this script for fine-tuning: https://github.com/med-air/Endo-FM/blob/main/scripts/eval_finetune_polypdiag.sh. You need to do the following steps:

1. Prepare your dataset under `DATA_PATH`, and put the videos under the folder `${DATA_PATH}/videos`
2. Prepare the split files under `${DATA_PATH}/splits`, namely `train.txt` and `val.txt`
3. Set `num_labels` to the number of classes for your task

Moreover, if you want to apply Endo-FM to image tasks, you can unsqueeze the image input to make it a 1-frame video.
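The image-as-1-frame-video trick can be sketched like this (the clip layout `(B, C, T, H, W)` and the spatial size are my assumptions about what the model expects, not taken from the repo):

```python
import torch

# Assumed input layout for the video model: (batch, channels, frames, height, width).
image = torch.randn(3, 224, 224)        # a single RGB image, (C, H, W)
clip = image.unsqueeze(0).unsqueeze(2)  # -> (1, 3, 1, 224, 224): a batch of one 1-frame "video"
print(clip.shape)  # torch.Size([1, 3, 1, 224, 224])
```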
For inference on only 1 video/image, unfortunately this is not currently supported here; maybe the easiest way is to use a sample list containing only that one sample and run the code via https://github.com/med-air/Endo-FM/blob/main/scripts/test_finetune_polypdiag.sh
Alternatively, you can change the dataset loader, for example, load the specified video/image as the only sample in the dataset. Hope this is helpful to you~
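A minimal sketch of such a single-sample dataset (the class name, clip shape, and label are hypothetical; in practice you would decode the video the same way the repo's own loader does):

```python
import torch
from torch.utils.data import Dataset

class SingleVideoDataset(Dataset):
    """Hypothetical loader wrapping one pre-decoded clip as a one-sample dataset."""

    def __init__(self, clip, label=0):
        self.clip = clip    # tensor shaped (C, T, H, W), decoded elsewhere
        self.label = label

    def __len__(self):
        return 1            # exactly one sample

    def __getitem__(self, idx):
        return self.clip, self.label

# Usage sketch: a stand-in for a decoded 8-frame clip.
clip = torch.randn(3, 8, 224, 224)
ds = SingleVideoDataset(clip)
print(len(ds))  # 1
```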
Hi @Kyfafyd - this is very helpful! Lastly, is there a way to get frame-level embeddings out of Endo-FM? Something like the following, since you build on top of DINO:
Image -> EndoFM -> Embedding
Let me know if this is not clear?
Thanks in advance😊
Hi @ShreyasFadnavis It is clear, you may obtain the frame-level embeddings from Endo-FM after this line: https://github.com/med-air/Endo-FM/blob/c0979d2d235a61f85daf171a73a34f47683a67d9/models/timesformer.py#L339 with the code:
x = rearrange(x, '(b t) n m -> b t n m', b=B, t=T)
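For context, that `rearrange` regroups the stacked frame tokens from `(B*T, N, M)` back to `(B, T, N, M)`; the plain-`reshape` sketch below is equivalent (the dimension sizes are illustrative, and treating token 0 as a per-frame CLS token is my assumption about the TimeSformer token layout):

```python
import torch

B, T, N, M = 2, 8, 197, 768   # batch, frames, tokens per frame, embed dim (illustrative)
x = torch.randn(B * T, N, M)  # frame tokens stacked along the batch axis

# Same regrouping as rearrange '(b t) n m -> b t n m':
x = x.reshape(B, T, N, M)

# One possible per-frame embedding: the first token of each frame (assumed CLS).
frame_emb = x[:, :, 0]        # -> (B, T, M)
print(frame_emb.shape)  # torch.Size([2, 8, 768])
```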
Thanks @Kyfafyd ! Closing this issue for now :)
@Kyfafyd Quick question: Is it possible to provide sample .txt files that the model expects for train, val, and test? I am confused about where the ground truth labels corresponding to each video should be provided.
Thanks!
Hi, you can refer to this txt file:
https://mycuhk-my.sharepoint.com/:t:/g/personal/1155167044_link_cuhk_edu_hk/EXvfI1xbf2FAguh3t7pXr5IBKw98D3L9ZBMmFvKQ5A4x2w?e=abcEqp
with the format `path,label` (one line per sample)
Thanks @Kyfafyd !