CAMMA-public / SSG-VQA

SSG-VQA is a Visual Question Answering (VQA) dataset on laparoscopic videos providing diverse, geometrically grounded, unbiased and surgical action-oriented queries generated using scene graphs.
http://camma.u-strasbg.fr/datasets

SSG-Dataset Confusion #2

Closed gauravsh0812 closed 2 days ago

gauravsh0812 commented 1 month ago

Hi there! I am a PhD student collaborating with Dr. E.J. Lee at The University of Arizona. I recently came across your work on SSG-VQA and found it fascinating. I am eager to explore and potentially contribute to this project. As I dug deeper into the documentation, I ran into some confusion. According to the documentation, the dataset is expected to include question-answer pairs, along with features for Regions of Interest (RoI) and cropped images sourced from the Cholec80 dataset. Could you kindly advise how I can access the images that were used as samples in the dataset? Your assistance would be greatly appreciated.

Thank you!

Flaick commented 1 month ago

Hello, thank you for your interest. In this repo, we provide the processed feature vectors for each surgical scene image. If you want to start from scratch, here are the steps.

  1. Download the CholecT45 dataset from https://github.com/CAMMA-public/cholect50/blob/master/docs/README-Downloads.md. We use the images from this dataset to build our training and testing sets.
  2. Use the feature extraction files under the utils folder to generate the feature vectors for the images.
     a. feat_extract_visual.py --> resnet visual feature vectors
     b. feature_extract_roi.py --> resnet visual feature vectors + scene-aware vectors. In this case, you will need scene graphs as additional input. The scene graphs we use will be provided soon.
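
For anyone following these steps, it can help to first pair each downloaded frame with the file where its feature vector will be stored. The snippet below is only a rough sketch of that bookkeeping and is not taken from the repo's scripts: the function name `plan_feature_jobs`, the per-video folder layout, the `.png` frame extension, and the `.npy` output extension are all assumptions made for illustration. Check the actual arguments expected by the scripts in the utils folder before relying on it.

```python
from pathlib import Path
from typing import Iterator, Tuple


def plan_feature_jobs(
    image_root: Path,
    feature_root: Path,
    image_suffix: str = ".png",    # assumed frame extension
    feature_suffix: str = ".npy",  # assumed feature file extension
) -> Iterator[Tuple[Path, Path]]:
    """Yield (image_path, feature_path) pairs.

    The output path mirrors the image's location relative to image_root,
    so frames from different videos do not overwrite each other.
    This helper is hypothetical and not part of the SSG-VQA repo.
    """
    for image_path in sorted(image_root.rglob(f"*{image_suffix}")):
        relative = image_path.relative_to(image_root)
        feature_path = (feature_root / relative).with_suffix(feature_suffix)
        yield image_path, feature_path
```

Each pair could then be handed to whichever extraction step applies, for example by looping over `plan_feature_jobs(Path("CholecT45/data"), Path("features"))`, where both paths are placeholders for your local directories.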