KaihuaTang / Scene-Graph-Benchmark.pytorch

A new codebase for popular Scene Graph Generation methods (2020). Visualization and scene graph extraction on custom images/datasets are provided. It is also the PyTorch implementation of the paper "Unbiased Scene Graph Generation from Biased Training" (CVPR 2020).
MIT License

Detecting relationships between gt-boxes #72

Open zhangchenghua123 opened 3 years ago

zhangchenghua123 commented 3 years ago

❓ Questions and Help

Hello. I ran bottom-up attention (source: https://github.com/airsplay/py-bottom-up-attention; my understanding is that it is a Faster R-CNN pretrained on the Visual Genome dataset) over the whole COCO 2014 dataset and obtained gt-boxes with a label for each box, where each box is represented by a 2048-dimensional visual feature vector. Now I want to detect the relationships between these gt-boxes, which should be the PredCls setting. As you said, the MOTIFS code is quite flashy, so I'd like to build on your work instead. I'm currently downloading the "PredCls Download" from the Examples of Pretrained Causal MOTIFS-SUM models. What should I do next: modify the configuration following the steps in the evaluation section, or follow the steps in "test on custom images"? Thanks a lot!

zhangchenghua123 commented 3 years ago

I looked at maskrcnn_benchmark/data/datasets/visual_genome.py, and it seems that gt_boxes cannot be provided for custom images. Am I right?

Einstone-rose commented 3 years ago

Hi, I've been looking into this recently as well. I also want to use the 36 fixed bottom-up features with the pretrained model to generate the relationships between them. In principle, it should be possible to generate relationships directly from the bottom-up features (which contain the gt_boxes and labels) using the pretrained model. The input would then no longer be images, though, so some modifications are needed (to the dataloader) so that the bottom-up features (with gt_boxes and labels) are fed in directly. That's my understanding.
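To make that concrete, here is a rough, hypothetical sketch of the kind of dataloader change I have in mind: each item returns the pre-extracted bottom-up features together with a BoxList built from the gt_boxes and labels. The .npz layout and key names below are my own assumptions, not something the repo provides:

```python
import numpy as np
import torch
from torch.utils.data import Dataset
from maskrcnn_benchmark.structures.bounding_box import BoxList

class BottomUpFeatureDataset(Dataset):
    """Hypothetical dataset: one .npz file per image holding the 36 bottom-up
    boxes, their class labels, and the 2048-d region features."""

    def __init__(self, feature_files):
        self.feature_files = feature_files  # list of .npz paths, one per image

    def __len__(self):
        return len(self.feature_files)

    def __getitem__(self, index):
        data = np.load(self.feature_files[index])
        # Assumed keys: 'boxes' (36, 4) in pixel xyxy, 'labels' (36,),
        # 'features' (36, 2048), plus scalars 'img_w' and 'img_h'.
        boxes = torch.as_tensor(data["boxes"], dtype=torch.float32)
        target = BoxList(boxes, (int(data["img_w"]), int(data["img_h"])), mode="xyxy")
        target.add_field("labels", torch.as_tensor(data["labels"], dtype=torch.int64))

        features = torch.as_tensor(data["features"], dtype=torch.float32)
        # The relation head would then consume `features` directly instead of
        # re-extracting ROI features from the image.
        return features, target, index
```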

zhangchenghua123 commented 3 years ago


Yes, that's my thinking too. While reading the MOTIFS code I also tried modifying the dataloader to use the bottom-up features as the dataset. The original code obtains the VG gt_boxes by reading http://cvgl.stanford.edu/scene-graph/dataset/VG-SGG.h5, and the boxes in that file have been rescaled, so I'm not sure how to rescale the gt_boxes correctly. Is there any documentation for VG-SGG.h5?
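My current guess (which may well be wrong, hence the question) is that the boxes in VG-SGG.h5 are stored as (xc, yc, w, h) at a fixed BOX_SCALE of 1024 relative to the longer image side, in which case mapping them back to pixel coordinates would look roughly like this (the key names, including img_to_first_box/img_to_last_box for the indices, are also just my reading):

```python
import h5py
import numpy as np

BOX_SCALE = 1024  # assumption: boxes in VG-SGG.h5 are stored at this scale

def boxes_for_image(h5_path, first_box, last_box, img_w, img_h):
    """Read one image's boxes and map them back to pixel xyxy coordinates.
    first_box/last_box would come from img_to_first_box / img_to_last_box
    (assumed inclusive)."""
    with h5py.File(h5_path, "r") as f:
        # assumed layout: (xc, yc, w, h), normalised so the longer side == BOX_SCALE
        boxes = f["boxes_%d" % BOX_SCALE][first_box:last_box + 1].astype(np.float32)
    # (xc, yc, w, h) -> (x1, y1, x2, y2), still at BOX_SCALE
    boxes[:, :2] = boxes[:, :2] - boxes[:, 2:] / 2.0
    boxes[:, 2:] = boxes[:, :2] + boxes[:, 2:]
    # undo the BOX_SCALE normalisation
    return boxes / BOX_SCALE * max(img_w, img_h)
```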

zhangchenghua123 commented 3 years ago

@Einstone-rose Maybe I should start from here: https://github.com/danfeiX/scene-graph-TF-release/tree/master/data_tools. Thanks for sharing your thoughts.

Einstone-rose commented 3 years ago

Yes it is. BTW, I have some questions. (a) The bottom-up features I obtained from https://github.com/peteanderson80/bottom-up-attention only contain bounding boxes and region features, without class information. Where can I find bottom-up features that include class labels (and ideally attribute information as well)? (b) When I run the SGDet task with custom images (from the VG dataset) as input, the code crashes; I have seen all three of the following: (1) AttributeError: 'VGDataset' object has no attribute 'ind_to_classes'; (2) AttributeError: 'VGDataset' object has no attribute 'img_info'; (3) IndexError: index out of range. They all occur while the data is being loaded. Have you run into these? Happy to discuss, thanks.

zhangchenghua123 commented 3 years ago

@Einstone-rose Yes, as you said, the bottom-up features provided by the author do not contain class and attribute information. In the original repo: (a) you can check tools/demo.ipynb. There, I think the variables "cls_prob" and "attr_prob" contain the class scores and attribute scores, respectively, because the subsequent "cls" and "attr_conf" are computed from them; you can save "cls" as the class and "attr" as the attribute. (b) I haven't tried SGDet yet.
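For reference, the post-processing I described in (a) is roughly the following; this is only a sketch from my reading of the demo notebook, and the array shapes and the background/"none" column at index 0 are assumptions:

```python
import numpy as np

def probs_to_labels(cls_prob, attr_prob, attr_thresh=0.1):
    """Sketch: turn per-box probabilities into discrete class/attribute labels.
    Assumes cls_prob is (num_boxes, num_classes) and attr_prob is
    (num_boxes, num_attrs), each with a background/"none" column at index 0."""
    cls = np.argmax(cls_prob[:, 1:], axis=1) + 1      # best non-background class per box
    attr = np.argmax(attr_prob[:, 1:], axis=1) + 1    # best attribute per box
    attr_conf = attr_prob[np.arange(len(attr)), attr]
    attr[attr_conf < attr_thresh] = 0                 # 0 = "no confident attribute"
    return cls, attr
```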

By the way, when I used the original bottom-up-attention it would only run on the CPU for me, which was too slow, so I found this repo, https://github.com/airsplay/py-bottom-up-attention, which is implemented in PyTorch with the same model and weights; you can try it. I also ran demo/demo_feature_extraction_attr_given_box.ipynb to get the classes and attributes. Good luck!

Einstone-rose commented 3 years ago

Thanks a lot, and thanks to https://github.com/airsplay/py-bottom-up-attention, which is exactly what I was looking for. With that repo it is easy to extract the 36 fixed features, bounding boxes, class labels, and attributes by running py-bottom-up-attention/demo/detectron2_mscoco_proposal_maxnms.py. Next, I can use the obtained bounding boxes as input to the SGCls model to produce the relationships between the objects. #74

Einstone-rose commented 3 years ago

BTW, I find that the features extracted with https://github.com/airsplay/py-bottom-up-attention differ slightly from those produced by the original repo https://github.com/peteanderson80/bottom-up-attention; the author has explained the reason. Also, if I want to download pre-extracted features directly, can I use the ones from https://github.com/airsplay/lxmert (VQA section, step 3)? That README says: download the Faster R-CNN features for MS COCO train2014 (17 GB) and val2014 (8 GB) images (VQA 2.0 is collected on the MS COCO dataset); the image features are also available on Google Drive and Baidu Drive (see Alternative Download for details):

    mkdir -p data/mscoco_imgfeat
    wget --no-check-certificate https://nlp1.cs.unc.edu/data/lxmert_data/mscoco_imgfeat/train2014_obj36.zip -P data/mscoco_imgfeat
    unzip data/mscoco_imgfeat/train2014_obj36.zip -d data/mscoco_imgfeat && rm data/mscoco_imgfeat/train2014_obj36.zip
    wget --no-check-certificate https://nlp1.cs.unc.edu/data/lxmert_data/mscoco_imgfeat/val2014_obj36.zip -P data/mscoco_imgfeat
    unzip data/mscoco_imgfeat/val2014_obj36.zip -d data && rm data/mscoco_imgfeat/val2014_obj36.zip
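In case it helps, this is how I would expect to read one of those obj36 TSV files back into boxes, labels, and features; the column layout and dtypes follow the lxmert repo's utilities as far as I can tell, so treat them as assumptions:

```python
import base64
import csv
import sys

import numpy as np

csv.field_size_limit(sys.maxsize)

# Assumed column layout of the lxmert obj36 TSV files.
FIELDNAMES = ["img_id", "img_h", "img_w", "objects_id", "objects_conf",
              "attrs_id", "attrs_conf", "num_boxes", "boxes", "features"]

def load_obj_tsv(path, topk=None):
    """Return a list of dicts, one per image, with decoded boxes, labels, and features.
    The confidence columns are left as raw strings here for brevity."""
    data = []
    with open(path) as f:
        reader = csv.DictReader(f, FIELDNAMES, delimiter="\t")
        for i, item in enumerate(reader):
            n = int(item["num_boxes"])
            decode = [("objects_id", (n,), np.int64),
                      ("attrs_id", (n,), np.int64),
                      ("boxes", (n, 4), np.float32),
                      ("features", (n, -1), np.float32)]
            for key, shape, dtype in decode:
                item[key] = np.frombuffer(base64.b64decode(item[key]),
                                          dtype=dtype).reshape(shape)
            data.append(item)
            if topk is not None and i + 1 >= topk:
                break
    return data
```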

Dong-G12 commented 3 years ago


Hi, I've recently been working on PredCls as well (given gt_boxes and labels). I want to apply it to object relationship detection in a home environment, and some of the labels in the VG dataset are not accurate for that, so we built a small dataset of our own with annotated object categories and gt_boxes, and I'd like to use it for relationship detection. However, I don't yet know how to provide the input or how to modify the code (my coding background is fairly weak), so could you share your ideas? Thanks a lot! (PS: I just noticed we're from the same university.)

Einstone-rose commented 3 years ago


Hi, our tasks look quite similar; mine is also to predict the relationships given the gt_boxes and labels.

I modified the dataloader (i.e., how the input is given) recently and have taken a closer look at the code. I suggest using /maskrcnn_benchmark/data/datasets/visual_genome.py and /maskrcnn_benchmark/data/datasets/coco.py as references for writing your own dataloader. (BTW, I think /maskrcnn_benchmark/data/datasets/list_dataset.py is a simpler, more easily understandable dataloader skeleton that you can build on. /maskrcnn_benchmark/data/datasets/visual_genome.py is a bit more complicated because it loads many other things (relationship annotations, masks, etc.) that our task (bbox and class labels only) doesn't need; you can delete those parts directly.)

Specifically, you can look at the following (simplified here for readability; it is in visual_genome.py):

    def __getitem__(self, index):
        img = Image.open(self.filenames[index]).convert("RGB")
        target = self.get_groundtruth(index, False)
        if self.transforms is not None:
            img, target = self.transforms(img, target)
        return img, target, index

Here, the variable target should contain the class labels and is a BoxList (a class that encapsulates bboxes; see /maskrcnn_benchmark/structures/bounding_box.py for details). You can also step into self.get_groundtruth to see how target is constructed, roughly as follows:

    target = BoxList(box, (w, h), 'xyxy')  # xyxy
    target.add_field("labels", torch.from_numpy(self.gt_classes[index]))
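Putting the pieces together, a minimal, untested sketch of a PredCls-style dataset for custom gt_box/label annotations might look like the following (the class name CustomBoxDataset and the JSON annotation format are mine, not from the repo):

```python
import json

import torch
from PIL import Image
from maskrcnn_benchmark.structures.bounding_box import BoxList

class CustomBoxDataset(torch.utils.data.Dataset):
    """Hypothetical PredCls-style dataset: images plus your own gt boxes and labels."""

    def __init__(self, ann_file, transforms=None):
        # Assumed annotation format: a JSON list with one entry per image, e.g.
        # {"file": "img1.jpg", "width": 640, "height": 480,
        #  "boxes": [[x1, y1, x2, y2], ...], "labels": [3, 7, ...]}
        with open(ann_file) as f:
            self.anns = json.load(f)
        self.transforms = transforms

    def __len__(self):
        return len(self.anns)

    def __getitem__(self, index):
        ann = self.anns[index]
        img = Image.open(ann["file"]).convert("RGB")

        boxes = torch.as_tensor(ann["boxes"], dtype=torch.float32)
        target = BoxList(boxes, (ann["width"], ann["height"]), mode="xyxy")
        target.add_field("labels", torch.as_tensor(ann["labels"], dtype=torch.int64))

        if self.transforms is not None:
            img, target = self.transforms(img, target)
        return img, target, index

    def get_img_info(self, index):
        # maskrcnn_benchmark's samplers expect width/height for aspect-ratio grouping
        ann = self.anns[index]
        return {"width": ann["width"], "height": ann["height"]}
```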

That's all. I'll finish writing and testing the dataloader soon and then release the code. Haha, we're probably from the same lab (iLearn), aren't we? Feel free to get in touch! Good luck!

Dong-G12 commented 3 years ago


Thank you so much! This really helps a lot. Before, I had only looked at the overall framework of the code; next I'll study the dataloader-related code carefully. Your advice is really useful! Thanks again, and I hope we can keep in touch. If possible, I'd like to add you on WeChat or QQ. 1761520306@qq.com is my QQ email; if it's convenient, please send your contact info to that address. Looking forward to talking with you! (PS: Haha, I'm at the Qianfoshan campus in Jinan, in the School of Control Science and Engineering.)

Thinking-more commented 2 years ago


Hello, did you manage to run PredCls on custom images? If so, could you share the details? Thanks!