wangsun1996 opened 1 year ago
I read the questions on the discussion board and used the SoCo code to get my custom dataset's filtered_proposals. The SoCo steps are as follows:
Prepare data with Selective Search:
- Generate Selective Search proposals: `python selective_search/generate_imagenet_ss_proposals.py`
- Filter out invalid proposals with a filter strategy: `python selective_search/filter_ss_proposals_json.py`
- Post-process images with no proposals: `python selective_search/filter_ss_proposals_json_post_no_prop.py`
However, I found that my train_ratio3size0008@0.5.json format is not the same as the COCO file you provided (Google Drive). How can I get the right format of train_ratio3size0008@0.5.json to use for AlignDet training? Could you provide this code? Where did I go wrong? Thank you very much!
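For reference, a rough sketch of what a per-image selective search step like the first script generally does is shown below. This is only an illustration of the idea, using OpenCV's selective search and a hypothetical generate_proposals helper; it is not the actual SoCo implementation, whose arguments and output layout may differ:

```python
# Illustrative sketch only, NOT the actual SoCo script. Assumes OpenCV's
# selective search (opencv-contrib-python) and a per-image .pkl output of
# [x, y, w, h] boxes, matching the format described later in this thread.
import os
import pickle

import cv2


def generate_proposals(image_path, out_dir, fast=True):
    """Run selective search on one image and save its boxes to <out_dir>/<stem>.pkl."""
    img = cv2.imread(image_path)
    ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
    ss.setBaseImage(img)
    if fast:
        ss.switchToSelectiveSearchFast()
    else:
        ss.switchToSelectiveSearchQuality()
    boxes = ss.process()  # N x 4 array of (x, y, w, h)

    stem = os.path.splitext(os.path.basename(image_path))[0]
    with open(os.path.join(out_dir, f"{stem}.pkl"), "wb") as f:
        pickle.dump(boxes, f)
    return boxes
```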
Thanks for your question. We employ the same procedure as SoCo to prepare the selective search proposals, so it is strange that the formats do not match. Could you provide your results for reference?
FYI, the format of our JSON file should be aligned with the official COCO dataset; please also check that.
Step 1: I first ran selective_search/generate_imagenet_ss_proposals.py, and each picture produced a .pkl file.

Step 2: Then I ran filter_ss_proposals_json.py with imagenet_root = the picture folder path, imagenet_root_proposals = the .pkl path from step 1, and filter_strategy = 'ratio3size0308'. This gave me train_ratio3size0008.json and train_ratio3size0008no_props_images.txt.

Step 3: Then I ran filter_ss_proposals_json_post_no_prop.py with json_path = the train_ratio3size0008.json from step 2, json_path_post = train_ratio3size0008post.json, no_props_images = the train_ratio3size0008no_props_images.txt from step 2, and imagenet_root_proposals = the .pkl path from step 1. When this step finished, I had train_ratio3size0008post.json.

But I found that the JSON format I got is not the same as yours. The JSON you provided contains entries such as:

{"images": [{"file_name": "000000176193.jpg", "width": 427, "height": 640, "id": 176193}, {"file_name": "000000019443.jpg", "width": 640, "height": 359, "id": 19443}, {"file_name": "000000304827.jpg", "width": 640, "height": 361, "id": 304827}, ...

whereas my format looks like:

{"00000008": [[0, 0, 799, 293], [304, 107, 104, 39], [248, 157, 163, 99], [0, 163, 69, 77], [517, 169, 282, 114], [389, 185, 110, 40], [480, 195, 47, 40], [407, 211, 64, 32], [542, 214, 57, 55], ...

Can you tell me what the problem is? Thank you very much!

Attached: train_ratio3size0008.json, train_ratio3size0008no_props_images.txt, train_ratio3size0008post.json
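As a quick sanity check between steps, one can load one of the .pkl files from step 1 and print its contents; the path below is a placeholder, and the pickled object is assumed to be a list or array of [x, y, w, h] boxes:

```python
# Sanity check on a single step-1 proposal file (path is a placeholder).
# The pickled object is assumed to be a list/array of [x, y, w, h] boxes.
import pickle

with open("proposals/00000008.pkl", "rb") as f:
    boxes = pickle.load(f)

print("number of proposals:", len(boxes))
print("first few boxes (x, y, w, h):", boxes[:3])
```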
Hi, I believe this mismatch is caused by the different dataset formats of ImageNet and COCO. The selective search implementation in SoCo is designed for the ImageNet dataset, while ours is designed for COCO.
JSON content like

{"images": [{"file_name": "000000176193.jpg", "width": 427, "height": 640, "id": 176193}, {"file_name": "000000019443.jpg", "width": 640, "height": 359, "id": 19443}, {"file_name": "000000304827.jpg", "width": 640, "height": 361, "id": 304827}, ...]}

is the original information in the COCO annotations. We only add the bounding boxes, e.g.,

{"00000008": [[0, 0, 799, 293], [304, 107, 104, 39], [248, 157, 163, 99], [0, 163, 69, 77], [517, 169, 282, 114], [389, 185, 110, 40], [480, 195, 47, 40], [407, 211, 64, 32], [542, 214, 57, 55]]}

into the original JSON file.
The best way is to integrate the selective search boxes into the original COCO JSON annotation files, just like our provided file.

Unfortunately, we did not upload the corresponding .py files when open-sourcing the code. Considering that I have left ByteDance, where this work was done, I may not be able to provide the corresponding code at this time, but the implementation should be easy: all you need to do is add the proposals from selective search into the original COCO annotations.
If you are not in a hurry, I can help you after the DDL for CVPR 2024. (Nov 17 '23 11:59 PM PST)
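Since the reply above boils the fix down to adding the selective search proposals into the original COCO annotations, a minimal sketch of that merging step might look like the following. Both the per-image key name ("proposals") and matching images by file-name stem are assumptions for illustration only; the actual field used in the provided train_ratio3size0008@0.5.json is not confirmed in this thread:

```python
# Minimal merging sketch, under two UNCONFIRMED assumptions:
#   1. the SoCo-style output maps image-name stems to [x, y, w, h] boxes,
#      e.g. {"00000008": [[0, 0, 799, 293], ...], ...};
#   2. the merged file stores them per image under a key named "proposals"
#      (check the provided annotation file for the real key name).
import json
import os


def merge_ss_into_coco(coco_json_path, ss_json_path, out_path):
    with open(coco_json_path) as f:
        coco = json.load(f)      # standard COCO dict: "images", "annotations", ...
    with open(ss_json_path) as f:
        ss_boxes = json.load(f)  # {"00000008": [[x, y, w, h], ...], ...}

    for img in coco["images"]:
        stem = os.path.splitext(img["file_name"])[0]
        # Attach the selective search boxes to the matching image entry.
        img["proposals"] = ss_boxes.get(stem, [])

    with open(out_path, "w") as f:
        json.dump(coco, f)


# Example usage (paths are placeholders):
# merge_ss_into_coco("instances_train2017.json",
#                    "train_ratio3size0008post.json",
#                    "coco_train_with_ss_proposals.json")
```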
Thank you for your answer! I looked at your COCO JSON, but couldn't find a property that represents the added selective search results. Would you mind telling me where you added the results (e.g., {"00000008": [[0, 0, 799, 293], [304, 107, 104, 39], [248, 157, 163, 99], [0, 163, 69, 77], [517, 169, 282, 114], [389, 185, 110, 40], [480, 195, 47, 40], [407, 211, 64, 32], [542, 214, 57, 55]]})?
I will first try to modify it according to your suggestions. In addition, if it is convenient, I hope you could provide the code after the DDL.
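In the meantime, a quick way to see where the extra boxes live in the provided file is to print its top-level keys and the keys of one entry, then compare them against a vanilla COCO annotation file; the path below is a placeholder for the file downloaded from Google Drive:

```python
# Print the structure of the provided JSON to locate the selective search field.
import json

with open("train_ratio3size0008@0.5.json") as f:  # placeholder path
    data = json.load(f)

print("top-level keys:", list(data.keys()))
if "images" in data and data["images"]:
    print("keys of one image entry:", list(data["images"][0].keys()))
if "annotations" in data and data["annotations"]:
    print("keys of one annotation entry:", list(data["annotations"][0].keys()))
```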