Questions about base classes annotation for LVIS

tztztztztz commented 4 years ago

Hi, I'm curious how you generated the base-class annotation files for LVIS, i.e. lvis_v0_5_train_freq and lvis_v0_5_train_common.

Here is what I implemented:

import json

# Output file for each LVIS frequency group: 'f' = frequent, 'c' = common, 'r' = rare.
save_names = {
    'f': 'lvis_v0.5_train_freq.json',
    'c': 'lvis_v0.5_train_common.json',
    'r': 'lvis_v0.5_train_rare.json'
}

with open('datasets/lvis/lvis_v0.5_train.json', 'r') as f:
    data = json.load(f)

anns = data['annotations']
cats = data['categories']

# Annotations grouped by the frequency of their category.
split_annos = {
    'f': [],
    'c': [],
    'r': []
}

for ann in anns:
    # Assumes categories are sorted by id and ids start at 1; the assert guards this.
    cat = cats[ann['category_id'] - 1]
    assert ann['category_id'] == cat['id']
    frequency = cat['frequency']
    split_annos[frequency].append(ann)

# Every split keeps the full image and category lists; only the annotations are filtered.
for name in save_names:
    new_data = {
        'info': data['info'],
        'licenses': data['licenses'],
        'categories': data['categories'],
        'images': data['images'],
        'annotations': split_annos[name],
    }
    print('name {}, instance num {}'.format(name, len(split_annos[name])))
    with open(save_names[name], 'w') as f:
        json.dump(new_data, f)

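Note that this implementation does not account for images that contain annotations of both frequent and common classes: every split keeps the full image list, and such an image contributes annotations to more than one split. If my guess at your procedure is wrong, could you upload the script you used? Thank you very much.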
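
For reference, the mixed-image case can be made explicit with a small sketch. This is not the repository's script, and it is an assumption that per-split image filtering is wanted at all. The helper names `split_by_frequency` and `shared_image_ids` are hypothetical. The sketch only relies on the LVIS fields already used above (`categories[].id`, `categories[].frequency`, `annotations[].category_id`) plus `annotations[].image_id` and `images[].id`.

```python
from collections import defaultdict


def split_by_frequency(data, filter_images=True):
    """Split an LVIS-style annotation dict into one dict per frequency group.

    Hypothetical helper for illustration, not the repository's script.
    """
    # Map category id -> frequency letter ('f', 'c' or 'r'), so the lookup
    # does not depend on the order of data['categories'].
    cat_freq = {cat['id']: cat['frequency'] for cat in data['categories']}

    split_anns = defaultdict(list)
    split_image_ids = defaultdict(set)
    for ann in data['annotations']:
        freq = cat_freq[ann['category_id']]
        split_anns[freq].append(ann)
        split_image_ids[freq].add(ann['image_id'])

    splits = {}
    for freq in ('f', 'c', 'r'):
        images = data['images']
        if filter_images:
            # Keep only images with at least one annotation in this split.
            images = [img for img in images if img['id'] in split_image_ids[freq]]
        split = dict(data)  # shallow copy keeps 'info', 'licenses', 'categories', ...
        split['images'] = images
        split['annotations'] = split_anns[freq]
        splits[freq] = split
    return splits


def shared_image_ids(splits, a='f', b='c'):
    """Return the ids of images that have annotations in both split a and split b."""
    ids_a = {ann['image_id'] for ann in splits[a]['annotations']}
    ids_b = {ann['image_id'] for ann in splits[b]['annotations']}
    return ids_a & ids_b
```

Calling `shared_image_ids(split_by_frequency(data))` on the loaded training file would list the images that carry both frequent and common annotations. With `filter_images=False` the sketch keeps the full image list in every split, as the code above does.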

thomasehuang commented 4 years ago

Hello, thank you for pointing this out. We forgot to include the script we used to split the data! I believe your implementation produces the same splits as ours, but I will include the file we used anyway.

I just added the file to the repository; you can find it at datasets/split_lvis_annotation.py.

tztztztztz commented 4 years ago

Thanks for your help.

ucbdrive / few-shot-object-detection

Questions about base classes annotation for LVIS #4