JosephKJ / OWOD

(CVPR 2021 Oral) Open World Object Detection
https://josephkj.in
Apache License 2.0

Fine-tuning questions and the dataset splits method? #123

Open HongdaChen opened 1 year ago

HongdaChen commented 1 year ago

Scripts

The following script, written with the help of ChatGPT, reads several *.txt image-set files and prints how many image IDs they all have in common:

def read_file(filename):
    """Return the set of image IDs (one per line) listed in `filename`."""
    with open(filename, 'r') as file:
        # Strip whitespace and skip blank lines so that an empty trailing
        # line is not counted as an image ID.
        result = {line.strip() for line in file if line.strip()}
    print(f"{filename} has {len(result)} images")
    return result

def find_intersection(files):
    """Return the image IDs that appear in every file in `files`."""
    if len(files) < 2:
        raise ValueError("At least two files are required for finding the intersection.")

    sets = [read_file(file) for file in files]
    intersection = set.intersection(*sets)
    print(f"intersection num of {files} is {len(intersection)}")
    return intersection

# Example usage: keep exactly one `files` line active and point it at your own split files.
# files = ['t1_train.txt', 't2_train.txt', 't3_train.txt', 't4_train.txt']
files = ['t2_ft.txt', 't3_ft.txt']
# files = ['t2_train.txt', 't2_ft.txt']
intersection = find_intersection(files)
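
The script above checks one group of files per run, so the `files` line has to be edited between runs. A minimal sketch of a variant that reports every pairwise overlap in one pass is given below. The helper names `load_ids` and `pairwise_overlaps` are hypothetical and not part of the OWOD repository; the sketch assumes each split file holds one image ID per line.

```python
from itertools import combinations
from pathlib import Path


def load_ids(path):
    """Return the set of non-empty, stripped lines in a split file."""
    text = Path(path).read_text()
    return {line.strip() for line in text.splitlines() if line.strip()}


def pairwise_overlaps(paths):
    """Map each unordered pair of paths to the size of their intersection."""
    ids = {str(p): load_ids(p) for p in paths}
    return {(a, b): len(ids[a] & ids[b]) for a, b in combinations(ids, 2)}


# Example call (the file names are the ones used in this issue; adjust the paths):
# for pair, n in pairwise_overlaps(['t1_train.txt', 't2_train.txt', 't2_ft.txt']).items():
#     print(pair, n)
```

Pairwise counts are more informative here than a single intersection over many files, because an empty intersection of four files does not by itself show that every pair is disjoint.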

Finding the dataset split method under the hood

root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t1_train.txt has 16551 images
t2_train.txt has 45520 images
t3_train.txt has 39402 images
t4_train.txt has 40260 images
intersection num of ['t1_train.txt', 't2_train.txt', 't3_train.txt', 't4_train.txt'] is 0
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t1_train.txt has 16551 images
t2_train.txt has 45520 images
t2_ft.txt has 1743 images
intersection num of ['t1_train.txt', 't2_train.txt', 't2_ft.txt'] is 0
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t2_train.txt has 45520 images
t2_ft.txt has 1743 images
intersection num of ['t2_train.txt', 't2_ft.txt'] is 1330
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t1_train.txt has 16551 images
t2_ft.txt has 1743 images
intersection num of ['t1_train.txt', 't2_ft.txt'] is 413
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t2_train.txt has 45520 images
t3_ft.txt has 2361 images
intersection num of ['t2_train.txt', 't3_ft.txt'] is 1402
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t1_train.txt has 16551 images
t3_ft.txt has 2361 images
intersection num of ['t1_train.txt', 't3_ft.txt'] is 374
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t3_train.txt has 39402 images
t3_ft.txt has 2361 images
intersection num of ['t3_train.txt', 't3_ft.txt'] is 938
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets# python find_intersection.py 
t2_ft.txt has 1743 images
t3_ft.txt has 2361 images
intersection num of ['t2_ft.txt', 't3_ft.txt'] is 107
root@46a2a355a17d:/owod_master/datasets/OWOD_imagesets#
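
The counts above can be cross-checked with a little arithmetic. The sketch below only re-adds numbers copied from the terminal output; the dictionary layout and the `overlap_sum` helper are illustrative, and the interpretation in the note that follows is a reading of the counts, not something confirmed by the OWOD authors.

```python
# Sizes of the fine-tuning lists, copied from the terminal output.
ft_size = {"t2_ft": 1743, "t3_ft": 2361}

# Reported overlaps between each train list and each fine-tuning list.
overlap = {
    ("t1_train", "t2_ft"): 413,
    ("t2_train", "t2_ft"): 1330,
    ("t1_train", "t3_ft"): 374,
    ("t2_train", "t3_ft"): 1402,
    ("t3_train", "t3_ft"): 938,
}


def overlap_sum(ft_name):
    """Add up the reported train-list overlaps for one fine-tuning list."""
    return sum(n for (_, ft), n in overlap.items() if ft == ft_name)


for name, size in ft_size.items():
    total = overlap_sum(name)
    print(f"{name}: size {size}, summed overlaps {total}, difference {total - size}")
# t2_ft: size 1743, summed overlaps 1743, difference 0
# t3_ft: size 2361, summed overlaps 2714, difference 353
```

For t2_ft the two overlaps add up to exactly the list size (413 + 1330 = 1743), and the three-way intersection of t1_train, t2_train and t2_ft is 0, so every t2_ft image comes from exactly one of t1_train or t2_train. For t3_ft the overlaps add up to 353 more than the list size, so some t3_ft images must appear in more than one of the train lists; the output above does not show which pairs of train lists share images.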