videturfortuna / vehicle_reid_itsc2023


veriwild directory structure #13

Closed jsteinberg-rbi closed 7 months ago

jsteinberg-rbi commented 8 months ago

VeRi-Wild comes as 23 rar files, as pictured below. Some rar files contain subdirectories of the same name. The CustomDataSet4VERIWILD extension of PyTorch's Dataset class takes a ROOT_DIR path, and it seems to expect all the JPGs to be under that path. I'm curious how you organized your veriwild dataset without overwriting directories with the same name, e.g. 05266, which occurs twice in the dataset:

find veri-wild/images/ -name '05266' -type d
veri-wild/images/images.part23/images/05266
veri-wild/images/images.part03/images/05266

I can see ways to change the csv files to match my directory structure, but I'm wondering if you have a more expeditious solution.
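
To see how widespread the collision is, the find check above can be generalized to every directory at once. This is only a minimal sketch: find_duplicate_dirs is a hypothetical helper, not part of the repo, and it assumes that vehicle ID directories have purely numeric names like 05266, so the repeated images/ folder inside each part is ignored.

```python
import os
from collections import defaultdict


def find_duplicate_dirs(root):
    """Map each numeric directory name that occurs more than once under root to its paths."""
    seen = defaultdict(list)
    for parent, dirnames, _ in os.walk(root):
        for name in dirnames:
            # ASSUMPTION: vehicle ID folders are all digits (e.g. "05266").
            if name.isdigit():
                seen[name].append(os.path.join(parent, name))
    # Keep only names found in more than one place.
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

Calling find_duplicate_dirs("veri-wild/images") on the layout shown above would be expected to list 05266 together with both of its parent parts.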

[Screenshot 2024-03-14 at 7 39 15 PM]

videturfortuna commented 7 months ago

First, download the rar files and extract them all. Extracting with unrar is enough; I believe something like this should work:

unrar x "*.*.rar"

You will obtain the dataset as it should be: a folder containing two subfolders, images/ and train_test_split/. Then point ROOT_DIR at the "/images/" folder and pass the .txt file associated with train or test, e.g. "/train_test_split/train_list.txt".
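
A quick sanity check after extraction is to confirm that every entry in the split file resolves to an image under ROOT_DIR. The sketch below is an illustration, not code from the repo: missing_images is a hypothetical helper, and it assumes that the first whitespace-separated field of each line in the list file is a path relative to images/, with or without the .jpg extension. Adjust it if your list files are laid out differently.

```python
from pathlib import Path


def missing_images(root_dir, list_file, suffix=".jpg"):
    """Return the list-file entries that have no matching file under root_dir."""
    root = Path(root_dir)
    missing = []
    with open(list_file) as fh:
        for line in fh:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            # ASSUMPTION: the first field is the relative image path.
            rel = fields[0]
            if not rel.endswith(suffix):
                rel += suffix
            if not (root / rel).is_file():
                missing.append(rel)
    return missing
```

An empty result from missing_images(ROOT_DIR, path_to_train_list) would suggest the extracted layout matches what the split file expects; a long list usually means ROOT_DIR points one level too high or too low.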

jsteinberg-rbi commented 7 months ago

Will prep a PR for this step, as I wasted a bunch of time on macOS fiddling with getting the dataset right, and this command is precisely what I needed to run!