eric612 / BDD100k-toolkit

Convert bdd100k dataset to lmdb
MIT License

Issue converting to VOC format with coco2voc.py #1

Open jtcory opened 5 years ago

jtcory commented 5 years ago

I downloaded the BDD100K dataset and converted the labels to ms coco format using the git repo from here:

https://github.com/ucbdrive/bdd-data

I used the following command to convert the two .json files to ms coco format:

python3 -m bdd_data.bdd2coco -l bdd100k/labels -s bdd100k/detection_labels/

This produced two .json files in the detection_labels folder. Then I attempted to run the coco2voc.py using the following:

python coco2voc.py -l /data2/datasets/BDD100K/bdd100k/bdd-data-master/bdd100k/detection_labels/ -s /data2/datasets/BDD100K/bdd100k/bdd-data-master/bdd100k/

This produces the following output:

numbers of images: 2
Traceback (most recent call last):
  File "coco2voc.py", line 191, in <module>
    json2xml_citypersons(args.input_dir,args.save_path)
  File "coco2voc.py", line 66, in json2xml_citypersons
    node_width.text = str(data['images']['width'])
TypeError: list indices must be integers, not str

Could you please advise?
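
For readers hitting the same error: the traceback suggests that `data['images']` is a list (as in a standard merged COCO file) while line 66 of coco2voc.py indexes it like a single dict. A minimal sketch for checking which shape a loaded file has is shown below; `describe_images_field` is a hypothetical helper name, not part of the toolkit.

```python
import json


def describe_images_field(data):
    """Report whether the 'images' entry of a COCO-style dict is a list or a dict.

    Hypothetical helper for diagnosis only; not part of BDD100k-toolkit.
    """
    images = data["images"]
    if isinstance(images, list):
        # Merged COCO file: indexing with a string key raises TypeError.
        return "list of %d image records" % len(images)
    # Per-image file: data['images']['width'] would work.
    return "single image record"


def describe_file(json_path):
    """Load a JSON file from disk and describe its 'images' entry."""
    with open(json_path) as f:
        return describe_images_field(json.load(f))
```

If the result is a list, the input is a merged file rather than one annotation file per image, which would explain the `TypeError`.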

eric612 commented 5 years ago

The bdd2coco.py step should split the JSON files one by one, so the "numbers of images: 2" message you showed is very strange. It should be above 60k images.
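
As a rough illustration of that splitting step, the sketch below turns one merged COCO-style dict into one record per image. The per-image layout (with `images` holding a single dict) is an assumption inferred only from the traceback above; the exact keys coco2voc.py expects are not confirmed here, and both function names are hypothetical.

```python
import json
import os
from collections import defaultdict


def split_coco_per_image(coco):
    """Return {file_name: per-image record} from a merged COCO-style dict.

    Assumed layout: each record stores its image as a single dict under
    'images', so that record['images']['width'] is valid.
    """
    anns_by_image = defaultdict(list)
    for ann in coco.get("annotations", []):
        anns_by_image[ann["image_id"]].append(ann)

    per_image = {}
    for img in coco["images"]:
        per_image[img["file_name"]] = {
            "images": img,
            "annotations": anns_by_image[img["id"]],
            "categories": coco.get("categories", []),
        }
    return per_image


def write_per_image(per_image, out_dir):
    """Write each record to <out_dir>/<image stem>.json."""
    os.makedirs(out_dir, exist_ok=True)
    for file_name, record in per_image.items():
        stem = os.path.splitext(os.path.basename(file_name))[0]
        with open(os.path.join(out_dir, stem + ".json"), "w") as f:
            json.dump(record, f)
```

With a full train split, the output directory would then hold one JSON file per image rather than a single merged file.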

jtcory commented 5 years ago

Thanks Eric! Perhaps I'm using the wrong json file from bdd100k? There are only two json files under the labels directory (one for train, one for val). Should I perhaps be using a different set of json files?

Thanks!

stormvirux commented 5 years ago

@jtcory did you by any chance figure this out?

ghost commented 5 years ago

Hi Stormvirux, Unfortunately I did not. I gave up on attempting to use bdd100k for the task that I'm currently working on and just used PASCAL VOC.

eric612 commented 5 years ago

Sorry for the late reply. The step 2 output splits the JSON file into one annotation per image, just like the MS-COCO dataset. I have just uploaded the step 3 output (Pascal annotation XML files) here. Download it and save it in a folder laid out as below:

[two images showing the folder layout]

Then cd into the [$bdd100k] folder and run:

python create_list.py train train

dishita26 commented 5 years ago

@eric612 can you upload the step 3 output Pascal annotation XML files for the val dataset? You have uploaded them only for the train dataset.

eric612 commented 5 years ago

Sorry all, I forgot the split-annotation step, and I have just uploaded it.

@dishita26 the train and val list files will be generated at the split-annotation step.

dishita26 commented 5 years ago

@eric612 I have generated the text file, but I am not able to generate the lmdb files for the val dataset. I am not able to convert the JSON annotations of the val set to XML format.

eric612 commented 5 years ago

@dishita26 I uploaded val lmdb here

sainisanjay commented 3 years ago

@eric612 I was able to convert the annotations into Pascal annotation XML using the steps you mentioned. Could you help me ignore some of the labels? I want only 5 classes from the BDD dataset: car, person, bus, truck and rider.

eric612 commented 3 years ago

@sainisanjay Just change the label to background in the prototxt, and then regenerate the lmdb. For example, set the "traffic sign" label to zero.

sainisanjay commented 3 years ago

@eric612 I want to convert COCO to VOC, not lmdb.
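
One possible way to do that filtering at the COCO-to-VOC stage, independent of lmdb, is sketched below. It is not the toolkit's coco2voc.py: the function name, the `KEEP` set, the subset of VOC fields written, and the fixed image depth of 3 are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The five classes requested above; every other category is skipped.
KEEP = {"car", "person", "bus", "truck", "rider"}


def coco_image_to_voc(image, annotations, categories, keep=KEEP):
    """Build a VOC-style <annotation> element for one image.

    image:       COCO image dict (file_name, width, height)
    annotations: COCO annotation dicts belonging to that image
    categories:  COCO category dicts (id, name)
    """
    names = {c["id"]: c["name"] for c in categories}

    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image["file_name"]
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(image["width"])
    ET.SubElement(size, "height").text = str(image["height"])
    ET.SubElement(size, "depth").text = "3"  # assumption: RGB images

    for ann in annotations:
        name = names.get(ann["category_id"])
        if name not in keep:
            continue  # drop unwanted classes such as "traffic sign"
        x, y, w, h = ann["bbox"]  # COCO boxes: top-left x, y, width, height
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        ET.SubElement(box, "xmin").text = str(int(round(x)))
        ET.SubElement(box, "ymin").text = str(int(round(y)))
        ET.SubElement(box, "xmax").text = str(int(round(x + w)))
        ET.SubElement(box, "ymax").text = str(int(round(y + h)))
    return root
```

The returned element can be written to disk with `ET.ElementTree(root).write(path)`, producing one XML file per image.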