keras-team / keras-contrib

Keras community contributions
MIT License

additional datasets - coco + voc2012 #47

Open ahundt opened 7 years ago

ahundt commented 7 years ago

I'm working on some code that loads MS COCO and Pascal VOC for segmentation, but it currently involves steps that depend on several other Python libraries, including some TF-only steps. It could be used to create examples for the DenseNetFCN and #46 segmentation models.

I'm creating an example that trains these keras-contrib models, subject to the dependency limitations mentioned above, and I can adapt it to the keras-contrib code structure. However, this brings up the following questions:

If it makes the most sense for me to just keep everything separate, that's okay.

patyork commented 7 years ago

What are the external libraries (and what are they used for)? Also, what are the steps that can only be done in TF?

ahundt commented 7 years ago

I'm sure they can be done with other tools; they are currently implemented with TF and would need to be rewritten to work with numpy arrays. Actually, it could probably be done fairly easily without TF. I've been using it a bit more since the previous post.

Pascal VOC 2012 + Berkeley additional labeling

Extra for coco:

ahundt commented 7 years ago

I've made useful progress towards Pascal VOC and MS COCO in other repositories, which I plan to merge into keras-contrib when ready. Details are in https://github.com/ahundt/Keras-FCN/tree/densenet_atrous

One issue is that more complex datasets tend to need several manual utilities, and external dependencies may be required.

How can the manual utility + dependency needs of large datasets be reconciled with the need to avoid dependencies when possible with Keras?

How should a single dataset dependency, like pycocotools, be handled?
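One common way to handle a single optional dependency is to defer the import until the dataset is actually requested, so the rest of the package stays importable without it. The following is only a minimal sketch of that pattern, not what keras-contrib does: `require_optional` and `load_coco_annotations` are hypothetical names, and the only external call assumed is the `COCO` class in `pycocotools.coco`.

```python
import importlib


def require_optional(module_name, purpose, install_hint=None):
    """Import an optional dependency on first use, or explain how to get it.

    Hypothetical helper for illustration; not part of Keras or keras-contrib.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        hint = install_hint or "pip install " + module_name
        raise ImportError(
            "{} is required to {}. Install it with `{}`.".format(
                module_name, purpose, hint)
        ) from exc


def load_coco_annotations(annotation_file):
    """Hypothetical loader: pycocotools is imported only when this is called."""
    coco_module = require_optional(
        "pycocotools.coco",
        "load MS COCO annotations",
        install_hint="pip install pycocotools",
    )
    # COCO(annotation_file) parses the annotation JSON and builds its indexes.
    return coco_module.COCO(annotation_file)
```

With this layout, importing the dataset module never fails because pycocotools is missing; only a call to `load_coco_annotations` does, and the error message says what to install.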

ahundt commented 7 years ago

#80 adds pascal_voc support

ahundt commented 7 years ago

#81 adds MS COCO support

ahundt commented 7 years ago

Some improvements to #81 should be made based on https://github.com/PavlosMelissinos/enet-keras/blob/master/src/data/datasets.py

tarvaina commented 7 years ago

The Pascal VOC installation failed on Python 3. I submitted pull request #114 to fix the problem.

ahundt commented 7 years ago

Thanks!

ahundt commented 7 years ago

I'm also working on extending it to run via TFRecords, but it may be a while before the PR is ready.

ahundt commented 7 years ago

If you're interested in incorporating it, the TFRecord code that would need to be manually moved into keras-contrib is at https://github.com/warmspringwinds/tf-image-segmentation/pull/25, and https://github.com/fchollet/keras/pull/6928 adds support to Keras.