lluisgomez opened this issue 7 years ago
Btw, please consider placing the file ssd_layers.py in code/layers/ rather than in code/models/, where it is right now.
We tried to implement a new network from scratch based on YOLO, but we did not have it ready in time for Wednesday. We did manage to make it work and learn during yesterday, so even though we are late we will put it in the repository to show the idea and the problems we had, because it was really time consuming.
Hi, implementing a new network, no matter how easy it looks, has been a really tough task. Our main problem was with the Deconvolution layer: it turns out that when K.dim_ordering is set to Theano but you are running on TensorFlow, the output shapes come out wrong. In the end, we used an upsampling layer instead. I have already pushed the new net and added the missing task (e) to the README.md file. Here is task (e). My apologies for the delay.
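For reference, a minimal sketch of the workaround, written against the Keras 1 functional API (which the K.dim_ordering usage suggests); layer parameters here are illustrative placeholders, not the exact ones in the pushed net:

```python
from keras.layers import Input, Convolution2D, UpSampling2D
from keras.models import Model

# Instead of a Deconvolution2D layer (whose output shape breaks when
# K.dim_ordering says Theano but the backend is TensorFlow), upsample with a
# parameter-free UpSampling2D and let a normal convolution do the learning.
inp = Input(shape=(320, 320, 3))                     # 'tf' ordering: (rows, cols, channels)
x = UpSampling2D(size=(2, 2))(inp)                   # (640, 640, 3), no learned weights
x = Convolution2D(16, 3, 3, border_mode='same',
                  activation='relu')(x)              # (640, 640, 16)
model = Model(input=inp, output=x)
model.summary()
```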
Task (e): Boost the performance of your network
One of the main problems with YOLO is that the net cannot find small objects in images. That is why we implemented a novel method based on Tiny-YOLO. We thought that by upscaling the input image we could detect small objects better. What we do is take the input image (input shape: (320, 320, 3)) and create two branches. The first branch does a convolution (output shape: (320, 320, 16)), as in Tiny-YOLO. The second one is our contribution: we upsample the image (output shape: (640, 640, 3)), do the convolution (output shape: (640, 640, 16)) and then a maxpool (output shape: (320, 320, 16)). We then merge these two branches. After this merge layer (output shape: (320, 320, 32)), the structure is the same as in Tiny-YOLO. We called the new net Tiny-YOLT, You Only Look Twice. By doing this we want to solve the problem of missed detections for small objects.

To run the experiment, config/Udacity_detection_YOLT.py is ready. We did not train the net, since it could take days or even a week, but what we have seen is that the net is learning. We built this net as a proof of concept. This work is based on the paper 'Locally Scale-Invariant Convolutional Neural Networks' (https://arxiv.org/pdf/1412.5104.pdf); we ended up with a much simpler version of it.
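The two-branch front end can be sketched like this (a minimal illustration in the Keras 1 functional API; filter counts and kernel sizes are placeholders taken from the shapes above, not necessarily the exact values in the repo, and config/Udacity_detection_YOLT.py remains the reference configuration):

```python
from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D, merge

# Input image, TensorFlow dim ordering: (rows, cols, channels)
img = Input(shape=(320, 320, 3))

# Branch 1: plain Tiny-YOLO first convolution on the original resolution
b1 = Convolution2D(16, 3, 3, border_mode='same', activation='relu')(img)   # (320, 320, 16)

# Branch 2: look at the image "twice", i.e. at double resolution ->
# upsample, convolve, then pool back to the original spatial size
b2 = UpSampling2D(size=(2, 2))(img)                                        # (640, 640, 3)
b2 = Convolution2D(16, 3, 3, border_mode='same', activation='relu')(b2)    # (640, 640, 16)
b2 = MaxPooling2D(pool_size=(2, 2))(b2)                                    # (320, 320, 16)

# Merge both views; from here on the architecture follows Tiny-YOLO
x = merge([b1, b2], mode='concat', concat_axis=3)                          # (320, 320, 32)
```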
I've been looking at your deliverable for weeks 3/4 and I think you have done a very good job. I particularly like the way you present things in the README.md file. It is neat and clear.
The Overleaf article is well written. However, I miss some implementation details in the paper, so that the results can be contextualized and reproduced: e.g. do you train from scratch or fine-tune the network? Which optimizer do you use? How many epochs? Which base_lr? etc. You actually have all this information in the README file, so it is just a matter of summarizing it in the article.
In Tables 3 and 4 of the article I expect to see the final precision/recall and f-score results on the test sets. The avg. recall and avg. IoU metrics of the YOLO implementation are auxiliary metrics for checking that the model is learning correctly while training, but they are not really meaningful for comparing the different methods. The same can be said for slide 11 of your presentation. Also on that slide, I think there is a typo in the second row: should it be "Udacity YOLO" instead of "TT100K YOLO"? Please add the SSD results to the presentation slides.
You should consider adding some images with qualitative results, both in the paper and in the slides; I think they would help you explain the obtained results.
I think you have done very good work for these two weeks' assignment. Although I cannot see a solution to task (e), I acknowledge the extra work you have done on the SSD model integration, such as trying to implement the f-score as a Keras metric and fixing the prediction functionality for detection models. Thus I expect your mark will be close to the maximum.
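As an aside on why the f-score metric is non-trivial: a generic per-batch F1 can be written with Keras backend ops as below, but a proper detection F1 also needs box decoding and IoU matching, which is what makes it hard to express as a standard Keras metric. This is only an illustrative sketch, not the team's implementation:

```python
from keras import backend as K

def f1_score(y_true, y_pred):
    """Per-batch F1 from thresholded predictions, using only backend ops.
    Illustrative only: real detection F1 requires decoding boxes and
    matching them to ground truth by IoU before counting TP/FP/FN."""
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    tp = K.sum(y_true * y_pred_pos)
    precision = tp / (K.sum(y_pred_pos) + K.epsilon())
    recall = tp / (K.sum(y_true) + K.epsilon())
    return 2 * precision * recall / (precision + recall + K.epsilon())

# Usage: model.compile(optimizer='adam', loss=..., metrics=[f1_score])
```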