Closed DanielCoelho112 closed 2 years ago
Hi @DanielCoelho112 ,
It sounds like a very good idea. But just so I understand completely, tell me if I got it right:
Is that it? I think I just repeated what you said, but now I am even more convinced it is a great idea.
One more big advantage of using Synfeal.
1. What people do is use the first part of the network with weights transferred from pre-trained Google models;
Exactly.
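The setup described in point 1 can be sketched as follows. This is a minimal NumPy illustration, not the actual network: a frozen random matrix stands in for the pretrained backbone (in practice, the convolutional part of GoogLeNet), and only a new pose-regression head is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "pretrained backbone": a frozen linear feature extractor.
# In the real setting this would be the convolutional part of GoogLeNet.
W_backbone = rng.standard_normal((128, 64))  # transferred weights, kept frozen

# New pose-regression head, trained from scratch
# (e.g., x, y, z + quaternion = 7 outputs, as in PoseNet-style regression).
W_head = np.zeros((64, 7))

def forward(x):
    features = np.maximum(x @ W_backbone, 0.0)  # frozen features (ReLU)
    return features @ W_head                    # trainable head

# One gradient step updates ONLY the head; the backbone is never touched.
x = rng.standard_normal((16, 128))
target = rng.standard_normal((16, 7))
pred = forward(x)
features = np.maximum(x @ W_backbone, 0.0)
grad_head = features.T @ (pred - target) / len(x)
W_head -= 0.01 * grad_head
```

The point of the sketch is the asymmetry: the backbone weights come from another task and dataset, while only the small head ever sees the localization data.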
2. The reason for that is that they cannot get trained models dedicated to localization, because that would require vast amounts of data. But what is the argument for not wanting these pre-trained models? Is it that they were focused on object detection, and we assume the weights for localization would be different?
Actually, I can think of a couple of reasons. The main one is the one you raised: intuitively, a network trained from scratch on one task should be better than a network trained on another task and then retrained on it. Then, we also have the problem of flexibility. Imagine I want to use RGBD images as inputs: how can I use the pretrained weights of GoogLeNet, which expect 3-channel RGB? Or imagine that I build a new network architecture and want to use it for localization...
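The flexibility problem with RGBD inputs comes down to a shape mismatch in the first layer: filters pretrained on 3-channel RGB simply have no weights for a depth channel. A small sketch (filter counts are illustrative, not GoogLeNet's exact configuration):

```python
import numpy as np

# First-layer filters of an RGB-pretrained network:
# shape (out_channels, in_channels=3, kernel_h, kernel_w).
w_rgb = np.zeros((64, 3, 7, 7))

rgb_image = np.zeros((3, 224, 224))   # 3 input channels: matches
rgbd_image = np.zeros((4, 224, 224))  # RGB + depth: 4 channels

def channels_match(weights, image):
    """Pretrained weights only apply if input channels line up."""
    return weights.shape[1] == image.shape[0]

print(channels_match(w_rgb, rgb_image))   # True: pretrained weights transfer
print(channels_match(w_rgb, rgbd_image))  # False: no pretrained weights for depth
```

Workarounds exist (discarding depth, or initializing the extra channel from scratch), but either way the depth channel gets no benefit from the pretrained weights, which supports the argument for training fully from scratch.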
3. Using "large enough" datasets allows training the model from scratch, and you propose to do this. For that you would need a dataset with "a lot more images than 8000" ...
Exactly.
I agree this is a very good point. My main concern is where the boundary of the contributions we want to include in this paper lies... Just the idea of Synfeal for generating new datasets is already very good; do we want to add another contribution (training from scratch) to this paper? I think it is a good idea and gives more strength to the paper/idea, but we have to define clearly which contributions are for this paper and which we should leave for the next publications.
I would say that for this paper we should focus on the new data collection framework. This framework allows us to create datasets that are more accurate and larger than those produced by current approaches. So I think it makes sense to include this comparison as well.
My view on papers is that it is better to have one very good paper than five above-average ones. So I am always in favor of including more material in the paper, provided it is clearly related, as is the case here.
Hi @miguelriemoliveira and @pmdjdias,
According to this paper, the reason why everyone in localization uses pretrained models from ImageNet is the difficulty of acquiring large datasets. They say the following:
Training a neural network from scratch for the task of pose regression would be impractical for several reasons: we would need a really large training set.
Well, with Synfeal this is no longer a problem. What do you say if we add another section to the paper comparing the results of a "normal" dataset with pretrained models vs. a "very large" dataset without pretrained models?
We would have to define what "very large" means, but I think it could be good to demonstrate that good training data, in terms of quantity and quality, is no longer a problem with Synfeal.
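The proposed comparison could be organized as two experimental configurations, sketched here as a config fragment. All names are placeholders, and the "very large" size is deliberately left undefined, as it is still to be decided:

```python
# Hypothetical configurations for the proposed comparison section.
# Dataset sizes and names are placeholders, not decided values.
experiments = [
    {
        "name": "baseline",
        "dataset_size": 8000,          # a "normal" dataset, per the thread
        "pretrained_backbone": True,   # transferred ImageNet weights
    },
    {
        "name": "synfeal_from_scratch",
        "dataset_size": None,          # "very large": still to be defined
        "pretrained_backbone": False,  # trained fully from scratch
    },
]
```

Keeping everything else (architecture, optimizer, evaluation protocol) identical between the two rows is what would let the section attribute any difference to data quantity rather than to the pretrained weights.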