Closed FabianBell closed 4 years ago
It seems that PoseNet uses a simple ResNet. Since they are aiming to run it on mobile devices, I assume they chose a lightweight one. I am not quite sure about the offset vectors that PoseNet returns. I will try to figure that out.
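On the offset vectors: going by the public TensorFlow.js PoseNet description, each keypoint has a coarse heatmap plus a pair of offset maps, and the final position is the heatmap cell index times the output stride plus the offset stored at that cell. The following is only a minimal sketch of that decoding for a single keypoint. The function name, the nested-list layout and the example numbers are assumptions made for illustration, not the actual PoseNet code.

```python
from typing import List, Tuple

Grid = List[List[float]]


def decode_keypoint(
    heatmap: Grid, offsets_y: Grid, offsets_x: Grid, output_stride: int
) -> Tuple[float, float, float]:
    """Return (y, x, score) in input-image pixels for one keypoint.

    Assumed layout: all three grids have the same height and width.
    """
    # Pick the heatmap cell with the highest score.
    cells = [(r, c) for r in range(len(heatmap)) for c in range(len(heatmap[0]))]
    row, col = max(cells, key=lambda rc: heatmap[rc[0]][rc[1]])

    # Coarse cell position scaled to pixels, refined by the offset vector.
    y = row * output_stride + offsets_y[row][col]
    x = col * output_stride + offsets_x[row][col]
    return y, x, heatmap[row][col]


# Hypothetical 2x2 output for one keypoint with an output stride of 16.
heatmap = [[0.1, 0.2], [0.9, 0.3]]
offsets_y = [[0.0, 0.0], [3.5, 0.0]]
offsets_x = [[0.0, 0.0], [-2.0, 0.0]]
keypoint = decode_keypoint(heatmap, offsets_y, offsets_x, output_stride=16)
print(keypoint)
```

The heatmap alone only localizes a keypoint up to the stride, so the offsets are what recover sub-cell precision.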
Here is another approach for pose estimation in a drone environment. I am planning to check that.
I talked with Pavel about the relighting robustness of the NN. He suggested using data augmentation for this as well. We could also use neural relighting methods as a preprocessing step. Pavel is going to present more information about that on Wednesday in the large-scale machine learning course. I am going to add the best model for relighting after the meeting.
I had a talk with Henrik on Tuesday after the kickoff and came up with some potential approaches for correcting the distance estimation errors. I will talk about it today at the regular meeting.
Most of the promising papers on 3D pose estimation use multiple views to generate the 3D keypoints. This is not feasible in our situation.
After some research, I think I will try this one: https://arxiv.org/pdf/1712.06316v4.pdf. It uses an LSTM-based approach. I like the idea of keeping information from previous images to produce a more robust result. They copy parts of the structure from https://arxiv.org/pdf/1602.00134.pdf.
I have finished the implementation of the model. In the next step I will build the dataloader and the training procedure.
The input pipeline is ready :) Next step: loss + training procedure.
For training, the CPM layers have to be trained beforehand. Therefore I am focusing on extracting the weights from the Caffe model.
I am able to extract all convolutional layers. Those are the only layers that have parameters. I am currently trying to figure out how to match them.
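One possible way to approach the matching is sketched below: pair the layers in definition order and check that each Caffe shape agrees with the target shape once the axis order is converted. Caffe stores convolution weights as (out_channels, in_channels, kernel_h, kernel_w), while TensorFlow convolution kernels are (kernel_h, kernel_w, in_channels, out_channels). The layer names and shapes in the example are hypothetical, and the helper names are not taken from the implementation in this issue.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]
Layer = Tuple[str, Shape]


def caffe_to_tf_conv_shape(shape: Shape) -> Shape:
    """Convert (out, in, kh, kw) to (kh, kw, in, out).

    The same permutation, axes (2, 3, 1, 0), applies to the weight array.
    """
    out_c, in_c, kh, kw = shape
    return (kh, kw, in_c, out_c)


def match_layers(caffe_layers: List[Layer], tf_layers: List[Layer]) -> Dict[str, str]:
    """Pair layers in definition order and verify that the shapes agree."""
    if len(caffe_layers) != len(tf_layers):
        raise ValueError("layer counts differ")
    mapping: Dict[str, str] = {}
    for (c_name, c_shape), (t_name, t_shape) in zip(caffe_layers, tf_layers):
        if caffe_to_tf_conv_shape(c_shape) != t_shape:
            raise ValueError(f"shape mismatch: {c_name} -> {t_name}")
        mapping[c_name] = t_name
    return mapping


# Hypothetical layer names and shapes, for illustration only.
caffe_layers = [("conv1_stage1", (128, 3, 9, 9)), ("conv2_stage1", (128, 128, 9, 9))]
tf_layers = [("cpm/conv1/kernel", (9, 9, 3, 128)), ("cpm/conv2/kernel", (9, 9, 128, 128))]
mapping = match_layers(caffe_layers, tf_layers)
print(mapping)
```

Matching by order only works if both models define their layers in the same sequence. The shape check catches the most obvious mismatches but cannot tell apart two layers with identical shapes.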
I have the following problems:
UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
I do not know what causes it, since the error trace is obscured by TensorFlow's graph abstraction. I spent several hours on that and I do not know how to proceed.
I tried to rewrite the complete input pipeline as a generator. However, the dataloader stops after one iteration. After 3 hours of debugging I gave up. If nobody wants to take over this issue, I will close it after this meeting and mark this approach as failed.
ooooooooooooooooooookay :( TensorFlow does not want a generator but a callable that returns a generator. I do not like TensorFlow.
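The difference matters because a generator object can only be consumed once, whereas a callable can be invoked again for every pass over the data. `tf.data.Dataset.from_generator` expects the callable for that reason. The sketch below shows the behaviour in plain Python, without TensorFlow. `make_samples` and `run_epochs` are hypothetical names used only for illustration.

```python
from typing import Callable, Iterator, List, Tuple

Sample = Tuple[int, int]  # hypothetical (frame_id, label) pair


def make_samples() -> Iterator[Sample]:
    """Generator function: every call returns a fresh generator."""
    for frame_id in range(3):
        yield frame_id, frame_id * 10


def run_epochs(source: Callable[[], Iterator[Sample]], epochs: int) -> List[List[Sample]]:
    """Call `source` once per epoch, the way a data pipeline would."""
    return [list(source()) for _ in range(epochs)]


# Passing the generator object: it is exhausted after the first pass.
gen = make_samples()
exhausted = [list(gen), list(gen)]

# Passing the callable: each epoch gets a fresh generator.
fresh = run_epochs(make_samples, epochs=2)

print(exhausted)
print(fresh)
```

The second pass over `gen` yields nothing, which would look like a dataloader that stops after one iteration.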
I am not able to do that with TensorFlow. I am closing the issue since no one wants to take it over.
We want to create a good Python adaptation of the PoseNet model in order to fine-tune it for our purposes. It turned out that the performance drops significantly under certain lighting conditions and when the image is rotated. We are planning to apply transfer learning together with suitable data augmentations for fine-tuning.
Implementation: here
Tasks
Criteria