Open slamjie opened 5 years ago
@slamjie Were you able to train PoseNet on your own custom images?
@DRAhmadFaraz Yes, I have trained PoseNet on my own images successfully. It gives good results for me.
@slamjie Thanks a lot for replying. I also need to train PoseNet on my own collection of RGB images, so could you please guide me on how to train this code on custom RGB images?
I would be grateful.
@DRAhmadFaraz Sorry for the late reply. I have been busy with graduation matters recently.
I reproduced PoseNet by following the author's instructions.
The only difference is that the training parameters need to be re-adjusted according to the number of images.
Here is my solver_posenet.prototxt:
net: "./posenet/models/train.prototxt"
test_initialization: false
test_iter: 250
test_interval: 200
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 2500
display: 20
max_iter: 30000
solver_type: ADAGRAD
weight_decay: 0.005
snapshot: 2000
snapshot_prefix: "./posenet/models/training"
solver_mode: GPU
I got training_iter_30000.caffemodel, then used test_posenet.py to test PoseNet.
The result looks like this:
The accuracy is not as good as reported in the paper, but it meets my needs. I hope this is useful to you.
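For anyone comparing their own accuracy with the paper, here is a minimal sketch of how position and orientation errors between a predicted and a ground-truth pose are commonly computed. It assumes poses are stored as a position (x, y, z) plus a quaternion (w, p, q, r); the function names are hypothetical and are not taken from test_posenet.py.

```python
import math


def normalize(q):
    """Scale a quaternion (w, p, q, r) to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    return [c / norm for c in q]


def position_error(pred_xyz, true_xyz):
    """Euclidean distance between positions (same units as the labels)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pred_xyz, true_xyz)))


def orientation_error_deg(pred_q, true_q):
    """Angle in degrees between two orientations given as quaternions."""
    q1 = normalize(pred_q)
    q2 = normalize(true_q)
    # abs() treats q and -q as the same rotation; min() guards against
    # rounding slightly above 1.0 before acos.
    d = min(1.0, abs(sum(a * b for a, b in zip(q1, q2))))
    return 2.0 * math.degrees(math.acos(d))


def median(values):
    """Median of a non-empty list of numbers."""
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])
```

Taking the median of these two errors over all test images gives a summary that is easier to compare between runs than the raw training loss.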
@slamjie Thanks a lot for helping me; all of these steps are useful. I have one last question: where should I put the directory path of the input image dataset, and in which file?
Also, I did not see the author's instructions for custom images in this repository. If you have such instructions, please share the link with me.
I would be grateful. Regards
@DRAhmadFaraz To train PoseNet, I use creat_test_lmdb.py and create_train_lmdb_.py to generate the LMDB files.
import sys
sys.path.append('/home/jaco/caffe-posenet-master/python')
import numpy as np
import lmdb
import caffe
import random
import cv2
import argparse

directory = '/home/jaco/realsense/train/'
dataset = 'train.txt'

poses = []
images = []

# Each line of train.txt: image file name followed by 7 pose values
with open(directory+dataset) as f:
    for line in f:
        fname, p0,p1,p2,p3,p4,p5,p6 = line.split()
        p0 = float(p0)
        p1 = float(p1)
        p2 = float(p2)
        p3 = float(p3)
        p4 = float(p4)
        p5 = float(p5)
        p6 = float(p6)
        poses.append((p0,p1,p2,p3,p4,p5,p6))
        images.append(directory+fname)

# Shuffle the order in which images are written to the database
r = list(range(len(images)))
random.shuffle(r)

print('Creating PoseNet Dataset.')
env = lmdb.open('Train', map_size=int(1e12))

count = 0
for i in r:
    if (count+1) % 100 == 0:
        print('Saving image: %d' % (count+1))
    X = cv2.imread(images[i])
    X = cv2.resize(X, (455,256))    # to reproduce PoseNet results, please resize the images so that the shortest side is 256 pixels
    X = np.transpose(X,(2,0,1))     # HxWxC -> CxHxW, as Caffe expects
    im_dat = caffe.io.array_to_datum(np.array(X).astype(np.uint8))
    im_dat.float_data.extend(poses[i])  # store the pose as the label
    str_id = '{:0>10d}'.format(count)
    with env.begin(write=True) as txn:
        # encode() keeps the key valid on both Python 2 and Python 3
        txn.put(str_id.encode('ascii'), im_dat.SerializeToString())
    count = count+1

env.close()
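The script hardcodes (455,256), which is what a 1920x1080 image becomes when its shortest side is scaled to 256. For images of a different size, a small helper along these lines could compute the target size instead. This is only a sketch; `shortest_side_dims` is a hypothetical name, not part of the repository.

```python
def shortest_side_dims(width, height, target=256):
    """Return (new_width, new_height) with the shorter side equal to `target`.

    The aspect ratio is kept and the longer side is rounded to the nearest
    pixel. The result is in (width, height) order, which is the order
    cv2.resize expects for its size argument.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = float(target) / min(width, height)
    if width <= height:
        return target, int(round(height * scale))
    return int(round(width * scale)), target


print(shortest_side_dims(1920, 1080))  # (455, 256)
```

In the loop above it could be used as `X = cv2.resize(X, shortest_side_dims(X.shape[1], X.shape[0]))`, since `X.shape` is (height, width, channels).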
Then I use create_mean.sh to compute the mean files for the train and test datasets.
#!/usr/bin/env sh
set -e
cd /home/jaco/caffe-posenet-master/build/tools
DBTYPE=lmdb
echo "Computing image mean..."
./compute_image_mean -backend=$DBTYPE $1 $2
echo "Done."
I changed train_posenet.prototxt like this:
name: "GoogLeNet"
layers {
  top: "data"
  top: "label"
  name: "data"
  type: DATA
  data_param {
    **source: "posenet/scripts/Train/"**
    batch_size: 64
    backend: LMDB
  }
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: false
    crop_size: 224
    **mean_file: "posenet/scripts/train.binaryproto"**
  }
}
layers {
  top: "data"
  top: "label"
  name: "data"
  type: DATA
  data_param {
    **source: "posenet/scripts/Test/"**
    batch_size: 1
    backend: LMDB
  }
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 224
    **mean_file: "posenet/scripts/test.binaryproto"**
  }
}
The bold lines (wrapped in ** above) are the paths to the input LMDB datasets and their mean files. They simply tell the model where the LMDB files and mean files are; the asterisks are only emphasis and are not part of the actual file.
Finally, I use train.sh to train PoseNet and get the caffemodel files.
#!/usr/bin/env sh
TOOLS=./build/tools
$TOOLS/caffe train \
--solver=posenet/models/solver_posenet.prototxt
I used 1000 images for training and 250 images for testing.
I hope it's useful to you. :)
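As a rough illustration of how the solver values relate to the number of images, the sketch below converts iteration counts into passes over the data, using the numbers quoted in this thread (1000 training images, 250 test images, batch sizes 64 and 1). The function names are hypothetical; the learning-rate formula is the one Caffe documents for the "step" policy.

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Learning rate under the "step" policy:
    base_lr * gamma ^ floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


def solver_summary(num_train, num_test, train_batch, test_batch,
                   max_iter, test_iter, stepsize):
    """Relate solver iteration counts to passes over the data."""
    iters_per_epoch = float(num_train) / train_batch
    return {
        "iters_per_epoch": iters_per_epoch,
        "train_epochs": max_iter / iters_per_epoch,
        "test_passes": float(test_iter) * test_batch / num_test,
        "epochs_per_lr_step": stepsize / iters_per_epoch,
    }


summary = solver_summary(num_train=1000, num_test=250, train_batch=64,
                         test_batch=1, max_iter=30000, test_iter=250,
                         stepsize=2500)
for key in sorted(summary):
    print("%s: %g" % (key, summary[key]))
```

With these numbers, test_iter: 250 covers the 250 test images exactly once per test phase, and max_iter: 30000 corresponds to 1920 passes over the 1000 training images. For a dataset of a different size, the same arithmetic gives a starting point for test_iter, stepsize, and max_iter.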
@slamjie Thanks a lot, it is all clear to me now, and you are one of the most helpful people I have ever met. I have just one basic point of confusion, and I am sorry for such a basic question, but I am new to Caffe.
For my own custom images, all I have is a dataset of RGB images in one directory.
In creat_test_lmdb.py there is the line dataset = 'train.txt', so how can I get the train.txt file from my input image directory?
I hope that after that I will be able to train on my own images. Thanks a lot; I look forward to your reply.
@DRAhmadFaraz Each line of train.txt holds the six-degree-of-freedom pose for one of your own images: the image file name followed by seven pose values.
Can you guide me on how to extract a six-degree-of-freedom pose from an RGB image?
@DRAhmadFaraz The author mentions in the paper that he uses structure from motion (SfM) to get the images and their corresponding poses. In my case, I use a visual positioning device in my school lab, so when I capture images from the camera I also get the camera pose in real time.
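Whatever the pose source, the LMDB script above only needs a text file where each line is an image name followed by seven numbers, because it calls `line.split()` and expects exactly eight fields. A minimal sketch for writing such a file is shown below. It assumes each pose is available as a position (x, y, z) and a quaternion (w, p, q, r); the function names are hypothetical, and normalizing the quaternion is my own assumption rather than something the thread requires.

```python
import math


def pose_line(image_name, xyz, wpqr):
    """Format one line: image name followed by X Y Z W P Q R."""
    if len(xyz) != 3 or len(wpqr) != 4:
        raise ValueError("expected 3 position values and 4 quaternion values")
    if len(image_name.split()) != 1:
        # line.split() in the LMDB script would break on such names
        raise ValueError("image name must not contain whitespace")
    norm = math.sqrt(sum(c * c for c in wpqr))
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    values = list(xyz) + [c / norm for c in wpqr]  # assumption: unit quaternion
    return image_name + " " + " ".join("{:.6f}".format(v) for v in values)


def write_pose_file(path, records):
    """Write (image_name, xyz, wpqr) records to a train.txt-style file."""
    with open(path, "w") as f:
        for image_name, xyz, wpqr in records:
            f.write(pose_line(image_name, xyz, wpqr) + "\n")
```

For example, `write_pose_file("train.txt", [("frame-000000.png", (0.1, 0.2, 0.3), (1.0, 0.0, 0.0, 0.0))])` writes a single line that the LMDB script can parse.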
@slamjie Thanks a lot. I followed all these steps, but after running the ./train.sh command my training gets stuck here.
Is it a hardware issue or something else? I even reduced the step size to 1.
When I changed solver_mode: GPU to solver_mode: CPU in solver_posenet.prototxt, training started as shown.
And the testing results show..
But why can't I run this on my GPU? My GPU is an Nvidia GT 940 with 4GB.
@DRAhmadFaraz This may be caused by memory. My GPU is Nvidia 1080. You may need to adjust the parameters in the train_posenet.prototxt for your GPU.
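If memory is indeed the cause, one parameter commonly lowered is the training batch_size (64 in the prototxt above). That this resolves the problem on a particular GPU is an assumption, not something confirmed in this thread. A small sketch for changing the value in the prototxt text, with a hypothetical helper name:

```python
import re


def set_batch_size(prototxt_text, old_size, new_size):
    """Replace `batch_size: old_size` with `batch_size: new_size`.

    Only entries whose value equals old_size are changed, so the test
    layer (batch_size: 1) is left alone when old_size is 64.
    """
    pattern = r"(batch_size:\s*)%d\b" % old_size
    new_text, count = re.subn(
        pattern, lambda m: m.group(1) + str(new_size), prototxt_text)
    if count == 0:
        raise ValueError("no batch_size: %d entry found" % old_size)
    return new_text
```

A smaller batch means more iterations per pass over the data, so max_iter and stepsize in the solver may need to be scaled accordingly.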
@slamjie Thanks a lot, you have helped me a great deal. I have trained successfully on the "Cambridge Landmarks dataset". Now all I have to do is produce the train.txt with the six-degree-of-freedom poses corresponding to my own images.
Do you know of any tool that can recover 6-DOF poses, via Structure from Motion (SfM) or something similar, from a collection of RGB images, in a format like the one below?
Visual Landmark Dataset V1
ImageFile, Camera Position [X Y Z W P Q R]
/seq-01/frame-000000.color.png -0.123234 -1.120697 -0.988706 0.995174 -0.096421 0.016435 0.006233
/seq-01/frame-000001.color.png -0.136318 -1.122137 -0.988546 0.994928 -0.098053 0.020571 0.007535
/seq-01/frame-000002.color.png -0.136108 -1.122399 -0.989466 0.994811 -0.099181 0.020964 0.007190
/seq-01/frame-000003.color.png -0.136539 -1.121119 -0.989834 0.995035 -0.096976 0.020163 0.008368
..............................................
@DRAhmadFaraz Sorry about that. I have no idea about SFM. Maybe you can try some visual SLAM algorithms.
Dear @slamjie
I hope you are well. I am still working on PoseNet and have one important question that I hope you can help with.
I have successfully trained PoseNet on my own dataset. Now, for further calculations, I want to extract the predicted pose for each image.
Each predicted pose is a [7 x 1] vector: 3 translation values and 4 rotation quaternion values.
I checked posenet/scripts/test_posenet.py and extracted the predicted poses, but they are computed for only 1 iteration, whereas I need the predicted pose for every image.
Can you please help me sort this out? I would be grateful.
@DRAhmadFaraz You should change the inputs of train_posenet.prototxt like this:
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 224
  dim: 224
}
Then write a test script that feeds each image to the net. Caffe ships with some demos, so it is easy to find code showing how to send images to a net.
The predicted poses can be read like this:
predicted_q = net.blobs['cls3_fc_wpqr'].data
predicted_x = net.blobs['cls3_fc_xyz'].data
Hope this helps. :)
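An alternative that avoids a separate deploy file is to keep the LMDB-fed test net (batch_size: 1, as in the prototxt earlier in this thread) and call forward once per test image, copying the two output blobs each time. The sketch below assumes that setup; `collect_predictions` and `flatten` are hypothetical helpers, and only the blob names come from the lines above.

```python
def flatten(values):
    """Flatten nested sequences (lists or NumPy arrays) into a list of floats."""
    try:
        items = list(values)
    except TypeError:
        return [float(values)]  # a scalar
    out = []
    for item in items:
        out.extend(flatten(item))
    return out


def collect_predictions(net, num_images,
                        xyz_blob="cls3_fc_xyz", wpqr_blob="cls3_fc_wpqr"):
    """Run net.forward() once per image and return one 7-value pose per image.

    Each pose is [x, y, z, w, p, q, r]. The blob contents are copied into
    plain floats because the net reuses the same buffers on every forward.
    """
    poses = []
    for _ in range(num_images):
        net.forward()
        xyz = flatten(net.blobs[xyz_blob].data)
        wpqr = flatten(net.blobs[wpqr_blob].data)
        poses.append(xyz + wpqr)
    return poses


# Hypothetical usage with the Caffe Python bindings (paths are placeholders):
# import caffe
# net = caffe.Net("train_posenet.prototxt", "training_iter_30000.caffemodel",
#                 caffe.TEST)
# poses = collect_predictions(net, 250)
```

Note that the predictions come out in LMDB order. If the test LMDB was shuffled when it was created, as the training script above does, that order will not match the line order of the text file.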
Hello @DRAhmadFaraz, can you please tell me how you generated the train.txt file? Thanks.
Sorry to bother you. I want to use PoseNet on my robot indoors, but having collected 1000 images, I don't know how to set the parameters in solver_posenet.prototxt.
My prototxt looks like this:
but when I begin to train,
I1212 12:48:09.793629 8576 solver.cpp:237] Iteration 0, loss = 584.759
the loss begins at 584.759 and then stays at about 125. Does this mean PoseNet trained successfully? Could you share your solver_posenet.prototxt with me? I would appreciate any advice about possible mistakes I might be making.