geoffzhang commented 5 years ago

Hi, I am interested in this research and I have a question. DeepPrior++ can run at about 30 fps on a GTX 980 Ti, but it is hard to run in real time on a mobile device. Did you try MobileNet or another lightweight network?

moberweger commented 5 years ago

I never tried it on a mobile device, but I am confident that it can run in real time on modern devices such as the iPhone X. If you are concerned about performance, you can of course try MobileNet or a similar architecture by simply retraining the network.

geoffzhang commented 5 years ago

@moberweger Thanks for your reply! I want to try it, but I got an error when I ran test_realtimepipeline.py.

I used the ICVL dataset. The modified part is as follows:

#di=MSRA15Importer('/home/geoff/workspace/github/HandPointnet/data/cvpr15_MSRAHandGestureD/')
#Seq2 = di.loadSequence('P0')
#testSeqs = [Seq2]

di = ICVLImporter('/home/geoff/workspace/datesets/ICVL/')
Seq2 = di.loadSequence('test_seq_1')
testSeqs = [Seq2]

#di = NYUImporter('../data/NYU/')
#Seq2 = di.loadSequence('test_1')
#testSeqs = [Seq2]

# load trained network
poseNetParams = ResNetParams(type=1, nChan=1, wIn=128, hIn=128, batchSize=1, numJoints=16, nDims=3)
poseNetParams.loadFile = "./eval/ICVL_network_prior.pkl"
comrefNetParams = ScaleNetParams(type=1, nChan=1, wIn=128, hIn=128, batchSize=1, resizeFactor=2, numJoints=16, nDims=3)
comrefNetParams.loadFile = "./eval/net_ICVL_COM_AUGMENT.pkl"
#config = {'fx': 588., 'fy': 587., 'cube': (300, 300, 300)}
#config = {'fx': 241.42, 'fy': 241.42, 'cube': (250, 250, 250)}
config = {'fx': 224.5, 'fy': 230.5, 'cube': (300, 300, 300)}  # Creative Gesture Camera

The error is as follows:

geoff@geoff-Veriton-D630:~/workspace/github/deep-prior-pp/src$ python test_realtimepipeline.py
Loading cache data from ./cache//ICVLImporter_test_seq_1_None_gt_250_cache.pkl
Create producer process...
Create consumer process...
/home/geoff/workspace/github/deep-prior-pp/src/net/convpoollayer.py:261: UserWarning: DEPRECATION: the 'ds' parameter is not going to exist anymore as it is going to be replaced by the parameter 'ws'.
  pooled_out = pool_2d(input=conv_out, ds=poolsize, ignore_border=True, mode='max')
/home/geoff/workspace/github/deep-prior-pp/src/net/convpoollayer.py:261: UserWarning: DEPRECATION: the 'ds' parameter is not going to exist anymore as it is going to be replaced by the parameter 'ws'.
  pooled_out = pool_2d(input=conv_out, ds=poolsize, ignore_border=True, mode='max')
Loading model parameters from ./eval/ICVL_network_prior.pkl
Possibly not matching network configuration!
Differences are:
Network configuration:

Layer 0: ConvPoolLayer with inputDim (64, 1, 128, 128), outputDim (64, 32, 64, 64), filterDim (5, 5), nFilters 32, activation None, stride (1, 1), border_mode same, hasBias True, pool_type 0, pool_size (2, 2)
Layer 0: ConvPoolLayer with inputDim (1, 1, 128, 128), outputDim (1, 32, 64, 64), filterDim (5, 5), nFilters 32, activation None, stride (1, 1), border_mode half, hasBias True, pool_type 0, pool_size (2, 2)
Layer 1: BatchNormLayer with epsilon 0.0001, alpha 0.1
Layer 2: NonlinearityLayer with inputDim (64, 32, 64, 64), outputDim (64, 32, 64, 64), activation ReLU
Layer 2: NonlinearityLayer with inputDim (1, 32, 64, 64), outputDim (1, 32, 64, 64), activation ReLU
Layer 3: ConvLayer with inputDim (64, 32, 64, 64), outputDim (64, 16, 32, 32), filterDim (1, 1), nFilters 16, activation None, stride (2, 2), border_mode same, hasBias True
Layer 3: ConvLayer with inputDim (1, 32, 64, 64), outputDim (1, 16, 32, 32), filterDim (1, 1), nFilters 16, activation None, stride (2, 2), border_mode half, hasBias True
Layer 4: BatchNormLayer with epsilon 0.0001, alpha 0.1
Layer 5: NonlinearityLayer with inputDim (64, 16, 32, 32), outputDim (64, 16, 32, 32), activation ReLU
Layer 5: NonlinearityLayer with inputDim (1, 16, 32, 32), outputDim (1, 16, 32, 32), activation ReLU
........
Layer 187: DropoutLayer with inputDim (64, 1024), outputDim (64, 1024), p 0.3, mode fc, mask rng
Layer 188: HiddenLayer with inputDim (64, 1024), outputDim (64, 1024), activiation ReLU, hasBias True
Layer 187: HiddenLayer with inputDim (1, 1024), outputDim (1, 1024), activiation ReLU, hasBias True
Layer 189: DropoutLayer with inputDim (64, 1024), outputDim (64, 1024), p 0.3, mode fc, mask rng
Layer 190: HiddenLayer with inputDim (64, 1024), outputDim (64, 30), activiation None, hasBias True
Layer 188: HiddenLayer with inputDim (1, 1024), outputDim (1, 30), activiation None, hasBias True
Layer 191: HiddenLayer with inputDim (64, 30), outputDim (64, 48), activiation None, hasBias True
Layer 189: HiddenLayer with inputDim (1, 30), outputDim (1, 48), activiation None, hasBias True

Warning: Layer parameters for layer 187 do not match. Trying to fit on shape!
Process Process-3:
Traceback (most recent call last):
  File "/home/geoff/anaconda2/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/home/geoff/anaconda2/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/geoff/workspace/github/deep-prior-pp/src/util/realtimehandposepipeline.py", line 181, in threadConsumer
    self.initNets()
  File "/home/geoff/workspace/github/deep-prior-pp/src/util/realtimehandposepipeline.py", line 121, in initNets
    self.poseNet = ResNet(numpy.random.RandomState(23455), cfgParams=self.poseNet)
  File "/home/geoff/workspace/github/deep-prior-pp/src/net/resnet.py", line 340, in __init__
    self.load(self.cfgParams.loadFile)
  File "/home/geoff/workspace/github/deep-prior-pp/src/net/netbase.py", line 463, in load
    raise ImportError("Could not load all necessary variables!")
ImportError: Could not load all necessary variables!
Warning: Layer parameters for layer 187 do not match. Trying to fit on shape!
Process Process-2:
Traceback (most recent call last):
  File "/home/geoff/anaconda2/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/home/geoff/anaconda2/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/geoff/workspace/github/deep-prior-pp/src/util/realtimehandposepipeline.py", line 141, in threadProducer
    self.initNets()
  File "/home/geoff/workspace/github/deep-prior-pp/src/util/realtimehandposepipeline.py", line 121, in initNets
    self.poseNet = ResNet(numpy.random.RandomState(23455), cfgParams=self.poseNet)
  File "/home/geoff/workspace/github/deep-prior-pp/src/net/resnet.py", line 340, in __init__
    self.load(self.cfgParams.loadFile)
  File "/home/geoff/workspace/github/deep-prior-pp/src/net/netbase.py", line 463, in load
    raise ImportError("Could not load all necessary variables!")
ImportError: Could not load all necessary variables!

How can I solve this issue, please?

moberweger commented 5 years ago

Hi @geoffzhang, there are two problems in the code you provided:

Use type=4: ResNetParams(type=4, nChan=1, wIn=128, hIn=128, batchSize=1, numJoints=16, nDims=3)
Change to numJoints=1: comrefNetParams = ScaleNetParams(type=1, nChan=1, wIn=128, hIn=128, batchSize=1, resizeFactor=2, numJoints=1, nDims=3)
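
For readers following along, here is a quick way to sanity-check the numJoints setting. It is a minimal sketch that assumes the final regression layer outputs numJoints * nDims values, which is consistent with the log above (numJoints=16 and nDims=3 give outputDim (1, 48)). The helper name expected_output_dim is hypothetical and not part of deep-prior-pp, and the reading that the COM refinement network regresses a single 3D point is an assumption based on numJoints=1.

```python
def expected_output_dim(num_joints, n_dims=3):
    """Size of the final regression output, assuming one n_dims vector per joint.

    Hypothetical helper for illustration only; not part of deep-prior-pp.
    """
    return num_joints * n_dims


# Pose network: 16 joints in 3D, matching outputDim (1, 48) in the log above.
pose_out = expected_output_dim(16, 3)

# COM refinement network with numJoints=1: assumed to predict one 3D point.
com_out = expected_output_dim(1, 3)

print(pose_out, com_out)
```

If the saved weights of a network end in a layer of a different size than this product, the configured numJoints probably does not match the trained model.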

geoffzhang commented 5 years ago

Hi @moberweger, thanks, it worked, but now I get a new error:

Traceback (most recent call last):
  File "/home/geoff/anaconda2/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/home/geoff/anaconda2/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/geoff/workspace/github/deep-prior-pp/src/util/realtimehandposepipeline.py", line 204, in threadConsumer
    img, poseimg = self.show(frm['frame'], pose, frm['com3D'])
  File "/home/geoff/workspace/github/deep-prior-pp/src/util/realtimehandposepipeline.py", line 440, in show
    comP = self.importer.joint3DToImg(rotatePoint3D(com3D, handpose[self.importer.crop_joint_idx], 0., 90., 0.))
  File "/home/geoff/workspace/github/deep-prior-pp/src/data/transformations.py", line 135, in rotatePoint3D
    pp -= center
TypeError: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

I found the 'pp -= center' line in rotatePoint3D and printed pp. In one case it indeed holds integer values. The code is as follows:

def rotatePoint3D(p1, center, angle_x, angle_y, angle_z):
    """
    Rotate a point in 3D around center
    :param p1: point in 3D (x,y,z)
    :param center: 3D center of rotation
    :param angle_x: angle around x-axis in deg
    :param angle_y: angle around y-axis in deg
    :param angle_z: angle around z-axis in deg
    :return: rotated point
    """
    pp = p1.copy()
    print('pp', pp)
    pp -= center
    R = getRotationMatrix(angle_x, angle_y, angle_z)
    pr = numpy.array([pp[0], pp[1], pp[2], 1])
    ps = numpy.dot(R, pr)
    ps = ps[0:3] / ps[3]
    ps += center
    return ps

The result is as follows:

('pp', array([ 23.41232109, 8.72331238, 356.07724762]))
('pp', array([ 13.74011517, 18.59737015, 357.57805634]))
('pp', array([ 14.26799297, 21.03929138, 404.63199615]))
('pp', array([ 18.27395058, 6.7862215 , 430.22183228]))
('pp', array([ 19.19467163, 19.09653664, 392.3207016 ]))
('pp', array([ 26.29939461, 22.06284523, 389.13811493]))
('pp', array([ 26.35639763, 25.79989624, 388.44070435]))
('pp', array([ 36.20342636, 12.64235306, 390.07084656]))
('pp', array([ 48.02437592, 20.44868279, 388.94287109]))
('pp', array([ 40.52606964, 29.0660038 , 387.12993622]))
('pp', array([ 47.5439682 , 6.90928984, 385.01878357]))
('pp', array([ 60.98585129, 13.59208488, 399.92012024]))
('pp', array([ 54.74921799, 16.02170372, 403.2693634 ]))
('pp', array([ 58.31529236, 2.35311437, 375.2861557 ]))
('pp', array([ 75.30374908, 12.18725109, 400.0931015 ]))
('pp', array([ 76.54815674, 12.58684063, 406.17704773]))
('pp', array([ 0, 0, 300]))

What is the meaning of '[ 0, 0, 300]'? And how can I solve it?

moberweger commented 5 years ago

It seems there is only one occurrence of [0,0,300], which is the default value. I changed that part. Please pull or fetch/merge and check if it is fixed.
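
For anyone hitting the same TypeError, the following is a minimal sketch of the failure mode, assuming the default value was an integer NumPy array such as [0, 0, 300]. In-place subtraction cannot store a float result in an integer array, so converting the point to float before subtracting avoids the error. This is not the actual change made in the repository; rotation_matrix_y and rotate_point_3d_safe are hypothetical stand-ins for the repo's getRotationMatrix and rotatePoint3D, restricted here to a rotation about the y-axis.

```python
import numpy as np

# Reproduce the error: an integer array cannot hold the float result in place.
com_default = np.array([0, 0, 300])        # integer dtype, like the default value
center = np.array([23.4, 8.7, 356.1])      # float center of rotation
failed = False
try:
    pp = com_default.copy()
    pp -= center                           # raises TypeError (casting rule 'same_kind')
except TypeError:
    failed = True


def rotation_matrix_y(angle_deg):
    """Homogeneous 4x4 rotation about the y-axis (hypothetical stand-in)."""
    a = np.deg2rad(angle_deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0., s, 0.],
                     [0., 1., 0., 0.],
                     [-s, 0., c, 0.],
                     [0., 0., 0., 1.]])


def rotate_point_3d_safe(p1, center, angle_y):
    """Rotate p1 around center about the y-axis, accepting integer inputs."""
    center = np.asarray(center, dtype=np.float64)
    pp = np.array(p1, dtype=np.float64)    # always a float copy of the input
    pp -= center                           # safe: both operands are float64
    pr = np.append(pp, 1.0)                # homogeneous coordinates
    ps = rotation_matrix_y(angle_y).dot(pr)
    ps = ps[:3] / ps[3]
    ps += center
    return ps


print(failed, rotate_point_3d_safe(com_default, center, 90.0))
```

Making the default value itself a float array (for example numpy.array([0., 0., 300.])) would have the same effect without touching the rotation function.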

geoffzhang commented 5 years ago

@moberweger Thanks, it worked.

moberweger / deep-prior-pp

Run the network on the mobile device? #27
