Closed monajalal closed 7 years ago
Hi Mona,
I will write a better description of VPilot in the README.md so people can understand how VPilot works.
Thanks!
I am using Python 2 to run drive.py (is that OK?) and I get the expected error below. How does the weights file get created using DeepGTAV? You also said you used TCP to connect your Linux machine to your Windows machine, but what software/API did you use for that? I am a little lost on the whole connection. Also, what width and height are you expecting?
mona@pascal:~/computer_vision/VPilot$ python drive.py
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so.5.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so.8.0 locally
usage: drive.py [-h] weights port width height
drive.py: error: too few arguments
Hi Mona,
When running drive.py, set width and height to the imageWidth and imageHeight you configured in DeepGTAV's config.ini file, and set port to the port configured there as well.
The weights file should be generated beforehand by train.py, but I will upload a dummy version between today and tomorrow so you can use it directly.
Thanks!
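As a rough illustration of the mapping described above, the sketch below reads imageWidth, imageHeight and port from config.ini text and assembles the matching drive.py command line. The section and key names follow the config.ini posted later in this thread; the helper name drive_args_from_config and the trimmed sample config are hypothetical, and the snippet targets Python 3's configparser.

```python
import configparser

# Trimmed sample using the section/key names from the config.ini
# posted later in this thread (values are that poster's settings).
SAMPLE_CONFIG = """
[common]
mode=1
imageWidth=320
imageHeight=160

[reinforcement]
host=144.92.237.238
port=8000
"""


def drive_args_from_config(config_text, weights_path):
    """Hypothetical helper: build the drive.py argument list from config.ini text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    width = parser.getint("common", "imageWidth")
    height = parser.getint("common", "imageHeight")
    port = parser.getint("reinforcement", "port")
    # drive.py usage: drive.py [-h] weights port width height
    return ["python", "drive.py", weights_path, str(port), str(width), str(height)]


command = drive_args_from_config(SAMPLE_CONFIG, "model.h5")
print(" ".join(command))
```

Keeping both sides derived from one file avoids the mismatch where the game sends frames of one size while the server expects another.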
Hi Aitor,
I used your weights and got this error. It is not clear to me how to fix it. Could you please advise?
mona@pascal:~/computer_vision/VPilot$ hdfview
mona@pascal:~/computer_vision/VPilot$ python drive.py model.h5 8000 320 160
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so.5.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so.8.0 locally
/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py:368: UserWarning: The `regularizers` property of layers/models is deprecated. Regularization losses are now managed via the `losses` layer/model property.
warnings.warn('The `regularizers` property of '
Traceback (most recent call last):
File "drive.py", line 61, in <module>
model = aitorNet.getModel(weights_path=args.weights)
File "/home/mona/computer_vision/VPilot/model.py", line 106, in getModel
model.load_weights(weights_path)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2701, in load_weights
self.load_weights_from_hdf5_group(f)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2787, in load_weights_from_hdf5_group
K.batch_set_value(weight_value_tuples)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1544, in batch_set_value
assign_op = x.assign(assign_placeholder)
File "/home/mona/tensorflow/_python_build/tensorflow/python/ops/variables.py", line 505, in assign
return state_ops.assign(self._variable, value, use_locking=use_locking)
File "/home/mona/tensorflow/_python_build/tensorflow/python/ops/gen_state_ops.py", line 45, in assign
use_locking=use_locking, name=name)
File "/home/mona/tensorflow/_python_build/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/home/mona/tensorflow/_python_build/tensorflow/python/framework/ops.py", line 2390, in create_op
set_shapes_for_outputs(ret)
File "/home/mona/tensorflow/_python_build/tensorflow/python/framework/ops.py", line 1785, in set_shapes_for_outputs
shapes = shape_func(op)
File "/home/mona/tensorflow/_python_build/tensorflow/python/framework/common_shapes.py", line 596, in call_cpp_shape_fn
raise ValueError(err.message)
ValueError: Dimension 1 in both shapes must be equal, but are 3 and 2
Hi Mona,
The weights were generated for nanoAitorNet, so put that instead of AitorNet
I didn't change the code; it is already using nanoAitorNet in drive.py (please check lines 60 and 61):
52 if __name__ == '__main__':
53     parser = argparse.ArgumentParser(description='Remote Driving')
54     parser.add_argument('weights', type=str, help='Path to model weights')
55     parser.add_argument('port', type=int, help='Port to listen to')
56     parser.add_argument('width', type=int, help='Width of the image to receive')
57     parser.add_argument('height', type=int, help='Height of the image to receive')
58     args = parser.parse_args()
59
60     aitorNet = nanoAitorNet()
61     model = aitorNet.getModel(weights_path=args.weights)
62     x = np.zeros((50, args.height, args.width, 3), dtype='float32')
63
64     server = Server(port=args.port, image_size=(args.width, args.height))
65     while 1:
66         img = server.recvImage()
67         if (img == None): break
68         #plt.imshow(img.astype('uint8'))
69         #plt.show()
70         x = np.roll(x,-1, axis=0)
71         x[-1] = img
72
Please ignore what I said; I saw that you had updated your code. However, I am not sure why the server connection gets stuck. Could you please help with that?
mona@pascal:~/computer_vision/VPilot$ git pull
remote: Counting objects: 13, done.
remote: Total 13 (delta 4), reused 4 (delta 4), pack-reused 9
Unpacking objects: 100% (13/13), done.
From https://github.com/ai-tor/VPilot
34f0c4c..52c9d20 master -> origin/master
Updating 34f0c4c..52c9d20
error: Your local changes to the following files would be overwritten by merge:
drive.py
Please, commit your changes or stash them before you can merge.
Aborting
mona@pascal:~/computer_vision/VPilot$ git stash
Saved working directory and index state WIP on master: 34f0c4c Improved overall model
HEAD is now at 34f0c4c Improved overall model
mona@pascal:~/computer_vision/VPilot$ git pull
Updating 34f0c4c..52c9d20
Fast-forward
README.md | 8 +++++---
drive.py | 16 ++++------------
model.py | 8 ++++----
train.py | 12 +++---------
4 files changed, 16 insertions(+), 28 deletions(-)
mona@pascal:~/computer_vision/VPilot$ ls
drive.py model.h5 model.py model.pyc __pycache__ README.md train.py
mona@pascal:~/computer_vision/VPilot$ python drive.py /home/mona/computer_vision/VPilot/model.h5 8000 320 160
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so.5.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so.8.0 locally
/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py:368: UserWarning: The `regularizers` property of layers/models is deprecated. Regularization losses are now managed via the `losses` layer/model property.
warnings.warn('The `regularizers` property of '
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:03:00.0
Total memory: 11.92GiB
Free memory: 11.85GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x377a740
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 1 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:83:00.0
Total memory: 11.92GiB
Free memory: 11.85GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 1: N Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K40c, pci bus id: 0000:83:00.0)
Started server
^CTraceback (most recent call last):
File "drive.py", line 58, in <module>
server = Server(port=args.port, image_size=(args.width, args.height))
File "drive.py", line 16, in __init__
self.conn, self.addr = self.s.accept()
File "/usr/lib/python2.7/socket.py", line 202, in accept
sock, addr = self._sock.accept()
KeyboardInterrupt
and I have:
mona@pascal:~/computer_vision/VPilot$ netstat -an | grep 8000
tcp 0 0 144.92.237.238:8000 0.0.0.0:* LISTEN
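The traceback above stops inside accept(), which is the normal behaviour of a blocking TCP server: after "Started server" it waits until a client (here, DeepGTAV) connects. The following is a minimal, self-contained sketch of that behaviour on localhost; it is not VPilot's Server class, and the names accept_one_client and fake_client are hypothetical stand-ins.

```python
import socket
import threading


def accept_one_client(server_sock, timeout_s=5.0):
    """Wait for one client; return its address, or None if nobody connects in time."""
    server_sock.settimeout(timeout_s)
    try:
        conn, addr = server_sock.accept()
    except socket.timeout:
        return None
    conn.close()
    return addr


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

# No client has connected yet, so accept() has nothing to return.
first = accept_one_client(server, timeout_s=0.2)


def fake_client():
    # Stand-in for the game side opening the TCP connection.
    with socket.create_connection(("127.0.0.1", port), timeout=5.0):
        pass


client_thread = threading.Thread(target=fake_client)
client_thread.start()
second = accept_one_client(server)
client_thread.join()
server.close()

print(first, second is not None)
```

A server that appears stuck at this point is usually just waiting; the thing to investigate is why the client never connects.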
Hi Ai-tor, I was able to connect the V-Pilot with GTAV. However, it crashes with the following error. Do you know what could be wrong?
D:\My_Projects\VPilot-master>python drive.py model.h5 8000 200 66
Using TensorFlow backend.
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cublas64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:119] Couldn't open CUDA library cudnn64_5.dll
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:3459] Unable to load cuDNN DSO
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cufft64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library nvcuda.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library curand64_80.dll locally
C:\Anaconda3\lib\site-packages\keras\engine\topology.py:368: UserWarning: The `regularizers` property of layers/models is deprecated. Regularization losses are now managed via the `losses` layer/model property.
warnings.warn('The `regularizers` property of '
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 960
major: 5 minor: 2 memoryClockRate (GHz) 1.2405
pciBusID 0000:02:00.0
Total memory: 2.00GiB
Free memory: 1.80GiB
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:906] DMA: 0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:916] 0: Y
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960, pci bus id: 0000:02:00.0)
Started server
GTAV connected
drive.py:61: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
if (img == None): break
Traceback (most recent call last):
File "drive.py", line 65, in
I am running both VPilot and GTAV on the same Windows PC. I start VPilot first and then launch GTAV.
I was able to figure out the problem. Your dummy weights expect the image dimensions to be 320x160. That got it to work. However, I realized that running both VPilot and GTAV on the same machine with a single GPU card may not be feasible after all.
Do the dummy weights allow the car to drive on its own? I find that it's not driving the car. Do you have some trained weights that would work? Also, have you shared your training data anywhere? If yes, could you please let me know where?
@prshnthrv just saying your GPU RAM is really low for DL applications
@monajalal, yes, that's right. It's not enough for both to run in parallel. However, do the dummy weights provided above allow the car to drive itself? Or are they really dummy :)
@prshnthrv Yeah, I am working on providing "not dummy" weights. It takes time to train such a network; if this training session works, I may have good weights by the middle of next week.
@monajalal It looks like DeepGTAV never manages to connect. Please check these things and try again, and also make sure you have the latest version of both projects:
Hi @ai-tor
my Windows 10 machine IP is: 144.92.237.225
my Ubuntu 14.04 IP is: 144.92.237.238
I can ping Windows from Ubuntu and they are both in the same private network connected using a switch.
After I start the game and run this command, nothing happens; the car doesn't move. Could you please tell me what is missing and show the output of your command?
mona@pascal:~/computer_vision/VPilot$ python drive.py /home/mona/computer_vision/VPilot/model.h5 8000 320 160
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so.5.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so.8.0 locally
/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py:368: UserWarning: The `regularizers` property of layers/models is deprecated. Regularization losses are now managed via the `losses` layer/model property.
warnings.warn('The `regularizers` property of '
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:03:00.0
Total memory: 11.92GiB
Free memory: 11.85GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x4260a30
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 1 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:83:00.0
Total memory: 11.92GiB
Free memory: 11.85GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 1: N Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K40c, pci bus id: 0000:83:00.0)
Started server
mona@pascal:~$ ping 144.92.237.225
PING 144.92.237.225 (144.92.237.225) 56(84) bytes of data.
64 bytes from 144.92.237.225: icmp_seq=1 ttl=128 time=4.80 ms
64 bytes from 144.92.237.225: icmp_seq=2 ttl=128 time=0.329 ms
64 bytes from 144.92.237.225: icmp_seq=3 ttl=128 time=0.340 ms
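A successful ping only shows that ICMP packets get through; it does not show that a TCP connection to the server port can be opened, since a firewall may treat the two differently. As a hedged sketch, the check below tries to open a TCP connection to a host and port. The helper name tcp_port_open is hypothetical, and the demo uses a throwaway localhost listener rather than the machines from this thread.

```python
import socket


def tcp_port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Demo against a temporary local listener instead of a real server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

reachable = tcp_port_open("127.0.0.1", demo_port)
listener.close()
unreachable = tcp_port_open("127.0.0.1", demo_port)

print(reachable, unreachable)
```

Run from the machine hosting the game, a call such as tcp_port_open("144.92.237.238", 8000) would test the host and port from the config.ini in this thread while drive.py is listening.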
Here's my config.ini file in the GTAV directory setup for reinforcement learning:
[common]
mode=1
imageWidth=320
imageHeight=160
car=0
weatherChangeDelay=1000
initialWeather=-1
initialHour=-1
initialMinute=-1
initialPosX=-1
initialPosY=-1
maxDuration=2
[supervised]
setSpeed=15.0
drivingStyle=0
captureFreq=10
datasetDir=D:\DeepGTAV_dataset3\
[reinforcement]
reward=1
desiredSpeed=15.0
desiredAgressivity=0.5
host=144.92.237.238
port=8000
So I did it the other way around: I first ran the VPilot server and then started the game, but it seems I found a bug:
mona@pascal:~/computer_vision/VPilot$ python drive.py /home/mona/computer_vision/VPilot/model.h5 8000 320 160
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so.5.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so.8.0 locally
/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py:368: UserWarning: The `regularizers` property of layers/models is deprecated. Regularization losses are now managed via the `losses` layer/model property.
warnings.warn('The `regularizers` property of '
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:03:00.0
Total memory: 11.92GiB
Free memory: 11.85GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x371ea30
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 1 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.8755
pciBusID 0000:83:00.0
Total memory: 11.92GiB
Free memory: 11.85GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 1: N Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K40c, pci bus id: 0000:83:00.0)
Started server
GTAV connected
drive.py:62: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
if (img == None): break
Traceback (most recent call last):
File "drive.py", line 67, in <module>
server.sendCommands(commands[0,0], commands[0,1])
File "drive.py", line 31, in sendCommands
self.conn.sendall(data.tobytes())
AttributeError: 'array.array' object has no attribute 'tobytes'
[1]+ Killed python drive.py /home/mona/computer_vision/VPilot/model.h5 8000 320 160
*I am working on it.
Hi Mona,
It may be that you are using Python 2 instead of Python 3, is this the case?
Is there a fix/workaround to get this working on Python 2? My OpenCV refuses to install for Python 3 with CUDA 8 :-1:
For Python 2 you could try something like
import struct
self.conn.sendall(struct.pack('%sf' % len(data), *data))
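To make the suggestion above concrete, here is a small sketch of packing an array of floats into bytes with struct.pack, which exists in both Python 2 and Python 3. The helper name floats_to_bytes and the sample values are hypothetical; the real drive.py sends the result over its socket connection instead of inspecting it.

```python
import struct
from array import array


def floats_to_bytes(values):
    """Pack a sequence of floats as native-order 32-bit floats."""
    # '%df' % n builds a format such as '2f' (two C floats).
    return struct.pack('%df' % len(values), *values)


# Sample values chosen because they are exactly representable as 32-bit floats.
commands = array('f', [0.5, -0.25])
payload = floats_to_bytes(commands)

# Two 4-byte floats give an 8-byte payload.
print(len(payload))

# Unpacking recovers the original values.
roundtrip = struct.unpack('2f', payload)
print(roundtrip)
```

On Python 3 the payload is byte-for-byte identical to commands.tobytes(), so the receiving side should not need to change.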
So it worked thanks to your help, but my understanding was that the car would move automatically, and it didn't. The server printed a bunch of these rewards; then I used the keyboard to move the car, and when I stopped, it stopped showing rewards. Should I expect to see the car moving? What should I expect to see, and why isn't the car moving when using the reinforcement learning server in VPilot?
Started server
GTAV connected
drive.py:64: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
if (img == None): break
('Sent commands', array('f', [0.12438786029815674, -0.07565033435821533]))
Received reward
-0.00952684879303
('Sent commands', array('f', [0.12438786029815674, -0.07565033435821533]))
Received reward
Hi Mona,
The weights I provided are not yet able to drive the vehicle; they are "dummy". I am running several tests to see if I can train a good model, but there are no results yet. Probably by the end of this week I should have something "smart".
I see the throttle value is quite low, maybe you can try to add something like 0.3 to it, and the vehicle should start moving, but don't expect it to take the curves :')
Thank you! Do you think using train_0000.h5 from here would make sense? https://drive.google.com/drive/u/0/folders/0B2UgaM91sqeAWGZVaDdmaGs2cmM Also, how do you set the throttle to 0.3? I didn't follow how.
@monajalal `*.h5` is a file type. It is not applicable to everything. Read the code in `drive.py` carefully and everything is quite plain. @ai-tor has done a perfect job writing the server for you.
You cannot just say you did not follow all the time. Please do some research.
Hi @monajalal,
The weights you are trying to use were trained for an AlexNet CNN, which is clearly a different architecture than "nanoAitorNet" so they will just crash Keras.
When calling sendCommands you can do something like this to add some more throttle:
server.sendCommands(commands[0,0] + 0.3, commands[0,1])
I agree with @wang3303, it is important you look for your own answers, you will learn much more this way :)
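As a small sketch of the throttle adjustment suggested above, the helper below adds a fixed offset to the predicted throttle and clamps the result. The function name boost_throttle and the [0.0, 1.0] clamping range are assumptions made for illustration; the thread does not state the valid throttle range.

```python
def boost_throttle(throttle, offset=0.3, low=0.0, high=1.0):
    """Add a fixed offset to the throttle and clamp it to [low, high].

    The [0.0, 1.0] range is an assumption, not confirmed by the source.
    """
    return max(low, min(high, throttle + offset))


# Throttle value taken from the 'Sent commands' log earlier in the thread.
boosted = boost_throttle(0.12438786029815674)
print(boosted)

# A value already near the upper bound is capped rather than overshooting.
capped = boost_throttle(0.9)
print(capped)
```

In drive.py this would wrap the first argument, for example server.sendCommands(boost_throttle(commands[0,0]), commands[0,1]).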
Can you please explain how to use the following files (in a tutorial maybe, IDK) alongside DeepGTAV in order to drive the car automatically?
I understand that I should have Keras on top of TensorFlow for Python 3, but I am lost about what to do next.
Thank you!