lewes6369 / TensorRT-Yolov3

TensorRT for Yolov3
MIT License
489 stars 165 forks

Multiple GPU #63

Open guods opened 5 years ago

guods commented 5 years ago

Thank you for your work, but I have some questions: How does the generated engine run on multiple graphics cards? How do I set the GPU ID?

lewes6369 commented 5 years ago

Generating the engine should not be the bottleneck. You can save the engine the first time, and later you can load it from the engine file.

guods commented 5 years ago

Generating the engine is not the bottleneck. After creating the engine file, I want to run it on a specific GPU, so I set the GPU ID with `cudaSetDevice`, but it did not work.

guods commented 5 years ago

Have you ever run this experiment: for graphics cards with the same architecture, can an engine file generated on a lower-end card (e.g. a 1060) be used on a higher-end card (e.g. a 1080)?

zerollzeng commented 4 years ago

hi @guods, it's nice to see you again :) For your first question: Each ICudaEngine object is bound to a specific GPU when it is instantiated, either by the builder or on deserialization. To select the GPU, use cudaSetDevice() before calling the builder or deserializing the engine. Each IExecutionContext is bound to the same GPU as the engine from which it was created. When calling execute() or enqueue(), ensure that the thread is associated with the correct device by calling cudaSetDevice() if necessary.
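The steps above could look roughly like the sketch below. This is a hedged example, not code from this repo: the engine path `yolov3.engine`, the device id, and the `Logger` class are placeholders, and the three-argument `deserializeCudaEngine` matches the TensorRT 5/6 API of that era (newer versions drop the plugin-factory argument).

```cpp
#include <cuda_runtime_api.h>
#include <NvInfer.h>
#include <fstream>
#include <iostream>
#include <vector>

// Minimal logger required by the TensorRT runtime (placeholder).
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    // Select the target GPU *before* deserializing: the resulting
    // ICudaEngine is bound to whichever device is current here.
    int deviceId = 1;  // placeholder: the GPU you want to use
    cudaError_t err = cudaSetDevice(deviceId);
    if (err != cudaSuccess) {
        std::cerr << "cudaSetDevice failed: "
                  << cudaGetErrorString(err) << std::endl;
        return 1;
    }

    // Load a previously serialized engine (path is a placeholder).
    std::ifstream file("yolov3.engine", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    Logger logger;
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(
        blob.data(), blob.size(), nullptr);  // nullptr: no plugin factory

    // The context inherits the engine's GPU binding. Any thread that
    // later calls execute()/enqueue() must itself have called
    // cudaSetDevice(deviceId) first.
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();
    (void)context;
    return 0;
}
```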

and for the second question: I recommend that you don't; however, if you do, you'll need to follow these guidelines:

- The major, minor, and patch versions of TensorRT must match between systems. This ensures you are picking kernels that are still present and have not undergone certain optimizations or bug fixes that would change their behavior.
- The CUDA compute capability major and minor versions must match between systems. This ensures that the same hardware features are present so the kernel will not fail to execute. An example would be mixing cards with different precision capabilities.
- The following properties should match between systems:
  - Maximum GPU graphics clock speed
  - Maximum GPU memory clock speed
  - GPU memory bus width
  - Total GPU memory
  - GPU L2 cache size
  - SM processor count
  - Asynchronous engine count

If any of the above properties do not match, you will receive the following warning: "Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors."
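You can inspect most of the properties in that checklist with `cudaGetDeviceProperties`; a small sketch (the mapping of each checklist item to a `cudaDeviceProp` field is my reading, not something the docs spell out):

```cpp
#include <cuda_runtime_api.h>
#include <cstdio>

// Print the device properties that matter when moving an engine
// plan file between machines.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, i);
        std::printf("GPU %d: %s\n", i, p.name);
        std::printf("  compute capability : %d.%d\n", p.major, p.minor);
        std::printf("  graphics clock     : %d kHz\n", p.clockRate);
        std::printf("  memory clock       : %d kHz\n", p.memoryClockRate);
        std::printf("  memory bus width   : %d bits\n", p.memoryBusWidth);
        std::printf("  total memory       : %zu bytes\n", p.totalGlobalMem);
        std::printf("  L2 cache size      : %d bytes\n", p.l2CacheSize);
        std::printf("  SM count           : %d\n", p.multiProcessorCount);
        std::printf("  async engine count : %d\n", p.asyncEngineCount);
    }
    return 0;
}
```

Run it on both machines and diff the output before reusing an engine plan file.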

guods commented 4 years ago

Thanks for your reply. I also read those words in the TensorRT documentation. However, even though cudaSetDevice() (called before creating the engine) returned an error, the engine was still created and gave the correct result, which suggests that cudaSetDevice() has no effect.

zerollzeng commented 4 years ago

It should not return error, emmm... what kind of error did you get?

guods commented 4 years ago

I made the error deliberately: I wanted to know whether the engine file would still be generated properly even if I set the device incorrectly. I set it incorrectly and the file was still generated properly. So for creating the engine, cudaSetDevice() seems to have no effect.

lewes6369 commented 4 years ago

I am not sure about your issue. As @zerollzeng said, the engine is not portable across cards with different architectures. Maybe just try setting CUDA_VISIBLE_DEVICES to the index of the graphics card on which you want to create and deploy the engine.
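For example (the binary name is a placeholder; substitute whatever builds or runs your engine):

```shell
# Expose only physical GPU 1 to this shell and any CUDA process started
# from it; inside the process that card is renumbered to device 0, so
# cudaSetDevice(0) then selects the card you chose here.
export CUDA_VISIBLE_DEVICES=1
echo "visible devices: $CUDA_VISIBLE_DEVICES"   # prints: visible devices: 1
# ./runYolov3 ...   # placeholder: your engine-building / inference binary
```

This sidesteps in-process device selection entirely, since the process only ever sees the one card.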