Closed · v-fuchs closed 8 years ago

Hi,

I would really like to use GRE in our projects, but unfortunately we are stuck during the installation because we are using CentOS. Is there perhaps a Dockerfile.inference_server for CentOS as well? I tried to adapt the script for a CentOS environment, but in the end the server couldn't start.

I would highly appreciate any kind of help. Thanks in advance!
Valentin
Since you are using Docker, why do you need CentOS inside the container too? You can have a CentOS host but run containers using Ubuntu 14.04 or Ubuntu 16.04.
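As a quick illustration, a minimal sketch (assuming Docker is already installed on the CentOS host; ubuntu:16.04 is just the stock image) showing that the distribution inside a container is independent of the host distribution:

$ cat /etc/centos-release                              # host is CentOS
$ docker run --rm ubuntu:16.04 cat /etc/os-release     # container reports Ubuntu 16.04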
Hi Felix, thank you very much for your answer.
I didn't know that much about Docker, so I read some documentation after your post, and I am now fully convinced by this solution.
Unfortunately I'm now on vacation for one week so I can't try your suggested solution on our production servers at work. I'll try it immediately after my return and give you feedback.
For now I have one remaining question: is it somehow possible to add more than one caffemodel to one inference_server (container), or do I have to create another container for every caffemodel? If it is possible, what would the curl command look like? How do I switch between different models?
Thank you again for your fast reply and your highly appreciated help.
Best regards, Valentin
The best approach is probably to have one Docker container (i.e. one server) per model; it keeps the code simple. You will have multiple servers running on the same machine, and you just need to change the port in your curl command to use a different model.
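A minimal sketch of what that could look like, assuming the server listens on port 8000 inside the container as in the curl example in this thread; the image names (model-a-server, model-b-server) are placeholders for one image built per caffemodel, and any GPU flags (e.g. nvidia-docker) are omitted for brevity:

$ docker run -d -p 8000:8000 model-a-server    # model A reachable on host port 8000
$ docker run -d -p 8001:8000 model-b-server    # model B reachable on host port 8001
$ curl -XPOST --data-binary @images/1.jpg http://127.0.0.1:8000/api/classify    # classify with model A
$ curl -XPOST --data-binary @images/1.jpg http://127.0.0.1:8001/api/classify    # classify with model B

Only the host port differs; each container still listens on its own port 8000 internally.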
Hi Felix,
thanks again for your help.
I have some more questions about the Dockerfile.inference_server:
1. At the beginning of the script you define the following:
ENV CUDA_ARCH_BIN "30 35 50 52 60"
ENV CUDA_ARCH_PTX "60"
Maybe I'm wrong, but isn't "60" an invalid CUDA compute capability? Referring to https://developer.nvidia.com/cuda-gpus, shouldn't it be 61 instead of 60 for the new Pascal GPUs? (See the sketch after this list.)
2. Later in this script you mention that you are using a modified version of caffe from your GitHub repository. Can I also use caffe from the NVIDIA GitHub repository with the same benefits when running GRE, by building it with the cmake parameters you mentioned? Or would you be so kind as to briefly explain what modifications you made to your caffe version?
I'm a little bit confused because you clone the bvlc_inference branch. Is that because you are building this container for bvlc caffenet? Should I clone another branch of your repository when using my own dataset and caffemodel, or is it always the same branch?
3. Similar question to 2: do I have to use OpenCV 3.0.0, or can I use 2.4.13 instead without any losses?
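For reference, a minimal sketch of how the lines from question 1 could look if the GTX 10-series Pascal cards (compute capability 6.1) should also be covered; note that 6.0 is itself a valid compute capability (GP100/Tesla P100), so this adds 61 rather than replacing 60 (an assumption on my part, not a confirmed fix for this Dockerfile):

ENV CUDA_ARCH_BIN "30 35 50 52 60 61"
ENV CUDA_ARCH_PTX "61"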
Thanks a lot for your very helpful advice.
Best regards, Valentin
The branch is called bvlc_inference because it is based on BVLC/caffe and not NVIDIA/caffe. You could try with NVIDIA/caffe, but last time I tried there was no performance difference for inference with a batch size of 1. NVIDIA/caffe really shines for multi-GPU training with large batch sizes and complex networks.
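For reference, a minimal sketch of fetching that branch with standard git (the target directory name is arbitrary):

$ git clone -b bvlc_inference https://github.com/flx42/caffe.git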
I have a single patch on top of BVLC/caffe: https://github.com/flx42/caffe/commit/1a5187a259a5cb31fef0e091bfe4795b268b1238
This is useful because the Go HTTP server creates many, many threads, and we don't want each thread to create a caffe context on the GPUs; that would waste memory. This is a limitation of the Caffe design.
I think you don't strongly need this patch; you can try without it, but you will need to modify the code a little bit and make sure everything still works as expected.

Regarding OpenCV: cudaMalloc/cudaFree are used for preprocessing the images. But again, if you use Docker, you don't have to care about which version of OpenCV is used; each container can use its own version of OpenCV.

Thank you very much for your detailed answer!
I could now successfully build and run the inference-server.
I have one remaining problem. The classification using
$ curl -XPOST --data-binary @images/1.jpg http://127.0.0.1:8000/api/classify
only works when executed with superuser rights. Running without these rights, I get the following error:
Access Denied (authentication_failed)
Your credentials could not be authenticated: "General authentication failure due to bad user ID or authentication token.". You will not be permitted access until your credentials can be verified. This is typically caused by an incorrect username and/or password, but could also be caused by network problems.
For assistance, contact your network support team.
Apparently there is some problem with access rights. How can I make the API accessible to everybody on the network, or how can I create a user/password with the relevant rights for accessing the API?
I don't know how you ended up with that error; it certainly isn't coming from my code.
You need to check your network settings, and check that it's actually the inference server that is running on port 8000.
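A minimal sketch of checks along those lines (ss, curl -v, and --noproxy are standard tools and flags; the proxy guess is an assumption based on the wording of the error, which reads like the output of an intercepting HTTP proxy rather than of the inference server):

$ sudo ss -tlnp | grep 8000    # which process is actually listening on port 8000?
$ curl -v http://127.0.0.1:8000/api/classify    # -v shows which server answers the request
$ curl --noproxy 127.0.0.1 -XPOST --data-binary @images/1.jpg http://127.0.0.1:8000/api/classify    # bypass any configured proxy for localhost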