Elsaam2y / DINet_optimized

An optimized pipeline for DINet, reducing inference latency by up to 60% 🚀. Kudos to the authors of the original repo for this amazing work.

docker instance status Exited #10

Closed davidmartinrius closed 9 months ago

davidmartinrius commented 9 months ago

Hello,

I am running inference with Docker. I would like to keep the container running so that I can do something like docker exec whatever.

Every time I run docker run --gpus 'device=0' -v $PWD:/app dinet python3 inference.py etc... (without --rm), it starts a new container and downloads https://download.pytorch.org/torchaudio/models/hubert_fairseq_large_ll60k_asr_ls960.pth to /root/.cache/torch/hub/checkpoints/hubert_fairseq_large_ll60k_asr_ls960.pth.

I also tried docker run -d --gpus 'device=0' -v $PWD:/app dinet without any other parameters. That way it should not execute any command, but the container exits right after that line runs.

Please, could you tell me how to keep the Docker container running? I would like to have it available as a service so I can run multiple inferences at different times.
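For illustration, the workflow I have in mind looks roughly like this (the container name is just a placeholder, not something from the repo):

```bash
# Desired workflow (illustrative): a container that stays up, so repeated
# inferences can be issued against it without starting a new instance.
docker exec dinet_container python3 inference.py ...
docker exec dinet_container python3 inference.py ...   # later, another run
```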

Thank you!

Elsaam2y commented 9 months ago

Hi,

If I understand you correctly, you want to avoid downloading the wav2vec model each time you run a new instance, right? The problem is that once you build your image and start a container, it has to download the wav2vec model every time. If that takes time you would rather avoid, you can try one of the following:

1- Download the wav2vec model outside the container, then mount the downloaded checkpoint directly into your docker container using -v {/tmp_directory}:/root/.cache/torch/hub/checkpoints/hubert_fairseq_large_ll60k_asr_ls960.pth. I have to say I haven't tried this myself, though.

2- You can try running the docker command as follows: docker run -it --rm --gpus 'device=0' -v $PWD:/app dinet bash. This starts a bash shell that lets you interact with the container's filesystem and run the python scripts inside it. The model is downloaded the first time, but subsequent inference runs in the same container won't download it again. Both options are sketched below.
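A rough sketch of both options; the local checkpoint directory and the inference arguments are placeholders, and option 1 is untested as mentioned above:

```bash
# Option 1 (untested): download the checkpoint once on the host, then mount
# it at the path torch hub looks for inside the container.
mkdir -p ./torch_checkpoints
wget -P ./torch_checkpoints \
  https://download.pytorch.org/torchaudio/models/hubert_fairseq_large_ll60k_asr_ls960.pth
docker run --rm --gpus 'device=0' -v $PWD:/app \
  -v $PWD/torch_checkpoints/hubert_fairseq_large_ll60k_asr_ls960.pth:/root/.cache/torch/hub/checkpoints/hubert_fairseq_large_ll60k_asr_ls960.pth \
  dinet python3 inference.py ...

# Option 2: open an interactive shell in the container and run inferences
# from inside it; the model is downloaded only once per container lifetime.
docker run -it --rm --gpus 'device=0' -v $PWD:/app dinet bash
# then, inside the container:
#   python3 inference.py ...
```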

Let me know if you have any issues.

davidmartinrius commented 9 months ago

Hi,

Yes, you understood me perfectly. Absolutely, this is a valid solution for the problem I mentioned.

Actually, I wanted a live inference instance: something that lets me run inference as fast as possible without loading the model into memory every time.

In the end, I stopped using Docker. I converted the model to TorchScript and deployed it to a local Triton server, and I modified inference.py to use tritonclient over gRPC. I did all of this without Docker.
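For anyone with the same need, a minimal sketch of that kind of setup (the model name and file names are placeholders, not my exact configuration):

```bash
# Minimal sketch of a local (non-Docker) Triton setup for a TorchScript model.
# "dinet" and the file names are illustrative only.
mkdir -p model_repository/dinet/1
cp dinet_traced.pt model_repository/dinet/1/model.pt
# model_repository/dinet/config.pbtxt must declare platform "pytorch_libtorch"
# plus the model's input/output tensors, which depend on how it was traced.
tritonserver --model-repository="$PWD/model_repository"
# inference.py then sends requests through tritonclient's gRPC API
# (Triton's default gRPC port is 8001).
```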

Thank you,

David