Dayananda-Akaike-Tech opened this issue 1 year ago
Take a look at our CPU inference workflow for an idea of how to set up your environment for CPU inference. For example, you will need to build and install intel-extension-for-pytorch: https://github.com/microsoft/DeepSpeed/blob/5e16eb2c939707d0d0062a458d77998fccb3afad/.github/workflows/cpu-inference.yml#L25
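For reference, a minimal sketch of the kind of setup that workflow performs, as Colab cells (the package names and index URLs are assumptions based on the Intel extension docs; the exact pinned versions and any extra build steps live in the linked YAML, so defer to that file):

# CPU-only PyTorch build (assumed index URL; check the workflow's pinned version)
!pip install torch --index-url https://download.pytorch.org/whl/cpu
# intel-extension-for-pytorch, which DeepSpeed's CPU accelerator builds on
!pip install intel-extension-for-pytorch
# oneCCL bindings for torch, used for collective communication on CPU
!pip install oneccl_bind_pt --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/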
I installed the necessary packages and libraries from the CPU workflow YAML file and tried to run inference again with the command below, but got an error again. Kindly help me solve the issue.
!deepspeed "/content/vistaar/transcribe.py" "/content/manifest.json" "/content/model_folder/" "Hindi" 1 "/content/output_path.txt"
2023-08-23 13:00:11,110 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmp5ptsxdl0
2023-08-23 13:00:11,111 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmp5ptsxdl0/_remote_module_non_scriptable.py
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[2023-08-23 13:00:14,466] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cpu (auto detect)
2023-08-23 13:00:18,374 - numexpr.utils - INFO - NumExpr defaulting to 2 threads.
2023-08-23 13:00:21.267462: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2023-08-23 13:00:22,427] [WARNING] [runner.py:201:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2023-08-23 13:00:22,429] [INFO] [runner.py:567:main] cmd = /usr/bin/python3 -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=127.0.0.1 --master_port=29500 --enable_each_rank_log=None /content/vistaar/transcribe.py /content/manifest.json /content/drive/MyDrive/Indic_Whisper (Vistar Bench_mark)/whisper-medium-hi_alldata_multigpu/ Hindi 1 /content/output_path.txt
2023-08-23 13:00:27,818 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmphi1k5nd4
2023-08-23 13:00:27,819 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmphi1k5nd4/_remote_module_non_scriptable.py
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[2023-08-23 13:00:30,811] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cpu (auto detect)
2023-08-23 13:00:37,253 - numexpr.utils - INFO - NumExpr defaulting to 2 threads.
2023-08-23 13:00:40.022821: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2023-08-23 13:00:41,016] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.15.5-1+cuda11.8
[2023-08-23 13:00:41,017] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_DEV_PACKAGE_VERSION=2.15.5-1
[2023-08-23 13:00:41,017] [INFO] [launch.py:138:main] 0 NCCL_VERSION=2.15.5-1
[2023-08-23 13:00:41,017] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev
[2023-08-23 13:00:41,017] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_PACKAGE=libnccl2=2.15.5-1+cuda11.8
[2023-08-23 13:00:41,017] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_PACKAGE_NAME=libnccl2
[2023-08-23 13:00:41,017] [INFO] [launch.py:138:main] 0 NV_LIBNCCL_PACKAGE_VERSION=2.15.5-1
[2023-08-23 13:00:41,017] [INFO] [launch.py:145:main] WORLD INFO DICT: {'localhost': [0]}
[2023-08-23 13:00:41,017] [INFO] [launch.py:151:main] nnodes=1, num_local_procs=1, node_rank=0
[2023-08-23 13:00:41,017] [INFO] [launch.py:162:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2023-08-23 13:00:41,017] [INFO] [launch.py:163:main] dist_world_size=1
[2023-08-23 13:00:41,017] [INFO] [launch.py:165:main] Setting CUDA_VISIBLE_DEVICES=0
2023-08-23 13:00:45.781653: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
My guessed rank = 0
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
@Dayananda-Akaike-Tech what does `ds_report` output when you run it? Also, can you share the output of `numactl --hardware`? Thanks.
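(On Colab, both can be run from a cell; numactl is usually not preinstalled, so installing it first is assumed to be necessary:)

# ds_report ships with DeepSpeed and summarizes the detected accelerator and op compatibility
!ds_report
# numactl provides the NUMA topology query asked about above
!apt-get install -y numactl
!numactl --hardware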
Describe the bug
I am trying to run inference on Colab using only the CPU.
For inference I am using the DeepSpeed CLI, running the command below with my transcribe.py file and the saved model folder (model_folder):
!deepspeed --include=localhost:0 "/content/vistaar/transcribe.py" "/content/manifest.json" "/content/model_folder/" "Hindi" 1 "/content/output_path.txt"

Required Output
What should I specify as the device parameter to use only the CPU for inference: "--include", "--exclude", or should I install intel-extension-for-deepspeed?
Existing output