marker_chunk_convert does not use multiple GPUs

I followed the README and ran this command:

export INFERENCE_RAM=80
export TORCH_DEVICE=cuda
MIN_LENGTH=6000 NUM_DEVICES=8 NUM_WORKERS=24 \
marker_chunk_convert ./input_dir ./markdowns_output/

Console output, after running the above command:

Loaded detection model vikp/surya_det2 on device cuda with dtype torch.float16
Loaded detection model vikp/surya_layout2 on device cuda with dtype torch.float16
Loaded reading order model vikp/surya_order on device cuda with dtype torch.float16
Loaded recognition model vikp/surya_rec on device cuda with dtype torch.float16
Loaded texify model to cuda with torch.float16 dtype
Converting 80 pdfs in chunk 1/1 with 8 processes, and storing in ./markdowns_output

Processing PDFs:   0%|          | 0/80 [00:00<?, ?pdf/s]

Running nvidia-smi shows that only GPU 0 is utilized (99%). The other 7 each show just 3 MiB of memory usage, with no utilization and no processes attached to them.

Mon Jun 10 13:17:53 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A800-SXM...  On   | 00000000:10:00.0 Off |                    0 |
| N/A   42C    P0   166W / 400W |  19249MiB / 81251MiB |     99%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A800-SXM...  On   | 00000000:16:00.0 Off |                    0 |
| N/A   32C    P0    63W / 400W |      3MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA A800-SXM...  On   | 00000000:49:00.0 Off |                    0 |
| N/A   33C    P0    61W / 400W |      3MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA A800-SXM...  On   | 00000000:4D:00.0 Off |                    0 |
| N/A   32C    P0    59W / 400W |      3MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A800-SXM...  On   | 00000000:89:00.0 Off |                    0 |
| N/A   33C    P0    63W / 400W |      3MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA A800-SXM...  On   | 00000000:8E:00.0 Off |                    0 |
| N/A   33C    P0    64W / 400W |      3MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA A800-SXM...  On   | 00000000:C5:00.0 Off |                    0 |
| N/A   31C    P0    60W / 400W |      3MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA A800-SXM...  On   | 00000000:C9:00.0 Off |                    0 |
| N/A   34C    P0    64W / 400W |      3MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
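
For a more compact check than the full table, here is a minimal sketch that counts how many GPUs report 0% utilization. It assumes the CSV output of `nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits` (one `index, utilization` pair per line); `count_idle_gpus` is a hypothetical helper name and not part of marker.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of marker): reads lines of the form
# "<gpu index>, <utilization percent>" on stdin and prints how many
# GPUs report exactly 0% utilization.
count_idle_gpus() {
    awk -F', *' '$2 == 0 { n++ } END { print n + 0 }'
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,utilization.gpu \
        --format=csv,noheader,nounits | count_idle_gpus
fi
```

With the state shown in the table above (GPU 0 at 99%, GPUs 1 to 7 at 0%), this should print 7 while the conversion is running.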

I also looked at #136. I am using marker_chunk_convert, and it still does not work.

VikParuchuri/marker, issue #178