Closed: Kiran-valetcloset closed this issue 3 months ago.
Hey, sorry for the delayed response. The warning you're encountering is due to the model's design, not an issue with your environment; it is raised while tracing the model's computation graph. Here's the relevant snippet from the code:
# segment_anything/modeling/image_encoder.py : line 254
B, H, W, C = x.shape
# amount of padding needed to make H and W multiples of window_size
pad_h = (window_size - H % window_size) % window_size
pad_w = (window_size - W % window_size) % window_size
if pad_h > 0 or pad_w > 0:  # this Python-level branch is what the tracer warns about
    x = F.pad(x, (0, 0, 0, pad_w, 0, pad_h))
Hp, Wp = H + pad_h, W + pad_w
The warning points at a conditional branch (the if statement), which can affect the data flow during tracing. As the message explains, "We can't record the data flow of Python values, so this value will be treated as a constant in the future."
This behavior is generally safe to ignore, but please ensure that the inputs to the model match the settings used during its conversion. You can read more about this issue on Stack Overflow.
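For illustration, here is a minimal, self-contained sketch (not taken from this repo; the PadIfNeeded module and the input shape are made up) of how torch.jit.trace freezes the outcome of such a Python if into the traced graph. Depending on the PyTorch version, tracing it emits the same TracerWarning:
import torch
import torch.nn.functional as F

class PadIfNeeded(torch.nn.Module):
    # Hypothetical module mimicking the padding logic quoted above.
    def __init__(self, window_size: int = 14):
        super().__init__()
        self.window_size = window_size

    def forward(self, x):
        B, H, W, C = x.shape
        pad_h = (self.window_size - H % self.window_size) % self.window_size
        pad_w = (self.window_size - W % self.window_size) % self.window_size
        if pad_h > 0 or pad_w > 0:  # evaluated once, with the example input, at trace time
            x = F.pad(x, (0, 0, 0, pad_w, 0, pad_h))
        return x

# The trace is specialized to the 64x64 example: the branch was taken here,
# so F.pad is baked into the graph and will run for every future input,
# whether or not that input actually needs padding.
traced = torch.jit.trace(PadIfNeeded(), torch.randn(1, 64, 64, 3))
As long as the inference inputs have the same spatial size as the ones used during export, this constant-folding is harmless, which is why the warning is usually safe to ignore.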
Please let me know if you have any more questions or need further assistance!
Best, Itay
Hi @ItayElam, thanks for your reply. I understand it's a warning from the tracing operation, but the process gets killed right after that and the actual conversion never takes place. As you can see in the output above, the last word is "Killed". Is there any way I can complete the conversion and get the engine files?
My apologies for missing that detail in your previous message.
Could you please check how much CPU and GPU memory you have? An OOM issue might cause the process to be killed unexpectedly.
You can use the dmesg command to check for any memory-related errors in the system logs; this should show whether the termination is due to memory constraints.
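For example (a minimal sketch assuming a standard Linux setup; run these on the host if dmesg is restricted inside the container):
free -h                                                    # CPU RAM and swap
nvidia-smi                                                 # GPU VRAM
sudo dmesg -T | grep -i -E "killed process|out of memory"  # OOM-killer messages
If the conversion was terminated by the OOM killer, the last command should show a "Killed process" entry around the time of the crash.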
Keep me updated!
Thanks.
My system has 16GB of RAM and a 3070 GPU with 8GB of VRAM. During engine file generation, just before it crashes, RAM usage goes up to 90%. I ran your Docker image on a server with 16GB of RAM and a T4 GPU with 16GB of VRAM, and I was able to complete the process and generate the engine files without the process getting killed.
That is great to hear! Is there anything else I can assist you with?
@ItayElam That's it, thank you for your help and this amazing work! Closing the issue.
I'm trying to run the code on WSL Ubuntu using the provided Dockerfile. The Docker build went fine, but when I run the main script to convert the vit_h model to a TensorRT engine using the provided command, I get this error:
root@LAPTOP-QRFHIVHV:/workspace# python3 main.py export --model_path pth_model/sam_vit_h_4b8939.pth --model_precision fp16
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:258: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if pad_h > 0 or pad_w > 0:
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:304: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  max_rel_dist = int(2 * max(q_size, k_size) - 1)
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:304: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  max_rel_dist = int(2 * max(q_size, k_size) - 1)
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:306: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if rel_pos.shape[0] != max_rel_dist:
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:318: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  q_coords = torch.arange(q_size)[:, None] * max(k_size / q_size, 1.0)
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:319: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  k_coords = torch.arange(k_size)[None, :] * max(q_size / k_size, 1.0)
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:320: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  relative_coords = (q_coords - k_coords) + (k_size - 1) * max(q_size / k_size, 1.0)
/usr/local/lib/python3.8/dist-packages/segment_anything/modeling/image_encoder.py:287: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if Hp > H or Wp > W:
Killed
By default, the Docker image installed the following library versions:
onnx == 1.16.1
onnxruntime-gpu == 1.18.0
opencv-python == 4.10.0.84
pycuda == 2022.2.2
segment-anything == 1.0
torch == 2.3.1
torchvision == 0.18.1
I also tried changing the torch version to these:
torch == 2.3.0+cu121
torchaudio == 2.3.0+cu121
torchvision == 0.18.0+cu121
But that did not fix the issue. I also tried doing the whole installation without Docker and got the same error.
Any help is appreciated. Thanks.