Open z-at-drigmo opened 10 months ago
Hello @z-at-drigmo,
Are you able to share a minimal reproduction of this error? We have seen a similar error in the past, and the issue was ultimately related to duplicate output names. Without more information, though, it is hard to say whether that is what is happening here.
-Taylor
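As a rough illustration of the duplicate-name check mentioned above, the sketch below flags any name that appears more than once in a list. It assumes the output names have already been collected into a plain Python list; how to obtain them from a traced graph is not covered in this thread, and both `find_duplicate_names` and the example names are hypothetical.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(names)  # Counter keeps insertion order (Python 3.7+)
    return [name for name in counts if counts[name] > 1]


# Hypothetical output names, for illustration only.
example_names = ["output_0", "output_1", "output_0"]
print(find_duplicate_names(example_names))
```

An empty result would suggest that duplicate names are not the cause, though it does not rule out other problems in the trace.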
Hi Taylor,
Yes, I can reproduce it when tracing the image encoder part of Meta's Segment Anything model.
import torch
import torch_neuron  # registers the torch.neuron namespace
from transformers import SamModel

device = "cuda" if torch.cuda.is_available() else "cpu"
sam = SamModel.from_pretrained("facebook/sam-vit-huge").to(device)
model = sam.vision_encoder
model.eval()

dummy_inputs = {
    "pixel_values": torch.randn(1, 3, 1024, 1024, dtype=torch.float)
}

traced_model = torch.neuron.trace(model, tuple(dummy_inputs.values()), separate_weights=True)
Let me know if you need any other information.
Hi,
I'm trying to trace the vision encoder part of Meta's Segment Anything Model (SAM). I encountered several errors during the trace, and the process now appears to be stuck.
The script no longer seems to be consuming CPU or memory; it just keeps printing "..." (it has been more than 12 hours now).
The following error seems to happen numerous times in the logs: