Nice work! I am also trying to visualize the model architecture with TensorBoard. However, when I pass the image and `TEXT_PROMPT` (after running it through `preprocess_caption`) to `add_graph`, I get "RuntimeError: Type 'Tuple[Tensor, List[str]]' cannot be traced". The code is as follows:

```python
import os

import torch
from torch.utils.tensorboard import SummaryWriter
from groundingdino.util.inference import load_model, load_image

# Assumption: HOME was not defined in the snippet as posted; the working directory is used here.
HOME = os.getcwd()


def preprocess_caption(caption: str) -> str:
    """Lower-case the caption and make sure it ends with a period."""
    result = caption.lower().strip()
    if result.endswith("."):
        return result
    return result + "."


CONFIG_PATH = os.path.join(HOME, "groundingdino/config/GroundingDINO_SwinT_OGC.py")
print(CONFIG_PATH, "; exist:", os.path.isfile(CONFIG_PATH))

WEIGHTS_NAME = "groundingdino_swint_ogc.pth"
WEIGHTS_PATH = os.path.join(HOME, "weights", WEIGHTS_NAME)
print(WEIGHTS_PATH, "; exist:", os.path.isfile(WEIGHTS_PATH))

IMAGE_NAME = "dog-3.jpeg"
IMAGE_PATH = os.path.join(HOME, "data", IMAGE_NAME)

TEXT_PROMPT = "chair"
TEXT_PROMPT = preprocess_caption(TEXT_PROMPT)
BOX_TRESHOLD = 0.35
TEXT_TRESHOLD = 0.25

model = load_model(CONFIG_PATH, WEIGHTS_PATH)

image_source, image = load_image(IMAGE_PATH)
with torch.no_grad():
    # A plain forward pass with a batched image and a list of captions.
    outputs = model(image[None], captions=[TEXT_PROMPT])

log_dir = "./logs"
writer = SummaryWriter(log_dir)

# Attempt 1: inputs passed as a dict.
writer.add_graph(model, input_to_model={"samples": image[None], "captions": TEXT_PROMPT})

# Attempt 2: inputs passed as a tuple of (image batch, list of captions).
writer.add_graph(model, (image[None], [TEXT_PROMPT]))
writer.close()
```

The error reported is "Only Tensors and (possibly nested) Lists, Dicts, and Tuples of Tensors can be traced".

Since Grounding DINO feeds the text and the image into their respective backbones at the very start of the forward pass, how can I pass both inputs to `add_graph`? I also tried netron.app with the .pth file, but it did not show the connections between the layers.

Could you please give some advice on solving this visualization problem? Thanks!
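
One possible workaround, sketched here under assumptions and not verified against Grounding DINO itself, is to bind the caption inside a small wrapper module so that the traced `forward` only receives tensors. `ToyGroundingModel` and `FixedCaptionWrapper` are hypothetical names introduced for illustration; the toy model merely imitates a `forward(samples, captions)` signature. `torch.jit.trace` is used directly because it is the tracer that `add_graph` relies on.

```python
from typing import List

import torch
from torch import nn


class ToyGroundingModel(nn.Module):
    """Hypothetical stand-in for a model whose forward takes a tensor and a list of strings."""

    def __init__(self) -> None:
        super().__init__()
        self.proj = nn.Linear(4, 2)

    def forward(self, samples: torch.Tensor, captions: List[str]) -> torch.Tensor:
        # The caption only contributes a plain Python number here,
        # so the tracer records it as a constant.
        scale = float(len(captions[0]))
        return self.proj(samples) * scale


class FixedCaptionWrapper(nn.Module):
    """Hypothetical wrapper: stores the captions so forward() accepts tensors only."""

    def __init__(self, model: nn.Module, captions: List[str]) -> None:
        super().__init__()
        self.model = model
        self.captions = captions

    def forward(self, samples: torch.Tensor) -> torch.Tensor:
        return self.model(samples, captions=self.captions)


toy = ToyGroundingModel().eval()
wrapped = FixedCaptionWrapper(toy, ["chair."]).eval()
dummy = torch.randn(1, 4)

# The wrapper exposes a tensor-only signature, which the tracer accepts.
traced = torch.jit.trace(wrapped, dummy)
```

With the real model, the analogous call would presumably be `writer.add_graph(FixedCaptionWrapper(model, [TEXT_PROMPT]), image[None], use_strict_trace=False)`; the non-strict flag is suggested on the assumption that the model returns a dict of outputs. Whether tracing then succeeds through the whole network (tokenizer, custom attention ops) is an open question, and the traced graph would reflect only this one fixed caption.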