justachetan closed this 8 months ago
Hi, the shape for queries is (B, N, 3), with B the batch size and N the number of points. The three channels are in the format (t, y, x): t is the time step of each query, and (x, y) is the position, given in pixels with the following orientation:
+----------> X
|
|
v
Y
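As a minimal sketch of building queries in this format (the frame index and pixel values below are arbitrary examples, not from the thread), each row is (t, y, x) and a batch dimension is added in front:

```python
import torch

# Each query is (t, y, x): frame index, then pixel position,
# with y pointing down and x pointing right as in the diagram above.
queries = torch.tensor([
    [10.0, 120.0, 240.0],  # frame 10, y=120 px, x=240 px
    [0.0,   50.0,  75.0],  # frame 0,  y=50 px,  x=75 px
])                          # shape (N, 3)
queries = queries[None]     # add batch dim -> (B, N, 3) = (1, 2, 3)
print(queries.shape)        # torch.Size([1, 2, 3])
```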
Thanks for the quick response! I tried running this by editing the function call in demo.py as follows:
pred = model({"video": video[None], "query_points": torch.Tensor([[[t, y, x]]]).cuda()}, mode="tracks_for_queries", **vars(args))
However I got the following error:
Traceback (most recent call last):
File "work/dot/demo2.py", line 312, in <module> main(args)
File "work/dot/demo2.py", line 304, in main
data["tracks"] = data["tracks"].permute(0, 2, 1, 3)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 3 is not equal to len(dims) = 4
Could you kindly advise how to fix this? Thanks!
Hi! You get this error because sparse tracks and dense tracks do not have the same shape: dense tracks are [B T H W 3] (with B: batch size, T: time steps, H: height, W: width), while sparse tracks are [B T N 3] (with N the number of tracks). The demo was written to handle dense tracks. You may try to hack it by adding another dimension -> [B T N 1 3].
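A minimal sketch of that hack (the shapes below are illustrative, not from the thread): insert a singleton dimension so the sparse tracks look like a dense grid of height N and width 1.

```python
import torch

B, T, N = 1, 8, 5
tracks = torch.zeros(B, T, N, 3)         # sparse tracks: [B, T, N, 3]
tracks_dense_like = tracks.unsqueeze(3)  # -> [B, T, N, 1, 3]
print(tracks_dense_like.shape)           # torch.Size([1, 8, 5, 1, 3])

# The permute in demo.py now sees a tensor with enough dimensions;
# squeeze the extra axis back out before plotting if needed.
```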
The plotting functions in the demo can now handle tracks in both the [B T H W 3] and [B T N 3] formats, so there is no need for the hack anymore.
Hi! Thank you for releasing the code and models publicly! I am trying to use the model to perform inference on my own videos. For visualization, I want to focus on a few selected query points.
The tracks_for_queries mode here seems to be what I need. However, I cannot figure out the required format of the query_points. Could you kindly provide some information about the same? Thanks!