Closed Aeim closed 1 year ago
I implemented it on Ocean and it works (Ocean is a kind of DIMP). The original DIMP gets pos + score (L116) like this: https://github.com/franktpmvu/NeighborTrack/blob/c889695427a2288b42e31cd0f9e0f7e509244729/trackers/ostrack/pytracking/tracker/dimp/dimp.py#L120
So you need to use them (pos + scores) to get the neighbors, and then, after NeighborTrack, use the new pos + scores to update DIMP (Line 122 to Line 147).
The above wraps DIMP in the outermost layer (steps: backbone -> NeighborTrack -> DIMP).
To wrap NeighborTrack in the outermost layer instead, only take the result of DIMP to update NeighborTrack (steps: backbone -> DIMP -> NeighborTrack; this is the setting I use on Ocean).
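The two wrapping orders can be sketched with toy stand-ins. This only illustrates the call order described above; backbone, neighbortrack and dimp_update are hypothetical callables, not functions from the NeighborTrack or pytracking code.

```python
def run_acb(frame, backbone, neighbortrack, dimp_update):
    """backbone -> NeighborTrack -> DIMP: DIMP is updated with NeighborTrack's answer."""
    pos, score = backbone(frame)
    new_pos, new_score = neighbortrack(pos, score)
    dimp_update(new_pos, new_score)
    return new_pos


def run_abc(frame, backbone, neighbortrack, dimp_update):
    """backbone -> DIMP -> NeighborTrack: DIMP finishes first, NeighborTrack reads its result."""
    pos, score = backbone(frame)
    dimp_update(pos, score)
    new_pos, _ = neighbortrack(pos, score)
    return new_pos


# Toy stand-ins (hypothetical) that only record the order in which they are called.
calls = []


def backbone(frame):
    calls.append("backbone")
    return (10.0, 20.0), 0.8


def neighbortrack(pos, score):
    calls.append("neighbortrack")
    return (pos[0] + 1.0, pos[1] + 1.0), score


def dimp_update(pos, score):
    calls.append(("dimp_update", pos))


acb_pos = run_acb(None, backbone, neighbortrack, dimp_update)
acb_calls = list(calls)
calls.clear()

abc_pos = run_abc(None, backbone, neighbortrack, dimp_update)
abc_calls = list(calls)

print(acb_calls)
print(abc_calls)
```

In the ACB run the update sees the position corrected by NeighborTrack; in the ABC run it sees the raw position, and NeighborTrack only post-processes the output.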
Thank you!
Can I see your implementation for the Ocean model?
The Ocean implementation uses an old version of the NeighborTrack code, so there are some differences from our new version. I uploaded the old functions. The original Ocean tracking looks like https://github.com/franktpmvu/NeighborTrack/blob/89aa0781c5b59ac570e3c1c47cca5b1dd6a5f945/trackers/example_ocean/test_ocean.py#L166
The version with NeighborTrack is at https://github.com/franktpmvu/NeighborTrack/blob/89aa0781c5b59ac570e3c1c47cca5b1dd6a5f945/trackers/example_ocean/test_ocean.py#L280
online_tracker.neighbor_track is at https://github.com/franktpmvu/NeighborTrack/blob/89aa0781c5b59ac570e3c1c47cca5b1dd6a5f945/trackers/example_ocean/online.py#L202
siam_tracker._neighbor_track is at https://github.com/franktpmvu/NeighborTrack/blob/89aa0781c5b59ac570e3c1c47cca5b1dd6a5f945/trackers/example_ocean/ocean.py#L1073
These two functions are the implementations for the online/offline Ocean versions.
Thank you
So, to work with your new version of NeighborTrack, I need to implement 3 functions?
Yes, init and track_neighbor are very simple. DIMP is a post-processing method: if backbone = A, DIMP = B and NeighborTrack = C, you can try ABC or ACB. My NeighborTrack_Ocean follows the ABC style, and I use AB for both the forward and the reverse tracking. Be careful with update_center on DIMP: when using the ABC style, you must first use NeighborTrack to change the answer, and only then update the DIMP model weights.
Now I'm stuck! I am trying to follow your guideline:
"to get pos + score (L116). So, you need to use them (pos + scores) to get the neighbors, and then, after NeighborTrack, use the new pos + scores to update DIMP (Line 122 to Line 147)"
I want to build the ACB process (backbone -> NeighborTrack -> DIMP). How can I get the neighbors from pos + score (L116)? The new pos is in original-image coordinates, but the DIMP score map is only 23x23. Do you have any guideline for me?
As you can see, self.localize_target at L119 takes the 23x23 pos as input and outputs 3 values.
L223 to L235 find the position of the maximum score, which is then used at L120 (sample_pos is like a grid, and translation_vec is its localization regression). We need all of the 23x23 scores, so you will have to write a new version of localize_target() that returns all translation_vec values (23x23) and their final scores.
First, write a new function such as localize_target_neighbor():
1. Copy all of localize_target.
2. Remove L226 to L229; we don't need the max index.
3. Get all of the 23x23 disp_index values.
4. Get disp_index - score_center for all 23x23 points, like L229.
5. Get all of the 23x23 translation_vec values, and finally the 23x23 new pos values in original image size.
Now you have 23x23 neighbor positions and can go to the next step.
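The grid-to-image mapping in these steps can be sketched in plain Python, without torch, so the arithmetic is easy to follow. It mirrors the target_disp / output_sz / translation_vec arithmetic of localize_target. The values here are assumptions for illustration only: kernel size 4, image support size 352, one sample scale of 1.0, and positions kept in (row, col) order.

```python
def neighbor_positions(score_sz, kernel_sz, img_support_sz, sample_pos, sample_scale):
    """Map every cell of a score_sz x score_sz score map to image coordinates."""
    score_center = (score_sz - 1) / 2
    # Same correction as in localize_target: even kernels shrink the output by one.
    output_sz = score_sz - (kernel_sz + 1) % 2
    # Size of one score-map cell in image pixels.
    cell = img_support_sz / output_sz * sample_scale
    positions = []
    for row in range(score_sz):
        for col in range(score_sz):
            # Displacement of this cell from the map center (no argmax needed).
            disp_row = row - score_center
            disp_col = col - score_center
            positions.append((sample_pos[0] + disp_row * cell,
                              sample_pos[1] + disp_col * cell))
    return positions


# Assumed example values, not taken from the tracker's parameters.
positions = neighbor_positions(23, 4, 352.0, (100.0, 200.0), 1.0)
print(len(positions), positions[0], positions[11 * 23 + 11])
```

The center cell maps back to sample_pos itself, which is a quick sanity check for any real implementation.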
"3. Get all of the 23x23 disp_index values." So does this mean I need to get the disp_index of every neighbor using max_score * neighbor threshold? Or does it mean I need to get every index in the score map? Thank you for your kindness; you have helped me a lot.
Yes, all of the 23x23 points. You only need some of them, so use something like max_score * neighbor_threshold to select the neighbors' positions.
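A minimal sketch of that selection, in plain Python on a tiny 3x3 map instead of the real 23x23 tensor. The function name select_neighbors and the 0.7 default are illustrative assumptions.

```python
def select_neighbors(score_map, neighbor_threshold=0.7):
    """Return (row, col, score) for every cell scoring at least threshold * max."""
    max_score = max(max(row) for row in score_map)
    cutoff = neighbor_threshold * max_score
    return [
        (r, c, score)
        for r, row in enumerate(score_map)
        for c, score in enumerate(row)
        if score >= cutoff
    ]


toy_scores = [
    [0.10, 0.60, 0.20],
    [0.30, 0.80, 0.50],
    [0.00, 0.57, 0.10],
]
neighbors = select_neighbors(toy_scores)
print(neighbors)
```

The cell holding the maximum always passes the cutoff, so the tracker's own answer is among the returned candidates.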
I have now implemented the new function, but something may be missing or wrong in it.
Can you help me check my code? Also, I still don't understand the scale_ind variable.
def localize_target_neighbor(self, scores, sample_pos, sample_scales):
    """Run the target localization."""
    scores = scores.squeeze(1)
    preprocess_method = self.params.get('score_preprocess', 'none')
    if preprocess_method == 'none':
        pass
    elif preprocess_method == 'exp':
        scores = scores.exp()
    elif preprocess_method == 'softmax':
        reg_val = getattr(self.net.classifier.filter_optimizer, 'softmax_reg', None)
        scores_view = scores.view(scores.shape[0], -1)
        scores_softmax = activation.softmax_reg(scores_view, dim=-1, reg=reg_val)
        scores = scores_softmax.view(scores.shape)
    else:
        raise Exception('Unknown score_preprocess in params.')
    score_filter_ksz = self.params.get('score_filter_ksz', 1)
    if score_filter_ksz > 1:
        assert score_filter_ksz % 2 == 1
        kernel = scores.new_ones(1,1,score_filter_ksz,score_filter_ksz)
        scores = F.conv2d(scores.view(-1,1,*scores.shape[-2:]), kernel, padding=score_filter_ksz//2).view(scores.shape)
    # if self.params.get('advanced_localization', False):
    #     return self.localize_advanced_neighbor(scores, sample_pos, sample_scales)
    # Get neighbors
    score_sz = torch.Tensor(list(scores.shape[-2:])) # (23,23)
    score_center = (score_sz - 1)/2 # (11,11)
    max_score, max_disp = dcf.max2d(scores)
    _, scale_ind = torch.max(max_score, dim=0) # <-Don't know how to deal with
    c_score_map = scores
    mask = c_score_map.ge(0.7 * max_score) # neighbor threshold
    values = torch.masked_select(c_score_map, mask)
    indexes = torch.nonzero(mask.squeeze())
    n_score = values
    # Compute translation vector and scale change factor
    output_sz = score_sz - (self.kernel_size + 1) % 2
    translation_vec_neighbors = torch.tensor([]).cpu()
    for i, index in enumerate(indexes):
        index = index.clone().unsqueeze(0)
        index = index.float().cpu().view(-1)
        target_disp = index - score_center
        translation_vec = target_disp.squeeze(0) * (self.img_support_sz / output_sz) * sample_scales
        translation_vec_neighbors = torch.cat((translation_vec_neighbors, translation_vec), 0)
    return translation_vec_neighbors, scale_ind, scores, None
And this is my track_neighbor. I edited around L119:
# Compute classification scores
scores_raw = self.classify_target(test_x)
# Localize the target
translation_vec, scale_ind, s, flag = self.localize_target(scores_raw, sample_pos, sample_scales)
new_pos = sample_pos[scale_ind,:] + translation_vec
print(new_pos)
# NeighborTrack
translation_vec_neighbors, scale_ind_neighbor, s_negihbor, flag = self.localize_target_neighbor(scores_raw, sample_pos, sample_scales)
for translation_vec_neighbor in translation_vec_neighbors:
    new_pos_neighbor = sample_pos[scale_ind_neighbor:,:] + translation_vec_neighbor
    print(new_pos_neighbor)
Please help.
Can you print max_score and max_disp? I didn't check their types and values, but I think they are something like: max_score is maybe a 23x23 value, and max_disp = 23x23x2 (x index, y index).
If you cannot understand max_score, max_disp = dcf.max2d(scores) and _, scale_ind = torch.max(max_score, dim=0), maybe you can print them and try to copy their pattern.
I think this code wants to find the grid point with the max score and use it to define the new bbox position and size.
The code of dcf.max2d is at https://github.com/franktpmvu/NeighborTrack/blob/c889695427a2288b42e31cd0f9e0f7e509244729/trackers/ostrack/pytracking/libs/dcf.py#L156
So I printed max_score and max_disp; the result looks like this:
max_score: tensor([0.7967], device='cuda:0') torch.Size([1])
max_disp: tensor([[13, 14]], device='cuda:0') torch.Size([1, 2])
max_disp is a kind of x,y position in the 23x23 map, but I'm not sure how to use scale_ind.
Oh, scale_ind tries to find the biggest of the max scores. If the max score is already the global maximum, scale_ind is ignored. If there are several local maximum scores, one per scale (for example, some models like YOLOv3 have more than one scale output layer, e.g. 3 scale layers), then scale_ind chooses the global maximum among them.
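To make the shapes concrete, here is a plain-Python stand-in for the two calls, assuming one score map per scale. It is only a sketch of the idea (per-map maximum with its index, then the index of the best map), not the code of dcf.max2d or torch.max; the index order (row, col) is a choice made for this sketch.

```python
def max2d(score_maps):
    """Per-map maximum and its (row, col) index, for a list of 2D score maps."""
    max_scores, max_disps = [], []
    for score_map in score_maps:
        best = max(
            ((score, r, c)
             for r, row in enumerate(score_map)
             for c, score in enumerate(row)),
            key=lambda item: item[0],
        )
        max_scores.append(best[0])
        max_disps.append((best[1], best[2]))
    return max_scores, max_disps


def best_scale(max_scores):
    """Index of the scale whose peak score is the largest."""
    return max(range(len(max_scores)), key=lambda i: max_scores[i])


# One scale only, as in the printed output: one max score, one index pair.
single = [[[0.1, 0.2, 0.3],
           [0.0, 0.4, 0.7967],
           [0.2, 0.1, 0.0]]]
max_score, max_disp = max2d(single)
scale_ind = best_scale(max_score)
print(max_score, max_disp, scale_ind)

# With three scales, the index of the highest peak is chosen.
multi_scale_ind = best_scale([0.61, 0.7967, 0.70])
print(multi_scale_ind)
```

With a single scale the chosen index can only be 0, which is why scale_ind carries no information in that setting.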
So I can just ignore it, right? If yes, I already have n_scores and the new pos of each neighbor. What should I do next?
From your new version's implementation, it seems the return state of track_neighbor requires xyhw, neighbor_xyhw, score, n_score, but the new_pos of DIMP is just an x, y position, not a bbox. So I can't unpack it into what you require.
Sorry to bother you, I'm a newbie. Thank you for your help.
Try using https://github.com/franktpmvu/NeighborTrack/blob/c889695427a2288b42e31cd0f9e0f7e509244729/trackers/ostrack/pytracking/tracker/dimp/dimp.py#L144 to get the bbox; you have all the needed values now. I don't know whether the update code converts new_pos to another domain, so please check whether self.pos is equal to the input new_pos. If it is not, it will be more difficult.
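As a rough illustration of turning a center position plus a target size into a box, here is a plain-Python sketch. It assumes the position is stored as (y, x) and the size as (h, w), and that the box is [x, y, w, h] with x, y the top-left corner. Both conventions are assumptions to verify against the linked line and against what NeighborTrack expects.

```python
def center_to_xywh(pos_yx, size_hw):
    """Convert a (y, x) center and (h, w) size to a top-left [x, y, w, h] box."""
    y, x = pos_yx
    h, w = size_hw
    return [x - (w - 1) / 2, y - (h - 1) / 2, w, h]


# Assumed example values: center at y=50, x=80 with a 21 x 41 (h x w) target.
box = center_to_xywh((50.0, 80.0), (21.0, 41.0))
print(box)
```

The same conversion can be applied to every neighbor position, reusing the current target size for all of them.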
I checked: self.pos and new_pos are not equal. How do I deal with this problem? This is my code in neighbor_track():
# Localize the target
translation_vec, scale_ind, s, flag = self.localize_target(scores_raw, sample_pos, sample_scales)
new_pos = sample_pos[scale_ind,:] + translation_vec
xywh = self.get_iounet_box(new_pos, self.target_sz, sample_pos[scale_ind,:], sample_scales[scale_ind])
print(new_pos)
print(self.pos)
# NeighborTrack
translation_vec_neighbors, n_score, s_negihbor, flag = self.localize_target_neighbor(scores_raw, sample_pos, sample_scales)
new_pos_neighbors = []
for translation_vec_neighbor in translation_vec_neighbors:
    new_pos_neighbor = sample_pos[scale_ind:,:] + translation_vec_neighbor
    new_pos_neighbor = new_pos_neighbor.squeeze(0)
    xywh_n = self.get_iounet_box(new_pos_neighbor, self.target_sz, sample_pos[scale_ind,:], sample_scales[scale_ind])
    print(xywh_n)
    new_pos_neighbors.append(xywh_n)
return xywh, 1, new_pos_neighbors, n_score
And in localize_target_neighbor():
def localize_target_neighbor(self, scores, sample_pos, sample_scales):
    """Run the target localization."""
    scores = scores.squeeze(1)
    preprocess_method = self.params.get('score_preprocess', 'none')
    if preprocess_method == 'none':
        pass
    elif preprocess_method == 'exp':
        scores = scores.exp()
    elif preprocess_method == 'softmax':
        reg_val = getattr(self.net.classifier.filter_optimizer, 'softmax_reg', None)
        scores_view = scores.view(scores.shape[0], -1)
        scores_softmax = activation.softmax_reg(scores_view, dim=-1, reg=reg_val)
        scores = scores_softmax.view(scores.shape)
    else:
        raise Exception('Unknown score_preprocess in params.')
    score_filter_ksz = self.params.get('score_filter_ksz', 1)
    if score_filter_ksz > 1:
        assert score_filter_ksz % 2 == 1
        kernel = scores.new_ones(1,1,score_filter_ksz,score_filter_ksz)
        scores = F.conv2d(scores.view(-1,1,*scores.shape[-2:]), kernel, padding=score_filter_ksz//2).view(scores.shape)
    # if self.params.get('advanced_localization', False):
    #     return self.localize_advanced_neighbor(scores, sample_pos, sample_scales)
    # Get neighbors
    score_sz = torch.Tensor(list(scores.shape[-2:])) # (23,23)
    score_center = (score_sz - 1)/2 # (11,11)
    max_score, max_disp = dcf.max2d(scores)
    _, scale_ind = torch.max(max_score, dim=0) # <-Don't know how to deal with
    c_score_map = scores
    mask = c_score_map.ge(0.7 * max_score) # neighbor threshold
    values = torch.masked_select(c_score_map, mask)
    indexes = torch.nonzero(mask.squeeze())
    n_score = values
    # Compute translation vector and scale change factor
    output_sz = score_sz - (self.kernel_size + 1) % 2
    output = []
    for i, index in enumerate(indexes):
        index = index.clone().unsqueeze(0)
        index = index.float().cpu().view(-1)
        target_disp = index - score_center
        translation_vec = target_disp.squeeze(0) * (self.img_support_sz / output_sz) * sample_scales
        output.append(translation_vec)
    translation_vec_neighbors = torch.stack(output, 0)
    return translation_vec_neighbors, n_score, c_score_map, None
We know new_pos is finally used by the function update_state(), so find it and check how it translates new_pos into self.pos. That function updates some state and outputs self.pos, so you need to follow it step by step, but don't update any self.xxx (because you are not yet sure which answer NeighborTrack will choose). You can use local variables to simulate L489, L490 and L495 without really updating them.
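A side-effect-free simulation could look like the sketch below. It assumes the position update is a clamp that keeps part of the target inside the image, controlled by an inside ratio (0.2 here). That is an assumption about what L489 to L495 do, so check it against the actual lines; the function name and arguments are illustrative.

```python
def simulate_update_pos(new_pos, target_sz, image_sz, inside_ratio=0.2):
    """Return the position update_state would store, without touching any tracker state."""
    clamped = []
    for p, t, im in zip(new_pos, target_sz, image_sz):
        # Negative offset: the center may leave the image by up to (0.5 - ratio) * size.
        inside_offset = (inside_ratio - 0.5) * t
        clamped.append(max(min(p, im - inside_offset), inside_offset))
    return tuple(clamped)


# Assumed example: a 40 x 60 (h x w) target in a 480 x 640 image.
inside = simulate_update_pos((100.0, 200.0), (40.0, 60.0), (480.0, 640.0))
outside = simulate_update_pos((500.0, -100.0), (40.0, 60.0), (480.0, 640.0))
print(inside, outside)
```

Because it returns a new value instead of assigning to self.pos, it can be called on the tracker's answer and on every neighbor before NeighborTrack picks one.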
OK, I will try to follow these steps. Just one more question: where do I need to put the return statement for your neighbor_track function?
Is it before the update comment, or after this line? https://github.com/franktpmvu/NeighborTrack/blob/c889695427a2288b42e31cd0f9e0f7e509244729/trackers/ostrack/pytracking/tracker/dimp/dimp.py#L167
Because you use type ACB, NeighborTrack is only used to get new_pos. You need to convert the NeighborTrack answer back to the type of new_pos and replace new_pos (L122).
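A sketch of that reverse conversion, assuming NeighborTrack hands back a top-left [x, y, w, h] box in image coordinates and new_pos is a (y, x) center. Both conventions are assumptions, so adapt them to what your track_neighbor actually returns.

```python
def xywh_to_center(box):
    """Convert a top-left [x, y, w, h] box to a (y, x) center and (h, w) size."""
    x, y, w, h = box
    center_yx = (y + (h - 1) / 2, x + (w - 1) / 2)
    size_hw = (h, w)
    return center_yx, size_hw


# Assumed example box, as it might come back from NeighborTrack.
new_pos, new_size = xywh_to_center([60.0, 40.0, 41.0, 21.0])
print(new_pos, new_size)
```

The returned center is what would replace new_pos before the DIMP update runs.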
Thanks. But if I want to switch to ABC, I can use that L167, right?
Yes. It's like you are not the last person in the office, so you don't need to turn off the lights: the update step will be finished by the original code.
Thank you for your help. NeighborTrack now runs successfully in DIMP. I have one question about your Ocean code: where is the code that gets the neighbor positions?
You can see pscore > threshold*np.max(pscore); the position and size are obtained at L259 to L274.
Each project has a different way to get the position and size; we just follow it.
Can I implement the original DIMP with NeighborTrack?