Training Label Calculation

codeastra2 commented 1 year ago

Hi, thanks so much for your work! I have a few questions about the training code, since I am trying to re-implement it.

As I understand it, (ROI, H, S∗, I∗, V∗, P∗) would be a single training sample. I could generate the first 4 using the existing agent code for cropping and label calculation; however, I am having issues with the last 2.

1. V∗ stands for the valid ground-truth coordinates in the next time step. For this you propose graph simplification (removing degree-2 vertices) and then moving the agent from t to t1, but how is the angle calculated in this case? Is it that we stand on vertex v, go over its edges, and, depending on whether it is an intersection or a segment, add a point along the direction of each edge?
2. P∗ stands for the ground-truth probability of the vertices. Since in the data generation phase we only move along the ground-truth network, shouldn't it always be 1?
3. Regarding the loss function for matching the vertices: is it computed over all M=10 vertices, or only after filtering these vertices?

In general, if you could describe how the above 2 labels are generated, it would be of great help to me. Thanks!

TonyXuQAQ commented 1 year ago

Hi,

Thanks for your question. My general suggestion is to wait for the training code of our work, which we expect to release in two or three months, since the paper is still under review.

1. You could refer to Section III.E and Fig. 7 of the RNGDet paper for the detailed label generation rules. We do not consider angles; we use the ground-truth road network to generate $V^*$.
2. Yes. Generated ground-truth vertices have $p_i = 1$. In each step, we predict 10 vertices for the next step and match these predictions with the ground-truth vertices of the next step (refer to DETR for the loss calculation). Predicted vertices that are matched with a GT vertex are treated as valid, while the others are invalid.
3. We use all 10 vertices for the loss calculation during training, and filter out vertices with low valid probability during inference.
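
For readers re-implementing this before the official code is available, the matching and labelling described above can be sketched roughly as follows. This is not the repository's implementation: the function names, the pure L2 matching cost, and the 0.5 inference threshold are all assumptions made for illustration (DETR's matching cost also includes a classification term), and a brute-force search stands in for the Hungarian algorithm, which is only practical because there are just 10 predictions and a handful of ground-truth vertices.

```python
import math
from itertools import permutations


def match_predictions(pred_coords, gt_coords):
    """Return {gt_index: pred_index} minimising the total L2 distance.

    Brute-force stand-in for the Hungarian algorithm (assumption);
    requires len(gt_coords) <= len(pred_coords).
    """
    best_cost, best_assign = math.inf, ()
    for assign in permutations(range(len(pred_coords)), len(gt_coords)):
        cost = sum(
            math.dist(pred_coords[p], gt_coords[g]) for g, p in enumerate(assign)
        )
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return dict(enumerate(best_assign))


def validity_labels(pred_coords, gt_coords):
    """Label matched predictions 1 (valid) and all others 0 (invalid)."""
    matching = match_predictions(pred_coords, gt_coords)
    labels = [0] * len(pred_coords)
    for pred_index in matching.values():
        labels[pred_index] = 1
    return labels, matching


def filter_valid(pred_coords, valid_probs, threshold=0.5):
    """Inference-time filter; the threshold value is hypothetical."""
    return [c for c, p in zip(pred_coords, valid_probs) if p >= threshold]


# 10 predicted vertices and 2 ground-truth vertices for the next step.
preds = [(float(i * 10), 0.0) for i in range(10)]
gt = [(31.0, 0.0), (68.0, 1.0)]
labels, matching = validity_labels(preds, gt)  # predictions 3 and 7 get label 1
```

All 10 labels would enter the training loss, while `filter_valid` only mirrors the inference-time filtering step.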

We will open-source the training code soon, and you will be able to find the answers to your questions there. Please be patient, and thanks for your understanding.

codeastra2 commented 1 year ago

Thanks a lot for your reply!

I am currently doing my master's thesis at DLR Germany and would like to build on your great work, so I cannot wait 2-3 months. It would be great if you could continue to answer my questions.

1. For the generation of V*, you mention that the initial candidate vertices are used as well; however, isn't the network also outputting these? So we should not consider them for ground-truth generation, right? In that case, a graph traversal like BFS, as you mentioned, should suffice for generating all vertices, I hope?
2. For all the ground-truth vertices generated above, the probability will be 1, but shouldn't the training set also contain samples where the probability is not 1? Among the 10 vertices output by the network, some will not match the ground-truth graph, so their probability will be 0. Do these act as the '0'-label training samples?
3. During sample generation, is the agent used to move along the road and generate the crops and vertices, or is only a graph traversal algorithm (BFS) used?
4. During sample generation, how do we know when to switch from segment mode to intersection mode? Say the next vertex after the current one is an intersection at a distance < t: do we use this to detect the end of the road?

Thanks so much for your help; it would be very useful for my thesis work.

TonyXuQAQ commented 1 year ago

Thanks for your questions.

1. Correct. You do not need to consider the initial candidate vertices when generating $V^*$. BFS should suffice.
2. Correct. For example, suppose that at one step there are two ground-truth vertices in the next step ($V^*=\{v_1,v_2\}$). We match the 10 predicted vertices with those 2 ground-truth vertices using the Hungarian algorithm. The 2 predicted vertices matched with ground-truth vertices are treated as valid (i.e., $y=1$), while the other 8 unmatched predicted vertices are invalid (i.e., $y=0$).
3. BFS is used. The agent is not used during sampling, for better efficiency.
4. If the distance between the current vertex and some intersection point is smaller than a threshold, we move to this intersection point and switch to intersection mode.

Thanks.
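
As a rough illustration of the BFS-based sampling and the distance rule above, here is a minimal sketch. It is not the released sampling code: the adjacency-dict graph format, the function names `bfs_edges` and `next_step`, and the choice to ignore an intersection that coincides with the current vertex are all assumptions.

```python
import math
from collections import deque


def bfs_edges(adjacency, start):
    """Return every undirected edge once, in BFS order from `start`.

    `adjacency` maps a vertex id to its neighbour ids (assumed format).
    """
    seen_vertices, seen_edges = {start}, set()
    queue, order = deque([start]), []
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            edge = frozenset((u, v))
            if edge in seen_edges:
                continue
            seen_edges.add(edge)
            order.append((u, v))
            if v not in seen_vertices:
                seen_vertices.add(v)
                queue.append(v)
    return order


def next_step(current, proposed_next, intersections, threshold):
    """Snap to a nearby intersection, else keep walking along the segment.

    If an intersection other than `current` lies within `threshold` of
    `current`, return it together with the mode "intersection"; otherwise
    return `proposed_next` with the mode "segment".
    """
    others = [p for p in intersections if p != current]
    nearest = min(others, key=lambda p: math.dist(current, p), default=None)
    if nearest is not None and math.dist(current, nearest) < threshold:
        return nearest, "intersection"
    return proposed_next, "segment"


# Toy ground-truth graph: a road from "a" that forks at intersection "b".
adjacency = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
edges = bfs_edges(adjacency, "a")
position, mode = next_step((0.0, 0.0), (20.0, 0.0), [(10.0, 0.0)], threshold=15.0)
```

Traversing edges in BFS order means each ground-truth edge is sampled once without simulating the agent, and `next_step` isolates the mode switch so the threshold can be tuned separately.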

TonyXuQAQ commented 1 year ago

The sampling and training code has been released. Please refer to the implementation for details. This issue is closed.
