cfzd / Ultra-Fast-Lane-Detection

Ultra Fast Structure-aware Deep Lane Detection (ECCV 2020)
MIT License

Probability values in the output tensor #311

Open nithinme3 opened 1 year ago

nithinme3 commented 1 year ago

Hi team, I appreciate your amazing work. I understand the output has the shape grid_numbers x row_anchors x no_of_lanes. My question is: what are the probability values when there is no lane? For example, for one lane and one row anchor we have 101 probability values. Can I directly take the maximum of those probabilities to determine where the lane is (that is, in which grid cell) for that row anchor?

Suppose there is no lane at that row anchor. If I take the maximum probability, would that give a wrong position?

Basically, I want to implement the post-processing in C++ without torch support.

Thank you, Nithin

cfzd commented 1 year ago

@nithinme3 Hi, The output is (grid_numbers+1) x row_anchors x no_of_lanes. For example, if we have 101 probability values, the last one is defined as the probability of no lane. In this way, the post-processing should look like this:

  1. softmax the 101-dim output and get all probability values
  2. check if the last dim (no lane dim) is the biggest. If yes, directly output no lane.
  3. If no in step 2, use the mathematical expectation to get the coordinate

You can refer to this part: https://github.com/cfzd/Ultra-Fast-Lane-Detection/blob/784221a5da9e40cb47f5d83ebd6a6a202ce9d22c/evaluation/eval_wrapper.py#L9-L44
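The three steps above can be sketched for a single lane and a single row anchor as follows. This is a minimal illustration in plain Python (no torch or NumPy, so the loops translate directly to C++), not the repository's code. The function name `decode_row` is hypothetical, and it assumes that the expectation is taken over a softmax restricted to the grid cells, with the no-lane score excluded, and that cell indices run from 1 to grid_numbers.

```python
import math


def decode_row(logits):
    """Decode one row anchor of one lane (hypothetical helper).

    `logits` holds grid_numbers + 1 raw scores; the last one is the
    "no lane" class. Returns None when no lane is predicted, otherwise
    the expected grid position in the range 1..grid_numbers.
    """
    grid = logits[:-1]

    # Step 2: softmax is monotonic, so comparing raw scores gives the
    # same answer as comparing the softmaxed probabilities.
    if logits[-1] > max(grid):
        return None

    # Steps 1 and 3. ASSUMPTION: the softmax is computed over the grid
    # cells only. Subtracting the maximum keeps exp() from overflowing.
    peak = max(grid)
    exps = [math.exp(v - peak) for v in grid]
    total = sum(exps)

    # Expectation: sum of (cell index * probability), indices from 1.
    return sum((i + 1) * e / total for i, e in enumerate(exps))
```

With the Tusimple model this would be called once per (row anchor, lane) pair on a list of 101 scores; a result of None corresponds to "no lane" for that row.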

nithinme3 commented 1 year ago

Hi @cfzd, thanks for the reply. I was following the demo.py file and have a question:

  1. for i in range(out_j.shape[1]): if np.sum(out_j[:, i] != 0) > 2: I did not understand this line. Why are we checking whether the number of nonzero elements among the 56 values of a particular lane is greater than 2?

cfzd commented 1 year ago

@nithinme3 It simply filters out lane lines that have no more than 2 valid points.
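For a torch-free port, the check `np.sum(out_j[:, i] != 0) > 2` can be written as a plain counting loop. The sketch below is only an illustration: the function name `lanes_with_enough_points` and the nested-list layout `loc[row][lane]` are assumptions, with 0 standing for "no lane at this row anchor".

```python
def lanes_with_enough_points(loc, min_points=3):
    """Return the indices of lanes that have at least `min_points` valid rows.

    `loc[row][lane]` is the decoded position for one row anchor and one
    lane (hypothetical layout); 0 marks "no lane".
    """
    num_lanes = len(loc[0])
    kept = []
    for lane in range(num_lanes):
        # Count the row anchors where this lane has a nonzero position.
        valid = sum(1 for row in loc if row[lane] != 0)
        if valid >= min_points:  # same condition as `> 2`
            kept.append(lane)
    return kept
```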

nithinme3 commented 1 year ago

Hi @cfzd, thanks for the reply. That was a silly misunderstanding on my part, sorry! I am still not getting the correct values as output.

Just to make it clear, regarding the following line: loc = np.sum(prob * idx, axis=0)

What I have done is this: I created a vector of the 100 softmaxed probabilities for one specific row and column. I created another vector, idx, with the elements 1 to 100. Then I multiplied the corresponding terms of these two vectors and summed the resulting vector to get the loc value.

Is this procedure correct?

Thank you, Nithin

cfzd commented 1 year ago

@nithinme3 Yes, it is correct.

nithinme3 commented 1 year ago

Hi @cfzd ,

Thanks for the confirmation.

I have converted the model to ONNX and am trying to run inference on an NVIDIA Drive AGX Xavier.

On the Xavier I am getting the output tensor as a single float array, so I have to process it accordingly.

I have not done the flipping ( out_j = out_j[:, ::-1, :] ); instead, I reversed the Tusimple row anchor array.

Is there any other change I should make because of this?

Currently I am getting the lane line points, but when I overlay them on the image they are somewhat shifted.
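One way to read such a flat array is sketched below. It assumes the buffer is a contiguous, row-major copy of a tensor shaped (grid_numbers + 1) x row_anchors x no_of_lanes with the batch dimension already removed; that layout is an assumption and should be checked against the exported model. The helper names `flat_index` and `row_logits` are hypothetical.

```python
def flat_index(g, r, l, num_rows, num_lanes):
    """Offset of element [g][r][l] in a row-major flat buffer."""
    return (g * num_rows + r) * num_lanes + l


def row_logits(flat, r, l, num_cells, num_rows, num_lanes):
    """Gather the num_cells scores (grid cells plus the no-lane slot)
    for row anchor `r` of lane `l` from the flat buffer."""
    return [flat[flat_index(g, r, l, num_rows, num_lanes)]
            for g in range(num_cells)]
```

For the Tusimple model discussed here, num_cells would be 101 and num_rows 56. Note that consecutive scores of one row are num_rows * num_lanes elements apart in memory, not adjacent.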

Thank you, Nithin

cfzd commented 1 year ago

@nithinme3 Flipping the output and flipping the row anchors are exactly equivalent, so that is correct.

As for the shift, it might be because the row anchors are not calculated correctly (they should match the resolution on your device), or because the lane locations are not scaled correctly (they should also match the resolution).
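The scaling in question can be sketched as below. This is an illustration under stated assumptions rather than the repository's exact code: the network input is taken to be 800 wide and 288 high, the grid cells are assumed to be spaced like 100 evenly spaced samples from 0 to 799, the row anchors are assumed to be given in the 288-high input's coordinates, and the `- 1` offsets are one possible pixel convention. The function name `to_image_point` is hypothetical.

```python
def to_image_point(loc, row_anchor, img_w, img_h,
                   griding_num=100, net_w=800, net_h=288):
    """Map a decoded grid position and a row anchor to image pixels.

    `loc` is the expected grid position (1..griding_num) and
    `row_anchor` is a row index in the net_h-high network input.
    All default sizes are assumptions for the Tusimple 288x800 setup.
    """
    # Width of one grid cell in network-input pixels.
    cell_w = (net_w - 1) / (griding_num - 1)
    # Scale x from network width to image width, y from network height
    # to image height.
    x = int(loc * cell_w * img_w / net_w) - 1
    y = int(img_h * row_anchor / net_h) - 1
    return x, y
```

A horizontal shift in the overlay often comes from forgetting one of the two factors in x (the cell width or the img_w / net_w ratio); a vertical shift usually means the row anchors were not rescaled by img_h / net_h.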

nithinme3 commented 1 year ago

Hi @cfzd,

Currently I am using the same Tusimple dataset (1280x720) for testing and the Tusimple pretrained model (288x800) for inference. So the same row anchors specified in the repository should work, right?

cfzd commented 1 year ago

@nithinme3 Yes, it should work. In demo.py there is code that exactly matches your case (Tusimple and 288x800 inference). You can refer to it.