In the paper, it's mentioned that "Perspective Transformer is used in coarse-level lane detection to extract the dense BEV features and detect coarse 3D lanes at the lowest resolution." But in the code, `projs = self.pers_tr(input, frontview_features, _M_inv)` returns BEV features at four different resolution levels, not only the lowest one, so the number of training parameters does not decrease much.
Maybe I misunderstood; could you provide some explanation?
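For reference, here is a minimal sketch of what I mean (`DummyPersTr`, the channel sizes, and the input shapes are my own stand-ins for illustration, not the repo's actual `PerspectiveTransformer`): if the transformer projects all four feature levels, each level carries its own weights, so the total parameter count stays high.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: one projection block per resolution level,
# mimicking a pers_tr that outputs BEV features at four scales.
class DummyPersTr(nn.Module):
    def __init__(self, channels=(64, 128, 256, 512)):
        super().__init__()
        self.levels = nn.ModuleList(nn.Conv2d(c, c, 1) for c in channels)

    def forward(self, frontview_features):
        # Returns a list of four BEV-like feature maps, mirroring `projs`.
        return [lvl(f) for lvl, f in zip(self.levels, frontview_features)]

# Dummy multi-scale frontview features (shapes are illustrative only).
feats = [torch.randn(1, c, 360 // 2**i, 480 // 2**i)
         for i, c in enumerate((64, 128, 256, 512))]
model = DummyPersTr()
projs = model(feats)
for i, p in enumerate(projs):
    print(f"level {i}: shape={tuple(p.shape)}")
# All four levels contribute parameters, not just the lowest resolution.
print("total params:", sum(p.numel() for p in model.parameters()))
```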