Can the author elaborate on the following sentence from the "Conclusions and limitations" section of the article? "One limitation of our approach is the dependency on 3D joint location ground-truth, and in particular, the requirement that it is given at the axis system of the train cameras." As far as I understand, the dependency on 3D joint location ground truth arises because the approach is learning-based and needs it to optimize the loss functions. However, I am not sure I understand the second part of the limitation, namely the requirement that the ground truth be given in the axis system of the training cameras.