bharath5673 / YOLOv8-3D

YOLOv8-3D is a LowCode, Simple 2D and 3D Bounding Box Object Detection and Tracking, Python 3.10
165 stars 32 forks

Is it possible to label the length, width and height of an object based on this 3d box? #10

Closed upper127 closed 8 months ago

upper127 commented 8 months ago

This 3D box is derived from the 2D box, so does it carry any real information about the size of the object?

bharath5673 commented 8 months ago

Yes, the 3D box can indeed provide information about the size of objects in the scene.

upper127 commented 8 months ago

If I want to display the dimensions such as length, width, etc. of an object in the code you posted, how do I make the change? Can you provide some help?

bharath5673 commented 8 months ago

Change? Like how?

upper127 commented 8 months ago

Display the object's length, width, and height on the final recognition.

upper127 commented 8 months ago

[image] Also, why is this 3D box tilted for me?

bharath5673 commented 8 months ago

Display the object's length, width, and height on the final recognition.

Once you have all the corner coordinates, calculate the Euclidean distances between the points, establish a scale factor that maps pixel measurements to real-world measurements by doing camera calibration, apply the rescale factor if required, and then display the result.
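A minimal sketch of the steps described above, in Python with only the standard library. The corner ordering, the function names (`box_edge_lengths_px`, `to_real_units`, `format_label`), and the example numbers are assumptions for illustration and are not taken from the YOLOv8-3D code. Edge lengths measured between projected corners are foreshortened by perspective, so treat the result as a rough estimate.

```python
import math

# Assumed corner ordering (hypothetical, not from the repo):
# indices 0-3 go around one face of the box, 4-7 around the opposite face,
# and corner i is connected to corner i + 4.
def box_edge_lengths_px(corners):
    """Return (length, width, height) in pixels from 8 projected (x, y) corners."""
    if len(corners) != 8:
        raise ValueError("expected 8 (x, y) corner points")
    length_px = math.dist(corners[0], corners[1])
    width_px = math.dist(corners[1], corners[2])
    height_px = math.dist(corners[0], corners[4])
    return length_px, width_px, height_px

def to_real_units(lengths_px, scale):
    """Multiply pixel lengths by a scale factor given in real units per pixel."""
    return tuple(d * scale for d in lengths_px)

def format_label(dims, unit="m"):
    """Build the text to draw next to the box."""
    length, width, height = dims
    return f"L={length:.2f}{unit} W={width:.2f}{unit} H={height:.2f}{unit}"

# Example corners and scale, invented for illustration only.
corners = [(0, 0), (3, 4), (3, 10), (0, 6),
           (0, -8), (3, -4), (3, 2), (0, -2)]
px = box_edge_lengths_px(corners)
label = format_label(to_real_units(px, scale=0.5))
print(label)
```

The resulting string could then be drawn on the frame next to the box, for example with OpenCV's `cv2.putText`.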

bharath5673 commented 8 months ago

[image] Also, why is this 3D box tilted for me?

The model is failing to generalize.

upper127 commented 8 months ago

[image] Also, why is this 3D box tilted for me?

The model is failing to generalize.

But I'm using the test data you provided. Can you provide more models and pretrained weights? I don't seem to see all of them on GitHub (ResNet, EfficientNet, VGG).

upper127 commented 8 months ago

Display the object's length, width, and height on the final recognition.

Once you have all the corner coordinates, calculate the Euclidean distances between the points, establish a scale factor that maps pixel measurements to real-world measurements by doing camera calibration, apply the rescale factor if required, and then display the result.

Since I'm adding functionality based on the code you posted: in the original code I seem to see code that converts 2D vehicle images into 3D information: dim = [prediction[0][0]]

Does this hold the model's estimate of the object's size? Also, based on your answer, I don't quite understand these rescale factors.

bharath5673 commented 8 months ago

[image] Also, why is this 3D box tilted for me?

The model is failing to generalize.

But I'm using the test data you provided. Can you provide more models and pretrained weights? I don't seem to see all of them on GitHub (ResNet, EfficientNet, VGG).

I'm sorry, but running the training code to generate the weights is a necessary step. There isn't an alternative method available at the moment. If you encounter any difficulties during the training process, feel free to ask for assistance.

bharath5673 commented 8 months ago

Display the object's length, width, and height on the final recognition.

Once you have all the corner coordinates, calculate the Euclidean distances between the points, establish a scale factor that maps pixel measurements to real-world measurements by doing camera calibration, apply the rescale factor if required, and then display the result.

Since I'm adding functionality based on the code you posted: in the original code I seem to see code that converts 2D vehicle images into 3D information: dim = [prediction[0][0]]

Does this hold the model's estimate of the object's size? Also, based on your answer, I don't quite understand these rescale factors.

No, it's not. Rescaling means applying the established scale factor to convert pixel measurements of objects in the image to real-world measurements. This involves multiplying the pixel measurements by the appropriate scale factor to obtain lengths, widths, and heights in real-world units.
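As a rough illustration of where such a scale factor could come from, here is a sketch of two common options: measuring a reference object of known size, or using the pinhole camera relation (real size = pixel size × depth / focal length in pixels). The function names and numbers are hypothetical and not part of the YOLOv8-3D code. A single factor is only valid for objects at roughly the same distance from the camera as the one it was derived for.

```python
def scale_from_reference(known_size_m, measured_px):
    """Metres per pixel, from an object of known size at a similar distance."""
    if measured_px <= 0:
        raise ValueError("measured_px must be positive")
    return known_size_m / measured_px

def scale_from_pinhole(depth_m, focal_length_px):
    """Metres per pixel at a given depth under the pinhole model: Z / f."""
    if focal_length_px <= 0:
        raise ValueError("focal_length_px must be positive")
    return depth_m / focal_length_px

# Example values, invented for illustration only.
scale_a = scale_from_reference(known_size_m=1.8, measured_px=90.0)
scale_b = scale_from_pinhole(depth_m=14.0, focal_length_px=700.0)

# Rescale a pixel measurement into metres with either factor.
edge_px = 210.0
print(f"{edge_px * scale_a:.2f} m, {edge_px * scale_b:.2f} m")
```

The pinhole variant needs the focal length from camera calibration and a depth estimate for the object, which is why calibration data matters here.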

upper127 commented 8 months ago

Do you mean that I should determine the proportionality between the pixel size and the actual size, and then multiply the pixel size predicted by the model by this proportionality to arrive at the size of the actual object?

bharath5673 commented 8 months ago

Exactly! You've got it.

upper127 commented 8 months ago

But now there is a problem: I want to implement the real-size estimation on the code and raw data you provided first. With the approach we discussed above, there seems to be no way to work out the ratio of true size to pixel size for now, unless I build a specialized dataset myself.

AekSW commented 2 months ago

@upper127 How did you get it to output a dimension? Could you share a hint about the code for it?