google-research-datasets / Objectron

Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box, which describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.

Develop 3D Object Detection for (delivery) boxes #43

Open shero1111 opened 3 years ago

shero1111 commented 3 years ago

Hello,

I need to develop a solution to detect and track boxes, such as delivery boxes.

How do I start? I read that there is already relevant data in the dataset, but how do I build on that to detect boxes in 3D?

Example: [image]

I really appreciate your help.

Thank you.

ahmadyan commented 3 years ago

To start, you'll need annotated data. There are annotated cereal boxes in the dataset, but you'll need to collect your own data and annotate it, then you can train the models for this purpose.
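For anyone picking this up later: the annotated samples mentioned above are served from the public `gs://objectron` bucket over HTTPS. Below is a minimal download sketch for the cereal-box split; the index and file paths follow the repo README and tutorial notebooks, so verify the exact names there before relying on them.

```python
import requests

# Public HTTPS mirror of the gs://objectron bucket (see the repo README).
BASE = "https://storage.googleapis.com/objectron"

# Index of annotated cereal-box videos, one sample ID per line
# (e.g. "cereal_box/batch-1/0"). Path assumed from the README.
index_url = f"{BASE}/v1/index/cereal_box_annotations_train"
sample_ids = requests.get(index_url).text.strip().splitlines()

sample_id = sample_ids[0]  # first annotated video in the split
files = {
    "video.MOV": f"{BASE}/videos/{sample_id}/video.MOV",          # raw video
    "geometry.pbdata": f"{BASE}/videos/{sample_id}/geometry.pbdata",  # AR session metadata
    "annotation.pbdata": f"{BASE}/annotations/{sample_id}.pbdata",    # 3D bounding-box labels
}

for name, url in files.items():
    with open(name, "wb") as f:
        f.write(requests.get(url).content)
```

The annotation `.pbdata` files are protocol buffers; the schema and parsing examples live in the repo's `objectron/schema` and notebook tutorials.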

shero1111 commented 3 years ago

Do I have to develop my own model for that, or can I use an existing one?

Where in the documentation or on the MediaPipe site should I start if I need to develop a model?

For example, on https://google.github.io/mediapipe/solutions/objectron.html I see four different Objectron models (for shoes, cameras, etc.), but I want to develop one for boxes. How do I create such a model, and where should I start if that is necessary?

Thank you very much in advance!

ahmadyan commented 3 years ago

We haven't released the training code for the models yet, so you have to implement your own model.

shero1111 commented 3 years ago

I understand.

Could you tell me where to start to create a model?

Thank you very much in advance.

ahmadyan commented 3 years ago

A good starting point would be the TensorFlow tutorials (https://www.tensorflow.org/tutorials); next, you can look at the source code of relevant models on GitHub: https://paperswithcode.com/task/6d-pose-estimation

jianingwei commented 3 years ago

You can also refer to Sec 5.2 of our paper https://arxiv.org/pdf/2012.09988.pdf for our models.
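For context, Sec 5.2 of the paper describes (as I read it) a two-stage pipeline: a 2D detector crops the object, a second network regresses the 2D projections of the box keypoints, and EPnP lifts them to a 3D box using the camera intrinsics from the AR metadata. A rough Keras sketch of such a keypoint regressor is below; the input size, backbone, and head layout are assumptions, not the released architecture.

```python
import tensorflow as tf

def build_keypoint_regressor(input_size=224, num_keypoints=9):
    """Sketch of a second-stage regressor: a cropped object image in,
    normalized 2D coordinates of 9 keypoints (box center + 8 vertices) out."""
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=(input_size, input_size, 3),
        include_top=False,
        weights="imagenet")
    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    # One (x, y) pair per keypoint, normalized to [0, 1].
    keypoints = tf.keras.layers.Dense(num_keypoints * 2, activation="sigmoid")(x)
    return tf.keras.Model(backbone.input, keypoints)

model = build_keypoint_regressor()
model.compile(optimizer="adam", loss="mse")
# model.fit(crops, keypoint_targets, ...) on your own annotated boxes.
```

The predicted 2D keypoints would then be lifted to a 3D bounding box with EPnP and the per-frame camera intrinsics, as the paper describes.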


shero1111 commented 3 years ago

> To start, you'll need annotated data. There are annotated cereal boxes in the dataset, but you'll need to collect your own data and annotate it, then you can train the models for this purpose.

How do I annotate the data (videos) for Objectron? Unfortunately, I read that the annotation tool has not been released, so we cannot use it...

Any ideas?

FPerezHernandez92 commented 2 years ago

I have the same question. I would like to train a model on a new object, but I don't know how to proceed. What should I do? Thank you very much for everything.

xiezhangxiang commented 2 years ago

Hello, I have the same question. Did you solve it? I use the following code and get results I can't interpret; I don't know how to get the 2D keypoints:

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

image = cv2.imread(img_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (480, 640))
image = image / 255.
images = [_normalize_image(image)]
images = np.asarray(images)

model = load_model(model_filename, custom_objects={'loss': loss})
preds = model.predict(images)
print(preds)
```

`preds` is a list of two arrays: one of shape `(1, 160, 120, 1)` containing small probability-like values, and one of shape `(1, 160, 120, 16)` with values roughly in `[-0.7, 0.7]` (full dump omitted). How can I get the keypoints from these outputs?
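Those shapes suggest the usual two-head layout for these models: a 160x120 center heatmap plus 16 regression channels holding (dx, dy) displacements from the center to the 8 box vertices. Assuming that convention (it is an assumption; the channel ordering and offset units must be checked against the model actually trained), a decoding sketch would be:

```python
import numpy as np

def decode_keypoints(heatmap, displacements):
    """Sketch: heatmap peak = object center; the 16 regression channels at
    that pixel = (dx, dy) offsets to the 8 box vertices, in heatmap-grid
    coordinates. Verify ordering and units against your own model."""
    hm = heatmap[0, :, :, 0]                              # (160, 120)
    cy, cx = np.unravel_index(np.argmax(hm), hm.shape)    # row, col of the peak
    offsets = displacements[0, cy, cx, :].reshape(8, 2)   # 8 vertices x (dx, dy)
    vertices = np.array([cx, cy]) + offsets               # still in grid coords
    # Scale from the 120x160 output grid back to the 480x640 input image.
    scale = np.array([480 / 120.0, 640 / 160.0])
    return np.array([cx, cy]) * scale, vertices * scale

center_2d, keypoints_2d = decode_keypoints(preds[0], preds[1])
print(center_2d, keypoints_2d)
```

The recovered 2D keypoints can then be lifted to a 3D box with EPnP and the camera intrinsics, as described in the paper.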

HripsimeS commented 1 year ago

> We haven't released the training code for the models yet, so you have to implement your own model.

@ahmadyan Hello. Has the training code for the models been released since this comment in June 2021?

XinyueZ commented 10 months ago

Same question here. Where is the training code?

tranhogdiep commented 8 months ago

Sorry, but any news on the training code?