google-research-datasets / Objectron

Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box that describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.

How to get the trained model for 'book' and 'cereal_box' category through mediapipe python API? #53

Open DeriZSY opened 2 years ago

DeriZSY commented 2 years ago

Hi, I'm a researcher working on a paper related to Object 6D Pose estimation. The proposed method in Objectron is an important baseline method for us so we do hope to compare our method with the proposed method in Objectron on our own dataset.

However, the models for 'book' and 'cereal_box' are not available on mediapipe python API. Is there any method for us to obtain the model for these two categories?

ahmadyan commented 2 years ago

You can download the models from the objectron bucket on GCS, under objectron/models. Example (if you have gsutil; requires authentication):

gsutil ls gs://objectron/model
gsutil cp -r gs://objectron/model local_dataset_dir

or directly via http: https://storage.googleapis.com/objectron/models/objectron_mesh2_cvpr/book.hdf5, etc.
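The direct HTTP route can be scripted. The following is a minimal sketch, assuming that every category follows the same `<category>.hdf5` naming as the `book.hdf5` URL above; the helper names `model_url` and `download_model` are hypothetical, and the file layout of the other model folder is not confirmed in this thread.

```python
import shutil
import urllib.request
from pathlib import Path

# Base URL taken from the link quoted in this thread.
BASE_URL = "https://storage.googleapis.com/objectron/models"


def model_url(category, family="objectron_mesh2_cvpr"):
    """Build the download URL for one category (naming scheme is an assumption)."""
    return f"{BASE_URL}/{family}/{category}.hdf5"


def download_model(category, dest_dir=".", family="objectron_mesh2_cvpr"):
    """Stream the .hdf5 file to dest_dir and return the local path."""
    dest = Path(dest_dir) / f"{category}.hdf5"
    with urllib.request.urlopen(model_url(category, family)) as response:
        with open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest
```

Calling `download_model("book")` would then fetch the file into the current directory, provided the URL is reachable from your network.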

DeriZSY commented 2 years ago

Thanks for the reply. How should I load the model weights, then? Are there any possible hacks for me to use them with the mediapipe API?

xiezhangxiang commented 2 years ago

Hello, I have the same question. Did you solve it? I used the following code and got weird results, and I don't know how to get the 2D keypoints:

`image = cv2.imread(img_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (480, 640))
image = image / 255.
images = [_normalize_image(image)]

images = np.asarray(images)
model = load_model(model_filename, custom_objects={'loss': loss})
preds = model.predict(images)

print(preds)
[array([[[[9.6562573e-05], [1.3071414e-05], [4.1917715e-06], ..., [9.8206790e-12], [9.0398376e-11], [7.6584143e-07]],

    [[2.6079666e-05],
     [3.9501720e-06],
     [1.8212555e-10],
     ...,
     [2.8078556e-15],
     [1.5476401e-19],
     [8.4229308e-09]],

    [[2.4633331e-05],
     [9.5524400e-10],
     [1.3348876e-11],
     ...,
     [1.4640141e-18],
     [1.6366661e-21],
     [3.6391362e-11]],

    ...,

    [[1.3883292e-06],
     [1.1347703e-08],
     [6.0462207e-10],
     ...,
     [2.4876311e-08],
     [3.1831805e-11],
     [9.7017306e-08]],

    [[4.0202780e-05],
     [5.1800769e-10],
     [1.1579547e-09],
     ...,
     [9.0847864e-13],
     [9.1428205e-11],
     [3.4593342e-10]],

    [[1.0546885e-04],
     [1.2635512e-05],
     [1.5344558e-06],
     ...,
     [8.4130400e-09],
     [2.6183332e-11],
     [2.0492683e-09]]]], dtype=float32), array([[[[-0.10607169,  0.36145103, -0.11862978, ...,  0.03101592,
       0.30299583,  0.00596629],
     [-0.18905693,  0.46546325, -0.26854637, ..., -0.06751684,
       0.42021263, -0.18807213],
     [-0.21941239,  0.41301575, -0.26544824, ...,  0.04859204,
       0.40403038, -0.09107076],
     ...,
     [-0.17547359,  0.3736801 , -0.04492063, ...,  0.06182917,
      -0.21378447, -0.03202537],
     [-0.26361176,  0.36289865, -0.18332383, ...,  0.16499005,
      -0.09499758, -0.12895563],
     [-0.24102461,  0.25801325, -0.17738084, ...,  0.11746432,
      -0.16958712,  0.13721858]],

    [[-0.21957912,  0.32535398, -0.23164174, ..., -0.2085964 ,
       0.43684924, -0.27276033],
     [-0.15121302,  0.3573573 , -0.20246796, ..., -0.10501267,
       0.5066237 , -0.11706068],
     [-0.17524916,  0.3559658 , -0.18497112, ..., -0.1335241 ,
       0.53169703, -0.18370274],
     ...,
     [-0.26286513,  0.30809528, -0.1212045 , ..., -0.08777827,
      -0.13896506, -0.17987725],
     [-0.25899106,  0.33262596, -0.08751082, ..., -0.02343384,
      -0.3164396 , -0.18116182],
     [-0.22164974,  0.23702136, -0.20336536, ..., -0.06228844,
      -0.18289375, -0.30683076]],

    [[-0.16058055,  0.32249534, -0.17511356, ..., -0.13031082,
       0.4542202 , -0.22487643],
     [-0.15311602,  0.3490243 , -0.17877994, ..., -0.11121193,
       0.50228304, -0.17089653],
     [-0.20514728,  0.3469826 , -0.18969603, ..., -0.11347326,
       0.5460528 , -0.16435972],
     ...,
     [-0.36025456,  0.4073612 , -0.01529002, ...,  0.24054597,
      -0.38046253,  0.14016253],
     [-0.37262747,  0.4091622 , -0.10438414, ...,  0.36949152,
       0.19607303,  0.03621448],
     [-0.28537005,  0.24178793, -0.12843539, ...,  0.11386134,
      -0.38351035,  0.27503756]],

    ...,

    [[-0.08681132, -0.05887846, -0.01539195, ..., -0.36459795,
       0.5349943 , -0.25741568],
     [-0.04578761, -0.05969733, -0.00410217, ..., -0.41354814,
       0.6133671 , -0.2914826 ],
     [-0.06978828, -0.0289972 ,  0.01747608, ..., -0.423895  ,
       0.5479816 , -0.32753658],
     ...,
     [-0.2598699 ,  0.20992802, -0.04680583, ..., -0.43057957,
       0.15357617, -0.53516096],
     [-0.33677104,  0.20362546, -0.09578266, ..., -0.4407214 ,
       0.04547567, -0.5529746 ],
     [-0.4277043 ,  0.19496255, -0.18552476, ..., -0.42837453,
       0.01995449, -0.4375854 ]],

    [[-0.03453992, -0.05292309,  0.00213689, ..., -0.50154454,
       0.6197945 , -0.39903948],
     [-0.03441546, -0.08145237, -0.04914407, ..., -0.4739752 ,
       0.5260091 , -0.33690655],
     [-0.04759429, -0.08588249, -0.04430763, ..., -0.46352687,
       0.53554165, -0.31229335],
     ...,
     [-0.3086209 ,  0.15528192, -0.14666194, ..., -0.46730536,
       0.13626733, -0.5117987 ],
     [-0.37810522,  0.17945792, -0.2264315 , ..., -0.44889984,
       0.17014027, -0.4020097 ],
     [-0.48893178,  0.22216477, -0.34320357, ..., -0.57811224,
      -0.18882565, -0.39809525]],

    [[-0.07705554, -0.21781273,  0.0330582 , ..., -0.38549614,
       0.6696893 , -0.17962183],
     [-0.04036303, -0.19197614, -0.05262863, ..., -0.43213007,
       0.46479934, -0.32706207],
     [-0.09982854, -0.22474429, -0.06387011, ..., -0.39725167,
       0.3695163 , -0.24147348],
     ...,
     [-0.2948659 ,  0.10649519, -0.16847448, ..., -0.4088996 ,
       0.07583192, -0.3535105 ],
     [-0.33526367,  0.16336042, -0.26918498, ..., -0.6608317 ,
      -0.21164288, -0.4696032 ],
     [-0.5637162 ,  0.04995263, -0.39664903, ..., -0.57493746,
       0.04123268, -0.45364913]]]], dtype=float32)]

(1, 160, 120, 1) (1, 160, 120, 16)`

How can I get the keypoints?

ahmadyan commented 2 years ago

It looks like you are using the older version of our models (mobile-pose), which predicts heatmaps: 1 center heatmap and 16 heatmaps for the x-y displacements of 8 keypoints. The more recent version (mesh2) should output a 2x9 matrix containing the x, y coordinates of each keypoint.

Example code: https://github.com/google-research-datasets/Objectron/blob/c06a65165a18396e1e00091981fd1652875c97b5/objectron/dataset/eval.py#L110
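For the older mobile-pose output, the description above (one center heatmap plus 16 displacement channels for 8 keypoints) suggests a decoding along the following lines. This is a minimal single-object sketch using NumPy only, not the official decoder: the function name `decode_single_object`, the interleaved (dx, dy) channel order, and the idea that displacements are measured in heatmap pixels are all assumptions to verify against the official code.

```python
import numpy as np


def decode_single_object(center_heatmap, displacements):
    """Decode one object from mobile-pose style outputs.

    center_heatmap: array of shape (H, W, 1)
    displacements:  array of shape (H, W, 16)
    Returns (keypoints, peak, score), where keypoints has shape (8, 2)
    holding (x, y) positions in heatmap coordinates.
    """
    heat = center_heatmap[..., 0]
    # Location of the strongest center response.
    row, col = np.unravel_index(np.argmax(heat), heat.shape)
    # ASSUMPTION: channels are ordered (dx0, dy0, dx1, dy1, ...).
    offsets = displacements[row, col].reshape(8, 2)
    # ASSUMPTION: offsets are in heatmap pixels relative to the peak.
    keypoints = offsets + np.array([col, row], dtype=offsets.dtype)
    return keypoints, (int(row), int(col)), float(heat[row, col])
```

With the prediction shown earlier, this would be called as `decode_single_object(preds[0][0], preds[1][0])`. If the displacements turn out to be normalized rather than pixel-based, a scale factor would have to be applied before adding them to the peak location.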

xiezhangxiang commented 2 years ago

Thank you for your reply. I downloaded the mesh2 model https://storage.googleapis.com/objectron/models/objectron_mesh2_cvpr/cereal_box.hdf5 as you suggested and ran into the following problem: ValueError: Unknown initializer: ConvolutionalInitializer. Please ensure this object is passed to the `custom_objects` argument. Can you provide a simple prediction example? Thank you again!

pherrusa7 commented 2 years ago

Hi @ahmadyan, great work!

After downloading the .hdf5 files, we try to load the weights and hit a custom initializer, ConvolutionalInitializer, whose definition I believe we need in order to use them. Could you please provide this and any other missing blocks?

from tensorflow import keras
model = keras.models.load_model('chair.hdf5')

ValueError: Unknown initializer: ConvolutionalInitializer. Please ensure this object is 
passed to the `custom_objects` argument. 
See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object 
for details.

If we are missing something, or there is another way to read the weights, please let us know. Without this, we can't reproduce the insights of your paper :(

btw, the gsutil you provided above was missing an s. The working command is:

$ gsutil ls gs://objectron/models
gs://objectron/models/objectron_mesh2_cvpr_$folder$
gs://objectron/models/objectron_mobilepose_$folder$
gs://objectron/models/objectron_mesh2_cvpr/
gs://objectron/models/objectron_mobilepose/

Thank you in advance and again, thank you for this amazing work.

pherrusa7 commented 2 years ago

Actually, the best option would be an example of loading these weights: after I filled in ConvolutionalInitializer with a random initializer, DropConnect is now reported as missing. Could you please upload an example that loads any of the provided weights? Thank you!
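Rather than discovering the missing custom objects one error at a time, it may help to list every class name stored in the file up front. The sketch below assumes the .hdf5 file is a full Keras model whose architecture is saved as JSON in the `model_config` attribute (the usual Keras HDF5 layout); the helper names are hypothetical, and `h5py` is a third-party dependency. It does not supply the missing definitions, it only shows which names need them.

```python
import json


def collect_class_names(config):
    """Recursively gather every 'class_name' value in a Keras model config."""
    names = set()
    if isinstance(config, dict):
        name = config.get("class_name")
        if isinstance(name, str):
            names.add(name)
        for value in config.values():
            names |= collect_class_names(value)
    elif isinstance(config, list):
        for item in config:
            names |= collect_class_names(item)
    return names


def class_names_in_h5(path):
    """Read the saved model config from an HDF5 file and list its class names."""
    import h5py  # imported lazily so the helper above works without it

    with h5py.File(path, "r") as f:
        raw = f.attrs["model_config"]  # ASSUMPTION: attribute is present
    if isinstance(raw, bytes):  # older h5py versions return bytes
        raw = raw.decode("utf-8")
    return sorted(collect_class_names(json.loads(raw)))
```

Running `class_names_in_h5('chair.hdf5')` and comparing the result with the classes Keras already ships should show every name, such as ConvolutionalInitializer and DropConnect, that has to be passed through `custom_objects`.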