axinc-ai / ailia-models

The collection of pre-trained, state-of-the-art AI models for ailia SDK

Implement Grounded-SAM #1507

Closed ooe1123 closed 2 months ago

ooe1123 commented 2 months ago

https://github.com/axinc-ai/ailia-models/issues/1493

kyakuno commented 2 months ago

The model has been uploaded. https://storage.googleapis.com/ailia-models/grounded-sam/sam_vit_h_4b8939.onnx

kyakuno commented 2 months ago

@ooe1123 Thank you for adding the model! It would be great if you could also add it to the top-level README.md and to scripts/download_all_models.sh.

kyakuno commented 2 months ago

Also, it looks like the download code for the SAM pb needs to be added.
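
A download step of this kind could look roughly like the following. This is a minimal sketch using only the Python standard library, not the repository's actual helper. `REMOTE_PATH` and the weight file name come from the upload link earlier in this thread; `model_url`, `download_if_missing`, and the injectable `fetch` parameter are hypothetical names introduced for illustration.

```python
import os
import urllib.request

# Base URL and weight file name taken from the upload link in this thread.
REMOTE_PATH = "https://storage.googleapis.com/ailia-models/grounded-sam/"
WEIGHT_SAM_PATH = "sam_vit_h_4b8939.onnx"


def model_url(file_name, remote_path=REMOTE_PATH):
    """Return the download URL for a model file (hypothetical helper)."""
    return remote_path.rstrip("/") + "/" + file_name


def download_if_missing(file_name, dest_dir=".", fetch=urllib.request.urlretrieve):
    """Download file_name into dest_dir unless it already exists.

    `fetch` is injectable so the logic can be exercised without network access.
    Returns the local path of the file.
    """
    dest = os.path.join(dest_dir, file_name)
    if not os.path.exists(dest):
        fetch(model_url(file_name), dest)
    return dest
```

Calling `download_if_missing(WEIGHT_SAM_PATH)` before the net is created would fetch the weight only on the first run.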

kyakuno commented 2 months ago

With ailia SDK, the following error occurs.

  File "/Users/kyakuno/Desktop/ailia/ailia-models-ax/image_segmentation/grounded_sam/grounded_sam.py", line 285, in segment
    output = net.predict([img, box, input_size, original_size])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ailia/wrapper.py", line 400, in predict
    self.update()
  File "/usr/local/lib/python3.11/site-packages/ailia/wrapper.py", line 716, in update
    core.check_error(code, self.__net)
  File "/usr/local/lib/python3.11/site-packages/ailia/core.py", line 909, in check_error
    raise e(detail)
ailia.core.AiliaInvalidLayerException: code: -10 (Incorrect layer parameter. [broken or unsupported AI model file])
+ error detail : Layer:/bert/embeddings/Add_output_0(Eltwise) Error:Unacceptable broadcast size pair [output:(1,1,4,768) (stride:(3072,3072,768,1)) vs input /bert/embeddings/token_type_embeddings/Gather_output_0:(2,768) (stride:(768,1))].
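
The two shapes in the error detail can be checked against the usual NumPy/ONNX-style broadcasting rules. The sketch below is pure Python and assumes ailia SDK applies comparable rules to its Eltwise layer; `broadcast_shape` is a hypothetical helper, not part of ailia SDK.

```python
def broadcast_shape(a, b):
    """Return the NumPy/ONNX-style broadcast of two shapes, or raise ValueError."""
    result = []
    # Compare dimensions from the trailing axis; missing leading axes count as 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(
                f"cannot broadcast {a} with {b}: {da} vs {db} at axis -{i}"
            )
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


# Shapes copied from the error detail above.
output_shape = (1, 1, 4, 768)
token_type_shape = (2, 768)

try:
    broadcast_shape(output_shape, token_type_shape)
except ValueError as err:
    print(err)  # the sizes 4 and 2 on the second-to-last axis do not match
```

Under these rules the pair fails because neither 4 nor 2 is 1 on the second-to-last axis, which suggests the layer received an input it was not built for.
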

kyakuno commented 2 months ago

The arguments of `net` below are wrong: it is constructed from the same Grounding DINO paths as `grounding_dino`.

        grounding_dino = ailia.Net(MODEL_GDINO_PATH, WEIGHT_GDINO_PATH, env_id=env_id)
        net = ailia.Net(MODEL_GDINO_PATH, WEIGHT_GDINO_PATH, env_id=env_id)
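
A corrected construction might look like the sketch below, where each net gets its own model/weight pair. The constant values are placeholders: only `sam_vit_h_4b8939.onnx` is named in this thread, and `MODEL_SAM_PATH`, `WEIGHT_SAM_PATH`, `load_nets`, and the `net_factory` parameter are hypothetical names. The factory is passed in so the sketch runs without ailia SDK installed.

```python
# Placeholder file names (assumptions); only the SAM weight name appears in this thread.
MODEL_GDINO_PATH = "groundingdino.onnx.prototxt"
WEIGHT_GDINO_PATH = "groundingdino.onnx"
MODEL_SAM_PATH = "sam_vit_h_4b8939.onnx.prototxt"
WEIGHT_SAM_PATH = "sam_vit_h_4b8939.onnx"


def load_nets(net_factory, env_id):
    """Create one net per model, each from its own model/weight pair."""
    grounding_dino = net_factory(MODEL_GDINO_PATH, WEIGHT_GDINO_PATH, env_id=env_id)
    # The second net uses the SAM paths instead of repeating the Grounding DINO ones.
    net = net_factory(MODEL_SAM_PATH, WEIGHT_SAM_PATH, env_id=env_id)
    return grounding_dino, net
```

With ailia SDK available, this would be called as `load_nets(ailia.Net, env_id)`.
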

kyakuno commented 2 months ago

Applied memory-saving mode. Memory usage is about 5 GB.
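
Enabling a memory-saving mode in ailia SDK's Python API typically means building a mode with `ailia.get_memory_mode` and passing it to `ailia.Net` as `memory_mode`. Which flags this change actually sets is not stated in the thread, so the combination below is an assumption. `create_low_memory_net` and its `ailia_module` parameter are hypothetical; the module is passed in so the sketch can run without the SDK.

```python
def create_low_memory_net(ailia_module, model_path, weight_path, env_id):
    """Create a net with a memory-saving mode (flag choice is an assumption)."""
    memory_mode = ailia_module.get_memory_mode(
        reduce_constant=True,                # release constant buffers when possible
        ignore_input_with_initializer=True,  # treat initialized inputs as constants
        reduce_interstage=False,             # keep intermediate buffers between runs
        reuse_interstage=True,               # share intermediate buffers across layers
    )
    return ailia_module.Net(
        model_path, weight_path, env_id=env_id, memory_mode=memory_mode
    )
```

With the SDK installed, the first argument would be the imported `ailia` module itself.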