microsoft / webnn-developer-preview


Segment Anything: Clicking left mouse button causes SAM decoder model to be recreated/reloaded #42

Closed indygit closed 1 month ago

indygit commented 1 month ago

After opening the SAM demo and clicking the left mouse button on the image, the following messages appear in the log window:

[09:24:55] [Load] ONNX Runtime Execution Provider: webnn
[09:24:55] [Load] ONNX Runtime EP device type: gpu
[09:24:55] [Load] Loading SAM ViT-B Decoder (FP16) · 15.7MB
[09:24:55] [Load] SAM ViT-B Decoder (FP16) load time: 31.90ms
[09:24:55] [Session Create] Creating SAM ViT-B Decoder (FP16)
[09:24:55] [Session Create] SAM ViT-B Decoder (FP16) create time: 517.30ms

Is reloading the decoder model necessary?

Also, once the decoder is reloaded, segmentation doesn't work as well as when the SAM page is initially loaded. (I am using a Windows 11 PC with an NVIDIA card.)

Should we disable the decoder reloading?

======================================================================

I know we can hover the mouse to trigger segmentation inference, and that clicking the left mouse button saves the segmented image (a very useful feature).

fdwr commented 1 month ago

Opening SAM demo, click left mouse button on the image ... Is reloading the decoder model necessary?

@ibelem - Is this behavior intentional? 🤔 https://microsoft.github.io/webnn-developer-preview/demos/segment-anything/

ibelem commented 1 month ago


@indygit @fdwr This is known and intentional behavior.

This demo supports segmentation with multiple point coordinates. Whenever the number of points changes, we need to update num_points and re-create the ORT session for the WebNN execution provider, because WebNN doesn't support dynamic-shape models. CC @Honry

Please refer to https://github.com/microsoft/webnn-developer-preview/blob/main/demos/segment-anything/index.js#L223C12-L224C60
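
For readers following along, the re-creation described above might look roughly like the sketch below. It is not copied from the demo's index.js. It assumes the decoder's dynamic dimension is named `num_points` (as in the comment above) and that it is pinned through ONNX Runtime Web's `freeDimensionOverrides` session option. The helpers `buildDecoderSessionOptions` and `createDecoderSessionManager` are hypothetical names introduced only for illustration, and `ort` stands for the onnxruntime-web module supplied by the caller.

```javascript
// Build session options that pin the decoder's dynamic "num_points"
// dimension to a concrete value, since WebNN needs static shapes.
// Assumption: the model's free dimension is literally named "num_points".
function buildDecoderSessionOptions(numPoints, deviceType = "gpu") {
  if (!Number.isInteger(numPoints) || numPoints < 1) {
    throw new RangeError("numPoints must be a positive integer");
  }
  return {
    executionProviders: [{ name: "webnn", deviceType }],
    freeDimensionOverrides: { num_points: numPoints },
  };
}

// Hypothetical helper: returns a function that re-creates the decoder
// session only when the requested point count differs from the last one.
// `ort` is the onnxruntime-web module; `modelBuffer` holds the model bytes.
function createDecoderSessionManager(ort, modelBuffer) {
  let currentNumPoints = null;
  let sessionPromise = null;

  return function getSession(numPoints) {
    if (numPoints !== currentNumPoints) {
      currentNumPoints = numPoints;
      // A new session is required because the input shape has changed.
      sessionPromise = ort.InferenceSession.create(
        modelBuffer,
        buildDecoderSessionOptions(numPoints),
      );
    }
    return sessionPromise;
  };
}
```

With a helper like this, calling `getSession(1)` twice would reuse the same session, while `getSession(2)` after an extra click would trigger the "Session Create" step seen in the log.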

indygit commented 1 month ago

Thanks, Belem. That's a good lesson for me.