Saikumar-gadde517 opened 7 months ago
One thing to check is whether the JSON.parse(fileEmbeddings) part is completing successfully (I'm not sure if the error is occurring during or after that). It's possible that the embedding wasn't saved in a JSON-compatible way, and therefore can't be loaded/parsed properly.
The other (related) suggestion would be to check the data type of the image embedding input. In the feed variable, every entry is an ort.Tensor, except the embedding, which is some JSON-compatible type (since it comes from JSON.parse(...)). It seems likely that the embedding needs to be formatted as a tensor as well, and that may be the (indirect) cause of the error message.
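As a concrete check on the first suggestion, here's a minimal sketch (in Python, since that's where the embeddings are created; the file name, the small demo shape, and the typical (1, 256, 64, 64) SAM embedding shape are assumptions) of saving an embedding in a JSON-compatible way and verifying it round-trips:

```python
import json

import numpy as np

# Small stand-in for the real SAM encoder output; the actual image
# embedding is typically float32 with shape (1, 256, 64, 64).
embedding = np.random.rand(1, 4, 8, 8).astype(np.float32)

# Save in a JSON-compatible form: a flat list of floats plus the shape,
# so the loader doesn't have to guess the dimensions.
payload = {"shape": list(embedding.shape), "data": embedding.ravel().tolist()}
with open("embedding.json", "w") as f:
    json.dump(payload, f)

# Reload and verify the round trip before handing the data to the decoder.
with open("embedding.json") as f:
    loaded = json.load(f)

restored = np.asarray(loaded["data"], dtype=np.float32).reshape(loaded["shape"])
assert restored.shape == embedding.shape
assert np.allclose(restored, embedding)

# On the react-native side, the parsed JSON list still needs to be wrapped
# as a tensor before going into the decoder feed, along the lines of:
#   new ort.Tensor('float32', Float32Array.from(parsed.data), parsed.shape)
```

If the round trip fails on the Python side already, the file format is the problem; if it succeeds, the tensor-wrapping step on the react-native side is the more likely culprit.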
Hi @heyoeyo, I tried to do this in React Native. I created the embeddings in a Python runtime, sent those embeddings to React Native, and ran the decoder with them, which gave me an output mask. But when I displayed the mask using a Python script, it looked very weird.
Original image:
Mask:
After running the decoder, I saved the output locally as a JSON text file, read the masks key, and displayed it to get the above result. Am I missing something here? Do I need to do any further post-processing to get a binary mask? Any help is appreciated.
Hi @CriusFission, there may be a few things off here, but it's hard to say for sure.
The main thing that stands out as strange is the size of your mask: it looks to be something like 1024x700 pixels, whereas the input image is 480x640 (?). Following the SAM mask sizing is confusing because there are a bunch of steps, but I'll try to list it out...
One (or more) of these steps seems to have gone wrong here, because the mask isn't the right size and the padding is still visible. My best guess would be that some height/width values got swapped somewhere (step 3 most likely?), since the aspect ratio of the mask is flipped compared to the original input. If you're using the original SAM code, it's probably worth dropping a bunch of print(masks.shape) statements throughout the postprocess_masks function to try to see what's going on.
Aside from that, the mask image looks like the raw output from the model. The original SAM model doesn't output a binary mask directly; instead, the binary mask comes from the final processing step, where the mask gets thresholded (pixels > 0). That step is definitely missing here, so that's something to try adding in.
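Assuming the raw mask values are available as a numpy array (the variable names here are hypothetical), the missing thresholding step is just a comparison against zero:

```python
import numpy as np

# raw_mask stands in for the decoder's raw output (logits), loaded
# however you got it (e.g. from your saved JSON file)
raw_mask = np.array([[-2.0, 0.5],
                     [ 1.3, -0.1]], dtype=np.float32)

# SAM's final step: pixels with logit > 0 belong to the mask
binary_mask = raw_mask > 0
# → [[False, True], [True, False]]

# For saving/display as an image, scale the booleans to 0/255 uint8
mask_image = (binary_mask * 255).astype(np.uint8)
```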
One last concern is that the mask is very low-contrast, which means the segmentation mask (even after thresholding) wouldn't have been anything meaningful (maybe the top-left corner...?). That sort of suggests that the input prompt may not be formatted correctly (unless you just put in a (0,0) point for testing, which would make sense), since nothing seems to be selected. So it's probably worth double-checking that the input prompts are formatted/scaled correctly. If width/height got swapped somewhere, the prompt coordinates may have been swapped as well, which could have placed the prompt outside the image.
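As a quick sanity check on that front, prompt points need the same longest-side scaling as the image. A sketch (assuming the 1024-pixel SAM input size and (x, y) coordinate order):

```python
def scale_prompt(x, y, orig_h, orig_w, long_side=1024):
    """Map a prompt point from original-image coordinates into the
    model's input coordinates, using the same scale as the image resize."""
    scale = long_side / max(orig_h, orig_w)
    return (x * scale, y * scale)

# For a 480x640 (h, w) image, a point at (320, 240) maps to (512, 384).
# If h and w were swapped when computing the scale (1024/480 instead of
# 1024/640), the same point would land at roughly (682.7, 512), which can
# push prompts outside the valid (unpadded) image region.
pt = scale_prompt(320, 240, orig_h=480, orig_w=640)
```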
I have the encoder image_embeddings in a text file in my root project directory. When I try to read the text file with the encoder embeddings, react-native is able to read the file. But if I pass the data to the decoder model, it's not reading the image_embeddings from the text file at all. It's returning this error: Cannot read property 'buffer' of undefined.
Here is my code, please check it.