facebookresearch / segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0

Satellite Imagery (tiny objects) Generalization #51

Open Radhika-Keni opened 1 year ago

Radhika-Keni commented 1 year ago

Thank you for the incredible work & congratulations!

SAM does not seem to generalize as well on satellite imagery (tiny objects). This was the result of the "segment everything" option on the image. However, SAM works better on the same image if I manually prompt the model with an object of interest (such as the tiny aircraft in the LHS corner) that it may have missed in the "segment everything" option. [image: satellite imagery]

A couple more examples with the "segment everything" option:

[image]

[image]

Any insights on this would be most helpful!

AhmedHefnawy commented 1 year ago

Yes, you are right. I think so too, and I am interested in knowing how we can do transfer learning on our custom satellite dataset; I think it would do well. If you have any ideas about transfer learning, please let me know.

zhilonglu commented 1 year ago

The same issue occurs for me. I also find that the demo does not support the TIFF format, and when I try to upload an RS image and choose "Everything" to cut out all objects, there are actually many objects that cannot be detected.

Radhika-Keni commented 1 year ago

> Yes, you are right. I think so too, and I am interested in knowing how we can do transfer learning on our custom satellite dataset; I think it would do well. If you have any ideas about transfer learning, please let me know.

Sure, I'll let you know if I manage to fine-tune SAM on this particular aerial imagery dataset (xView1) that I've used, and share the results. However, it would be great if the SAM team could share insights on whether this is a general issue with small/tiny objects and whether SAM needs fine-tuning for such use cases!

kretes commented 1 year ago

After reading the paper I get the impression that this is one of the limitations.

  1. Limitations: "SAM can miss fine structures"
  2. When preparing the data to train on, masks smaller than 100 pixels were removed (Appendix B, Postprocessing)

It's a promptable model, and "segment everything" probably works on a generated grid of prompt points; if the points miss part of a fine structure (like a plane), it will be missed.

What you might try is something similar to what the authors did when preparing the dataset: run the model multiple times on overlapping, zoomed-in regions. The mask boundaries won't be that crisp, but you have a higher chance of segmenting fine structures, as they will appear bigger in the zoomed-in image. That's a computationally heavy approach, as you need to run SAM from scratch on every zoomed-in patch.
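A minimal sketch of that tiling idea, assuming a downloaded vit_h checkpoint and an RGB uint8 image; the tile size, overlap, and the lack of cross-tile mask deduplication are all simplifications:

```python
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

def segment_tiled(image, tile=1024, overlap=256):
    """Run SAM on overlapping zoomed-in tiles so small objects appear larger.

    Returns (bool_mask, (y, x) tile offset) pairs; merging duplicate masks
    from the overlap regions is left out for brevity.
    """
    h, w = image.shape[:2]
    step = tile - overlap
    results = []
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            patch = image[y:y + tile, x:x + tile]
            for m in mask_generator.generate(patch):
                results.append((m["segmentation"], (y, x)))
    return results
```

Each per-tile mask comes back in patch coordinates; offsetting it by its (y, x) tile origin places it back in the full image.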

Another approach would be to generate prompt points not from the grid but somehow informed by the image, say using a sensitive edge detector; one can sample prompt points such that each connected component has at least one point sampled from it.
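A rough sketch of that edge-informed sampling, using OpenCV's Canny detector and connected components as one (hypothetical) choice of "sensitive edge detector"; the thresholds are guesses, and the resulting normalized points could be fed to SAM via point_grids as discussed later in this thread:

```python
import cv2
import numpy as np

def edge_informed_points(image):
    """Sample one normalized (x, y) prompt point per connected edge component."""
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(gray, 50, 150)             # thresholds are a guess
    n_labels, labels = cv2.connectedComponents(edges)
    h, w = gray.shape
    points = []
    for i in range(1, n_labels):                 # label 0 is the background
        ys, xs = np.nonzero(labels == i)
        j = len(xs) // 2                         # any pixel of the component will do
        points.append((xs[j] / w, ys[j] / h))    # normalize to [0, 1]
    return np.asarray(points, dtype=np.float32)
```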

Radhika-Keni commented 1 year ago

> After reading the paper I get the impression that this is one of the limitations. [...] What you might try is something similar to what the authors did when preparing the dataset: run the model multiple times on overlapping, zoomed-in regions. [...] Another approach would be to generate prompt points not from the grid but somehow informed by the image.

@kretes: Thank you, this was very insightful! A follow-up thought based on the insights you shared: I wonder whether simply adding more grid points to the preset/fixed grid, for images with fine/tiny objects, would help alleviate the problem by reducing the probability of missing a tiny object. It could be a parameter changed at inference time depending on whether it is a regular image or one containing tiny objects, so it does not become overkill for regular images with regular object sizes. So: a denser (but still preset/fixed) grid for all images with tiny objects, whose dimensions remain constant across all such images, so no per-image changes are needed (see the sketch after the screenshot below).

Image: the screenshot below, from their website, shows the first step that the "segment everything" option performs, which is setting the grid (as @kretes mentioned earlier in this thread). Based on a visual inspection, there seems to be scope to make this grid denser!

[image]
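For reference, the grid density asked about here is already an inference-time parameter in the codebase; a minimal sketch, assuming a vit_h checkpoint (64 is just an example of a denser-than-default value):

```python
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")

# The default is points_per_side=32, i.e. a 32x32 grid of prompt points.
# Doubling it to 64 gives a 4x denser grid at roughly 4x the decoding cost,
# so it makes sense to raise it only for images with tiny objects.
dense_generator = SamAutomaticMaskGenerator(sam, points_per_side=64)
# masks = dense_generator.generate(image)  # image: HxWx3 uint8 RGB array
```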

Bencpr commented 1 year ago

Hi @Radhika-Keni, thanks for your post. I'm looking for the "segment everything" option that you mentioned, and can't find it.

Radhika-Keni commented 1 year ago

Hope this helps! [image]

MyDatasheet commented 1 year ago

Hi. I think SAM cannot segment satellite images precisely.

Geoyi commented 1 year ago

Would love to know the fine-tuning workflow too. @MyDatasheet, you'd be surprised; the results are quite impressive, but there are some edge cases that would benefit from fine-tuning.

Havi-muro commented 1 year ago

The resulting segments are a list of dictionaries. Does anyone know how to transform that into a georeferenced image?

binol13 commented 1 year ago

> The resulting segments are a list of dictionaries. Does anyone know how to transform that into a georeferenced image?

The following repo might be helpful: https://github.com/aliaksandr960/segment-anything-eo/blob/main/README.md
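For the georeferencing question itself: a minimal sketch, assuming rasterio is installed, that the imagery was read from a GeoTIFF carrying a CRS and affine transform, and that `masks` is the list of dictionaries returned by SamAutomaticMaskGenerator.generate (each with a boolean "segmentation" array):

```python
import numpy as np
import rasterio

def masks_to_geotiff(masks, src_path, out_path):
    """Burn SAM masks into a single-band label GeoTIFF georeferenced like the source."""
    with rasterio.open(src_path) as src:
        profile = {
            "driver": "GTiff", "height": src.height, "width": src.width,
            "count": 1, "dtype": "uint16", "crs": src.crs, "transform": src.transform,
        }
        labels = np.zeros((src.height, src.width), dtype=np.uint16)

    for i, m in enumerate(masks, start=1):   # 0 stays background
        labels[m["segmentation"]] = i

    with rasterio.open(out_path, "w", **profile) as dst:
        dst.write(labels, 1)
```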

giswqs commented 1 year ago

I just released the segment-geospatial Python package, making it easier to segment satellite imagery and export the results in various vector formats.

https://user-images.githubusercontent.com/5016453/233663689-f1b7ceb3-782f-4f64-9040-bef321fd7a95.mp4

Havi-muro commented 1 year ago

Both packages are great! Leaving the results in raster format seems more efficient for very large areas. Is there any example of prompting SAM with a list of points? As the original poster said, SAM is not very good with fine linear features if not prompted.

EDIT: I'll answer myself: you just have to provide an np.ndarray with the pixel coordinates (normalized 0-1) and pass it to the 'point_grids' kwarg as a list. However, the first results are not too great. It leaves the linear features as part of the matrix, rather than creating segments for them. Below are my prompting points and the results: [image]
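A minimal sketch of that point_grids usage, assuming a vit_h checkpoint (the file names are hypothetical). The generator accepts exactly one of points_per_side or point_grids, so the former must be set to None, and point_grids is a list with one normalized (N, 2) array per crop layer:

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

image = cv2.cvtColor(cv2.imread("scene.tif"), cv2.COLOR_BGR2RGB)  # hypothetical file
pts_xy = np.load("prompt_points.npy")  # hypothetical (N, 2) array of (x, y) pixels

h, w = image.shape[:2]
pts_norm = (pts_xy / np.array([w, h])).astype(np.float32)  # normalize to [0, 1]

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(
    sam,
    points_per_side=None,    # must be None when point_grids is given
    point_grids=[pts_norm],  # one grid per crop layer; a single layer here
)
masks = mask_generator.generate(image)
```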

Radhika-Keni commented 1 year ago

Hi @Khoo Yong Yong, in response to your question: I did not run inference with their code directly, so I do not know what the corresponding parameter for the 'segment everything' option would be when running inference through their codebase API. I only used the graphical UI to run inferences. I did notice that a couple of people have asked the same question on their 'issues' tab; you may want to search to see whether the SAM team has replied to them or they have figured it out themselves!

On Tue, Apr 25, 2023 at 12:45 PM Khoo Yong Yong wrote:

> Hope this helps [image: https://user-images.githubusercontent.com/68383273/230651606-67a0e904-4fad-4b0d-bbea-7373964d07d8.png]
>
> Hi @Radhika-Keni (https://github.com/Radhika-Keni), I would like to know the exact parameter for the SamAutomaticMaskGenerator for this 'SAM Everything' option. The reason for asking is that when I try it on my local machine (default params, or with the sample code), the result is different.
>
> The one on the web is much better than mine.


yong2khoo-lm commented 1 year ago

> Hi @Khoo Yong Yong, in response to your question: I did not run inference with their code directly, so I do not know what the corresponding parameter for the 'segment everything' option would be when running inference through their codebase API. [...]

@Radhika-Keni Appreciate your reply. Yeah, I realized I may have been asking the wrong person, and deleted my post. You are right, I discovered some other posts mentioning the same thing, and will follow up on the corresponding posts :)

At the same time, I am also interested in whether you have found a way to fine-tune or otherwise enhance the results.

Radhika-Keni commented 1 year ago

@yong2khoo-lm: I have decided not to try to fine-tune SAM. Here's why: SAM stands for "Segment Anything", and the reason it is revolutionary is that it claims to be able to segment any image. So if we have to fine-tune it on a dataset to get the results we expect, are we not defeating its very purpose? If I have to fine-tune a model on my dataset to get satisfactory segmentation results, then it makes more sense to me to use existing segmentation models such as U-Net, or even YOLOv7/v8 for that matter; why would I use SAM? :-) That's my take on fine-tuning SAM, but I'd love to know what you all think!

Radhika-Keni commented 1 year ago

> Both packages are great! Leaving the results in raster format seems more efficient for very large areas. Is there any example of prompting SAM with a list of points? As the original poster said, SAM is not very good with fine linear features if not prompted.
>
> EDIT: I'll answer myself: you just have to provide an np.ndarray with the pixel coordinates (normalized 0-1) and pass it to the 'point_grids' kwarg as a list. However, the first results are not too great. It leaves the linear features as part of the matrix, rather than creating segments for them. [image]

Thanks for sharing this. Could you please share the original (un-segmented) image alongside the segmented results? It's hard to judge how well SAM did on this particular image, because it's unclear from this shot alone what the original image is capturing.

Havi-muro commented 1 year ago

Sure! The yellow dots are the prompts. These are the features I'd like segmented, but they are all embedded in the matrix, along with that city to the south. I have changed some of the SAM parameters, but the results are always very similar. [image]

Radhika-Keni commented 1 year ago

@Havi-muro: Thanks so much for sharing this! I would love to try this out on my dataset. Could you please share how much denser you made the original grid before you ran the inference? For example, let's say that, by default, SAM runs inference with an X×X grid of prompt points for the "segment everything" option; I would love to know how much you increased this grid by, e.g. to (X+y)×(X+y).

Havi-muro commented 1 year ago

> @Havi-muro: Thanks so much for sharing this! I would love to try this out on my dataset. Could you please share how much denser you made the original grid before you ran the inference? For example, let's say that, by default, SAM runs inference with an X×X grid of prompt points for the "segment everything" option; I would love to know how much you increased this grid by, e.g. to (X+y)×(X+y).

There are 10,000 prompting points and the image is 4096×4096 pixels at 3 m per pixel. I think the default is 32 points per side, and 64 points per batch (I'm unsure what this last one means). So I think I multiplied the density by about 10 (for no particular reason; those are just the training points I have for a prediction model).
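Put concretely (a sketch, assuming a vit_h checkpoint): 10,000 points corresponds to a 100×100 grid, and points_per_batch only controls how many prompts are scored per forward pass (GPU memory and speed), not the density of the grid itself:

```python
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")

mask_generator = SamAutomaticMaskGenerator(
    sam,
    points_per_side=100,   # 100x100 = 10,000 prompt points (default: 32x32 = 1,024)
    points_per_batch=64,   # prompts per forward pass: affects memory/speed, not results
)
```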

Radhika-Keni commented 1 year ago

Gotcha!! Thanks so much for sharing, appreciate it!!

sunwarm2001 commented 1 year ago

> Hi @Khoo Yong Yong, in response to your question: I did not run inference with their code directly, so I do not know what the corresponding parameter for the 'segment everything' option would be when running inference through their codebase API. [...]

> @Radhika-Keni Appreciate your reply. [...] At the same time, I am also interested in whether you have found a way to fine-tune or otherwise enhance the results.

follow

wdc233 commented 7 months ago

> Sure! The yellow dots are the prompts. These are the features I'd like segmented, but they are all embedded in the matrix, along with that city to the south. I have changed some of the SAM parameters, but the results are always very similar. [image]

Hi, how do you generate these prompt points? Are they based on the ground truth? Best wishes!

Havi-muro commented 6 months ago

> Hi, how do you generate these prompt points? Are they based on the ground truth?

Ground truth, yes.