microsoft / GlobalMLBuildingFootprints

Worldwide building footprints derived from satellite imagery

Trucks classified as buildings #87

Open tjukanovt opened 9 months ago

tjukanovt commented 9 months ago

I am seeing many places where large cars or trucks have been classified as buildings. I wonder if this is a known issue and if there would be something that could be done to improve this. I do understand that these are very tricky objects to properly separate from real buildings. This can be seen in several places but here is an example from Germany (at 50.9605969,11.8508165)


I've seen this in multiple countries and various locations.

andwoi commented 7 months ago

Thanks for raising this. Our models do have difficulty distinguishing some trucks and containers from buildings. In the near term, we're collecting these false positives and manually excluding them, and we plan to include them in model retraining cycles. As you encounter more of them, raise an issue and include a WGS 84 (EPSG:4326) bounding box and we'll remove the features from the source.
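For anyone reporting such features, here is a minimal sketch of deriving a WGS 84 bounding box from a footprint's vertices. The function name `bbox_wgs84`, the sample coordinates, and the `min_lon,min_lat,max_lon,max_lat` ordering are assumptions for illustration; the thread does not specify the exact format the maintainers expect.

```python
def bbox_wgs84(vertices):
    """Return (min_lon, min_lat, max_lon, max_lat) for a list of (lon, lat) pairs.

    Assumes coordinates are already in EPSG:4326 and that the footprint
    does not cross the antimeridian.
    """
    if not vertices:
        raise ValueError("need at least one vertex")
    for lon, lat in vertices:
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"not a valid lon/lat pair: {(lon, lat)}")
    lons = [lon for lon, _ in vertices]
    lats = [lat for _, lat in vertices]
    return (min(lons), min(lats), max(lons), max(lats))


# Hypothetical footprint vertices (made up for illustration), as (lon, lat).
footprint = [
    (11.85075, 50.96055),
    (11.85090, 50.96055),
    (11.85090, 50.96064),
    (11.85075, 50.96064),
]

bbox = bbox_wgs84(footprint)
print(",".join(f"{value:.5f}" for value in bbox))
```

Note that the vertices here are (lon, lat), whereas the location quoted at the top of this issue is written in lat, lon order.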

tjukanovt commented 7 months ago

Thank you for your response @andwoi. I got inspired by the approach taken by @stefankinell in https://github.com/microsoft/GlobalMLBuildingFootprints/issues/94.

So I compared the latest MSFT buildings to the latest official building footprints from the city of Helsinki. I buffered the official buildings by 5 m and intersected them with the MSFT buildings. MSFT buildings that didn't find a match were classified as false positives. Official buildings (without buffering) that didn't find a match among the MSFT footprints were classified as false negatives. These are much less of a problem than false positives IMO, and I understand that there will never be a perfect match, due to the temporal resolution of the imagery.
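The comparison above can be sketched roughly as follows. This is a simplified illustration, not the workflow actually used: footprints are reduced to axis-aligned boxes in a projected CRS with meter units, the buffer is a square expansion rather than a true rounded 5 m buffer, and the names `Box`, `buffer_box`, `intersects`, and `classify` are hypothetical. Real data would call for proper polygon geometry and a spatial index in a GIS library.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned footprint stand-in, in projected meters."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def buffer_box(box: Box, dist: float) -> Box:
    # Square expansion; an approximation of a true buffer.
    return Box(box.minx - dist, box.miny - dist, box.maxx + dist, box.maxy + dist)


def intersects(a: Box, b: Box) -> bool:
    return (a.minx <= b.maxx and b.minx <= a.maxx
            and a.miny <= b.maxy and b.miny <= a.maxy)


def classify(official, detected, buffer_m=5.0):
    """Return (false_positives, false_negatives).

    False positives: detected footprints touching no buffered official one.
    False negatives: unbuffered official footprints touching no detected one.
    Brute-force O(n * m) comparison, fine only for small samples.
    """
    buffered = [buffer_box(o, buffer_m) for o in official]
    false_positives = [
        d for d in detected if not any(intersects(d, b) for b in buffered)
    ]
    false_negatives = [
        o for o in official if not any(intersects(o, d) for d in detected)
    ]
    return false_positives, false_negatives


# Made-up sample data for illustration.
official = [Box(0, 0, 10, 10), Box(100, 100, 120, 110)]
detected = [
    Box(1, 1, 11, 9),     # overlaps the first official building
    Box(50, 50, 53, 57),  # truck-sized detection far from any building
]

false_positives, false_negatives = classify(official, detected)
print("false positives:", false_positives)
print("false negatives:", false_negatives)
```

In this sample the truck-sized box comes out as a false positive and the second official building, which has no detection, as a false negative.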

But I have seen these in many geographies and this is just a sample from one city. Please take a look and let me know if you can use this in your analysis 👇 false_detections_helsinki.zip

Here you can see some of these in QGIS. Orange ones are the false positives. Most of the ones in this image are probably cars in parking lots.

andwoi commented 7 months ago

@tjukanovt thanks for putting this together. We'll take a look offline. While some FP are expected, we want to minimize the most glaring errors.

tjukanovt commented 7 months ago

@andwoi thanks. Definitely not expecting a 100% match on either FP/FN but would be great if some of the most obvious mistakes could be fixed. My personal feeling is that it's much less of an issue to not have something there rather than having a FP. Have you considered exposing the reliability value as an attribute for each feature?