allenai / satlas

Apache License 2.0

How to interpret a category key in vector.json whose value is an empty list? #39

Closed alimkarimi closed 4 months ago

alimkarimi commented 4 months ago

Hi there - this is very awesome work!

I am looking through your dynamic dataset and noticed that a good number of vector.json files contain keys for different categories (e.g., vessel) whose value is an empty list.

I noticed in your README that you say:

"Note that only a subset of categories are annotated in each label folder. Oftentimes categories will be annotated but have no instances present in the tile and/or time range, in which case they will appear in vector.json like this:

"power_substation": [],

If the category is not annotated at all, then it will omit the key in vector.json entirely (or, for segmentation and regression labels, omit the PNG image like no land_cover.png)."

However, I don't understand what this means for how one can leverage the dataset in the training process. Power substation and vessel are part of the polygon labels, so I assume they can be used for object detection and localization algorithms. But if these keys do not contain a polygon, does that mean there is no polygon available for training? I feel like I am missing something, because I don't quite understand why these keys have empty lists as their values.

Thanks!

favyen2 commented 4 months ago

If a tile's labels include a key like "power_substation" or "vessel", and the corresponding value is an empty list, it means that the tile can be used as a negative example when training the prediction heads for those categories (train the model to predict no power substations or vessels in that tile).

If the key is missing, it means the labels for that category are unknown in that tile, so it should not be used for training at all (mask out that prediction head).
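
As a rough illustration of the rule above, here is a minimal Python sketch that classifies each category in a parsed vector.json as positive, negative, or unknown, and derives a per-head loss mask from that. The function names `category_status` and `loss_mask` are hypothetical (not part of the satlas codebase), the contents of the non-empty list are a placeholder rather than the real annotation schema, and "wind_turbine" is used only as an example of a key that is absent from the tile.

```python
import json


def category_status(labels, category):
    """Classify one category of a parsed vector.json dict (hypothetical helper)."""
    if category not in labels:
        # Key omitted: the category was not annotated, so its labels are unknown.
        return "unknown"
    if len(labels[category]) == 0:
        # Key present with an empty list: annotated, but no instances in the tile.
        return "negative"
    return "positive"


def loss_mask(labels, categories):
    """Return True for heads that should contribute to the loss on this tile."""
    return {c: category_status(labels, c) != "unknown" for c in categories}


# Toy labels; the entry inside "vessel" is a placeholder, not the real schema.
example = json.loads('{"power_substation": [], "vessel": [{"placeholder": true}]}')

for name in ["power_substation", "vessel", "wind_turbine"]:
    print(name, category_status(example, name))
```

With a mask like this, a training loop would compute the loss only for heads whose entry is True, so the tile above would train the power substation head on a negative example and skip the head for the missing category.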

alimkarimi commented 4 months ago

@favyen2 thanks for clarifying!