hotosm / fAIr

fAIr - AI Assisted Mapping Tool
https://fair.hotosm.org/
GNU Affero General Public License v3.0

Area of Interest ==> Training Area #214

Open Ettrig opened 9 months ago

Ettrig commented 9 months ago

The expression AOI is not informative as it is used in fAIr. Since it is used so many times, the user will understand the meaning eventually. But a more informative expression would help the user understand how fAIr works.

An AOI is an area from which data is generated to be used for complementary (fine-tuning) training of the model. The central word here is training. The AOI should be called Training Area (TA).

There is a concrete problem with the expression AOI. It is used in the same context with another meaning. In the manual for people who create projects in TA, the AOI is the whole area to be mapped. It would be better to call it Area to be Mapped. But at least it is not wrong. And this is the given background for us.

AOI is misleading. The user is not interested. fAIr is not particularly interested. It is important that the Training Area is representative. This means that we want there to be nothing special about the Training Areas. We want the buildings here to be just like the vast majority of the other buildings (for now) in the area to be mapped (using this model). So we do not want the user to choose interesting areas, but boring areas.

Here is a sentence where it is used: "A dataset would be a list of area of interests (AIOs) and its labels." Notice how an important concept is introduced using "AOI" and "label" to convey the meaning. But AOI is used to refer to a concept that we have invented and label is used with a meaning that is a deep implementation detail in CNNs. To sort this out we need to distinguish at least three different things.

1) The real area, a part of the surface of the Earth. 2) An image of that area. 3) The OSM description of that area.
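To make the distinction concrete, here is a minimal Python sketch that keeps the three things apart. All names here (`TrainingArea`, `TrainingSample`, `TrainingDataset`, the field names, and the example path) are hypothetical illustrations that use the naming proposed in this issue. They are not fAIr's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TrainingArea:
    """1) The real area: a bounding box on the Earth's surface (degrees)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


@dataclass
class TrainingSample:
    """Ties one real area to its image and to its OSM description."""
    area: TrainingArea
    image_path: str  # 2) an image of that area (hypothetical file path)
    # 3) the OSM description of that area, here as simple tag dictionaries
    osm_features: list = field(default_factory=list)


@dataclass
class TrainingDataset:
    """A set of images of training areas and the corresponding OSM map data."""
    samples: list = field(default_factory=list)

    def add(self, sample: TrainingSample) -> None:
        self.samples.append(sample)

    def feature_count(self) -> int:
        # Total number of OSM features across all training areas.
        return sum(len(s.osm_features) for s in self.samples)


# Usage: one training area with two mapped buildings.
area = TrainingArea(min_lon=85.30, min_lat=27.70, max_lon=85.31, max_lat=27.71)
sample = TrainingSample(
    area=area,
    image_path="images/area_1.png",
    osm_features=[{"building": "yes"}, {"building": "yes"}],
)
dataset = TrainingDataset()
dataset.add(sample)
```

In this sketch the area, the image, and the OSM data are separate fields, so a sentence about "the dataset" cannot be mistaken for a sentence about the ground itself.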

In the quoted sentence, it sounds like 1) is intended. At least, that is what a reader would assume without already knowing what you mean, and that is exactly the reader who needs this explanation. Rewrite:

"A training dataset is a set of images of areas of interest (AOIs) and corresponding OSM map data."

Or, if I can persuade you to let go of this use of AOI:

"A training dataset is a set of images of training areas (TAs) and corresponding OSM map data."

omranlm commented 6 months ago

I think this is already done, right? @kshitijrajsharma @Ettrig