stac-extensions / classification

Describes categorical values and bitfields to give values in a file a certain meaning (classification).

List versus Dictionary #13

Closed pjhartzell closed 2 years ago

pjhartzell commented 2 years ago

The current structure of both classification:classes and classification:bitfields is a list of dictionaries:

"classification:bitfields": [
    {
        "name": "no_data",
        "description": "no data mask",
        "first_bit": 0,
        "bit_count": 1,
        "classes": [
            {
                "name": "no_data",
                "value": 0,
                "description": "no data in pixel"
            },
            {
                "name": "data",
                "value": 1,
                "description": "data in pixel"
            }
        ]
    },
    {
        "name": "cloud_confidence",
        "description": "cloud confidence levels",
        "first_bit": 8,
        "bit_count": 2,
        "classes": [
            {
                "name": "none",
                "value": 0,
                "description": "no confidence level set"
            },
            {
                "name": "low",
                "value": 1,
                "description": "low confidence"
            },
            {
                "name": "medium",
                "value": 2,
                "description": "medium confidence"
            },
            {
                "name": "high",
                "value": 3,
                "description": "high confidence"
            }
        ]
    }
]
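For readers new to the structure, here is a minimal sketch of how a consumer might decode one pixel value with the list format above. It assumes `first_bit` counts from the least significant bit (bit 0) and that `bit_count` is the field width. The function name `decode_pixel` is hypothetical and not part of the extension.

```python
# Bitfield definitions copied from the list-format example above
# (descriptions omitted for brevity).
BITFIELDS = [
    {
        "name": "no_data",
        "first_bit": 0,
        "bit_count": 1,
        "classes": [
            {"name": "no_data", "value": 0},
            {"name": "data", "value": 1},
        ],
    },
    {
        "name": "cloud_confidence",
        "first_bit": 8,
        "bit_count": 2,
        "classes": [
            {"name": "none", "value": 0},
            {"name": "low", "value": 1},
            {"name": "medium", "value": 2},
            {"name": "high", "value": 3},
        ],
    },
]


def decode_pixel(pixel, bitfields):
    """Return {bitfield name: class name} for one integer pixel value.

    Assumption: first_bit is counted from the least significant bit.
    """
    decoded = {}
    for field in bitfields:
        # Mask with bit_count ones, e.g. bit_count=2 gives 0b11.
        mask = (1 << field["bit_count"]) - 1
        value = (pixel >> field["first_bit"]) & mask
        # Map the extracted integer back to its class name, if defined.
        names = {c["value"]: c["name"] for c in field["classes"]}
        decoded[field["name"]] = names.get(value)
    return decoded


# Bit 0 set (data present) and bits 8-9 equal to 0b10 (medium confidence).
example = decode_pixel(0b10_0000_0001, BITFIELDS)
print(example)  # {'no_data': 'data', 'cloud_confidence': 'medium'}
```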

Is there any preference for or against using a dictionary of dictionaries instead?

"classification:bitfields": {
    "no_data": {
        "description": "no data mask",
        "first_bit": 0,
        "bit_count": 1,
        "classes": {
            "no_data": {
                "value": 0,
                "description": "no data in pixel"
            },
            "data": {
                "value": 1,
                "description": "data in pixel"
            }
        }
    },
    "cloud_confidence": {
        "description": "cloud confidence levels",
        "first_bit": 8,
        "bit_count": 2,
        "classes": {
            "none": {
                "value": 0,
                "description": "no confidence level set"
            },
            "low": {
                "value": 1,
                "description": "low confidence"
            },
            "medium": {
                "value": 2,
                "description": "medium confidence"
            },
            "high": {
                "value": 3,
                "description": "high confidence"
            }
        }
    }
}

Whereas name is an optional field for machine readability in the current list format, it would be "required" in a dictionary format, since it is essentially used as the key.
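To make the relationship between the two layouts concrete, here is a rough sketch of converting between them. The helper names `list_to_dict` and `dict_to_list` are hypothetical. The point is that the conversion to a dictionary only works when every entry has a unique `name`, which is why the field would effectively become required.

```python
def list_to_dict(entries):
    """Key each entry by its "name"; fails if a name is missing or repeated."""
    result = {}
    for entry in entries:
        if "name" not in entry:
            raise ValueError("entry has no 'name' to use as a key")
        name = entry["name"]
        if name in result:
            raise ValueError(f"duplicate name: {name!r}")
        # Copy everything except the name, which becomes the key.
        body = {k: v for k, v in entry.items() if k != "name"}
        if "classes" in body:
            body["classes"] = list_to_dict(body["classes"])
        result[name] = body
    return result


def dict_to_list(mapping):
    """Inverse of list_to_dict: move each key back into a "name" field."""
    entries = []
    for name, body in mapping.items():
        entry = {"name": name, **body}
        if "classes" in entry:
            entry["classes"] = dict_to_list(entry["classes"])
        entries.append(entry)
    return entries


# First bitfield from the example above, in list form.
bitfields_list = [
    {
        "name": "no_data",
        "description": "no data mask",
        "first_bit": 0,
        "bit_count": 1,
        "classes": [
            {"name": "no_data", "value": 0, "description": "no data in pixel"},
            {"name": "data", "value": 1, "description": "data in pixel"},
        ],
    }
]

bitfields_dict = list_to_dict(bitfields_list)
round_trip = dict_to_list(bitfields_dict)
```

An entry without a `name` is valid in the list form but cannot be represented in the dictionary form, so `list_to_dict` raises for it.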

I thought this topic had come up at some point, but am unable to find the original comments.

drwelby commented 2 years ago

I'm not sure where it went either, but this format popped up at some point since it parallels the asset structure. I prefer it but went with the simpler list.

Could the Landsat bit field descriptions be shortened to good dictionary keys, if some don't already exist?

m-mohr commented 2 years ago

From previous experience in STAC, lists are usually preferable, as they are easier to validate, easier to summarize, and easier to implement. (And there are still people arguing that making assets an object was a bad decision.)

pjhartzell commented 2 years ago

Easier is good. Let's stay with lists.

drwelby commented 2 years ago

Agreed.