OSGeo / grass

GRASS GIS - free and open-source geospatial processing engine
https://grass.osgeo.org

[Feat] i.group: print image group content with semantic_labels in JSON #3750

Open ninsbl opened 2 months ago

ninsbl commented 2 months ago

Is your feature request related to a problem? Please describe. The concept of semantic labels was introduced with the temporal framework, where its original purpose was to flag bands in, e.g., satellite imagery. However, the concept can be very handy when handling all kinds of imagery data.

In order to make better use of it in the imagery group context, it would be great if i.group could print not only the maps in a group, but also their semantic labels. In scripts I use a Python function to get the semantic labels of maps in a group (see below), and I was thinking about adding it to the Python library. However, looking at i.group, I think it would be more appropriate and convenient to extend the printing functionality of the i.group module.

Describe the solution you'd like i.group allows printing of files in (sub)groups (-l) as well as subgroups (-s) in verbose and "shell script" style (-g), the latter actually being more of a plain list. What I would like to see is: a) either a new flag (e.g. -S) to request printing of semantic labels of the maps in the group, or a print option (instead of the -l and -s flags), and b) a G_OPT_F_FORMAT option to request different output formats (instead of the -g flag).

Ideally, the current printing flags would be deprecated in favor of a combination of print and format options, but kept functional (with a warning to the user) until removal.

So, I would like to be able to do something like this:

# prepare input
import grass.script as gs
band_numbers = [1, 2, 3]
for band_number in band_numbers:
    gs.run_command("g.copy", raster=f"lsat7_2000_{band_number}0,lsat7_2000_{band_number}0")
    # f-strings are required here so that the band number is actually substituted
    gs.run_command("r.support", raster=f"lsat7_2000_{band_number}0", semantic_label=f"L8_{band_number}")
gs.run_command("i.group", group="L8_group", input=",".join([f"lsat7_2000_{band_number}0" for band_number in band_numbers]))

And then

# Print group info (labels as keys, maps as values)
gs.run_command("i.group", group="L8_group", print="labels_and_maps", format="json")

Which should return something along the following lines:

{
    "L8_1": "lsat7_2000_10@user",
    "L8_2": "lsat7_2000_20@user",
    "L8_3": "lsat7_2000_30@user"
}

or

# Print group info (maps as keys, labels as values)
gs.run_command("i.group", group="L8_group", print="maps_and_labels", format="json")

Which should return something along the following lines:

{
    "lsat7_2000_10@user": "L8_1",
    "lsat7_2000_20@user": "L8_2",
    "lsat7_2000_30@user": "L8_3"
}

or

# Print group info (just subgroups)
gs.run_command("i.group", group="L8_group", print="subgroups", format="json")

Which should return something along the following lines assuming two subgroups (if subgroups are still relevant):

["subgroup_a", "subgroup_b"]

or

# Print group info (group structure, assuming two subgroups, if subgroups are still relevant)
gs.run_command("i.group", group="L8_group", print="subgroups_maps_and_labels", format="json")

Which should return something along the following lines:

{
    "subgroup_a": {
        "lsat7_2000_10@user": "L8_1",
        "lsat7_2000_20@user": "L8_2"
    },
    "subgroup_b": {
        "lsat7_2000_20@user": "L8_2",
        "lsat7_2000_30@user": "L8_3"
    }
}
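
To make the four proposed layouts concrete, here is a minimal pure-Python sketch that derives each of them from the same underlying data. The function name `format_group`, the `layout` values (borrowed from the `print=` values suggested above), and the sample data are all assumptions for illustration; nothing here is existing i.group behavior.

```python
import json

# Hypothetical input: (map, semantic label) pairs as i.group might collect them.
members = [
    ("lsat7_2000_10@user", "L8_1"),
    ("lsat7_2000_20@user", "L8_2"),
    ("lsat7_2000_30@user", "L8_3"),
]
# Hypothetical subgroup membership, listed by map name.
subgroups = {
    "subgroup_a": ["lsat7_2000_10@user", "lsat7_2000_20@user"],
    "subgroup_b": ["lsat7_2000_20@user", "lsat7_2000_30@user"],
}


def format_group(members, subgroups, layout):
    """Return a JSON-serializable structure for one of the proposed layouts."""
    maps_to_labels = dict(members)
    if layout == "maps_and_labels":
        return maps_to_labels
    if layout == "labels_and_maps":
        # Inverting only works if every map has a unique, non-empty label.
        return {label: name for name, label in members}
    if layout == "subgroups":
        return list(subgroups)
    if layout == "subgroups_maps_and_labels":
        return {
            sub: {name: maps_to_labels[name] for name in names}
            for sub, names in subgroups.items()
        }
    raise ValueError(f"Unknown layout: {layout}")


print(json.dumps(format_group(members, subgroups, "subgroups_maps_and_labels"), indent=4))
```

Note that the "labels_and_maps" layout silently drops entries when two maps share a label, which is one argument for treating "maps_and_labels" as the primary form.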

Describe alternatives you've considered An alternative would be to add a function to the Python library, like the following:

import grass.script as gs
from grass.exceptions import CalledModuleError


def group_to_dict(imagery_group_name: str, keys="labels", env=None, **args):
    """Create a dictionary to represent an imagery group, where raster maps
    in the imagery group are the values and the associated semantic_labels
    are their respective keys.
    For raster maps in the imagery group that do not have a semantic label
    a warning is given and the numeric index in the group is used instead.
    If the imagery_group_name is None, an empty dictionary is returned.

    :param str imagery_group_name: Name of the imagery group to process (or None)
    :param str keys: what to use as dictionary keys (not yet used in this draft)
    :param env: environment

    :return: dictionary with maps and their semantic labels (or row indices in the imagery group)
    :rtype: dict
    """
    group_dict = {}
    if not imagery_group_name:
        return group_dict
    try:
        # The -g flag prints one map name per line
        maps_in_group = (
            gs.read_command("i.group", group=imagery_group_name, flags="g", quiet=True, env=env)
            .strip()
            .split()
        )
    except CalledModuleError:
        gs.fatal(_("Could not get maps from imagery group <{}>").format(imagery_group_name))

    for idx, raster_map in enumerate(maps_in_group):
        raster_map_info = gs.raster_info(raster_map)
        semantic_label = raster_map_info["semantic_label"]
        if not semantic_label:
            gs.warning(
                _(
                    "Raster map {rmap} in group {igroup} does not have a semantic label. "
                    "Using the numeric index in the group"
                ).format(rmap=raster_map, igroup=imagery_group_name)
            )
            # Fall back to the 1-based position of the map in the group
            semantic_label = idx + 1
        group_dict[semantic_label] = raster_map
    return group_dict

Additional context The background is the need to match imagery groups to machine learning models using the semantic labels of an imagery group (instead of e.g. relying on the order of the maps in the group).
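
As a rough illustration of that use case, the sketch below reorders the maps of a group to match the band order a model expects, using only the labels-to-maps dictionary. The function name `order_maps_for_model` and the sample labels are hypothetical and not part of any GRASS library.

```python
def order_maps_for_model(labels_to_maps, model_bands):
    """Return raster maps in the band order a model expects.

    labels_to_maps: dict of semantic label -> raster map name.
    model_bands: semantic labels in the order the model was trained on.
    """
    missing = [band for band in model_bands if band not in labels_to_maps]
    if missing:
        raise KeyError(f"Group lacks maps for semantic labels: {', '.join(missing)}")
    return [labels_to_maps[band] for band in model_bands]


# The group lists its maps in arbitrary order; the labels resolve the order.
group = {
    "L8_3": "lsat7_2000_30@user",
    "L8_1": "lsat7_2000_10@user",
    "L8_2": "lsat7_2000_20@user",
}
ordered = order_maps_for_model(group, ["L8_1", "L8_2", "L8_3"])
print(ordered)
```

Failing early on missing labels avoids silently feeding a model the wrong bands, which is the main risk of relying on map order.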

ninsbl commented 2 months ago

@wenzeslaus, which avenue do you suggest we follow here? I could add a Python function myself (e.g. in an imagery module in grass.script), but I am not good enough with C to implement more sophisticated printing in i.group. Do you think this could be addressed by @kritibirda26 as part of the JSON GSoC project (you probably have a full schedule already)?

@marisn, do you have any suggestions or comments here? You did quite some work on imagery groups recently...

ninsbl commented 2 months ago

On second thought, since I need the full raster info, and not only the semantic labels, of the maps in a group for my use case anyway, a Python function is probably more suitable.

wenzeslaus commented 2 months ago

Wrapper function vs JSON output: Many wrapper functions in grass.script are mainly workarounds for missing JSON output (and a way to avoid awkward syntax). I would have to check the priorities for GSoC.

If you need to wrap multiple tools together, then the question is whether we want that as a function in grass.script. Of course, that does not remove the need for JSON output from i.group.

marisn commented 2 months ago

As i.group does not have JSON output yet, we are free to implement it as we wish. That does not conflict with having a Python function too. I would probably not touch the existing print flags, so as not to break backward compatibility, but just implement new print functionality. I think we should go with a plain G_OPT_F_FORMAT option – when it is set, we print; when absent, we don't. Back in the day there was an initiative to abandon subgroups (too lazy to search Trac for the ticket number), and thus I would probably go for printing the group and all its subgroups in one go, ignoring the subgroup parameter. As for the actual JSON structure, I do not have any preference at the moment: I haven't tried to implement any code that would consume it, so I do not yet see how it should look. We only have to keep in mind that not all rasters have a semantic label.
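
One way to accommodate rasters without a semantic label is a list of records instead of a label-keyed object, so that a missing label becomes a JSON null and does not need a made-up key. The sketch below is only one possible layout, not something agreed on in this thread; the function name `group_records`, the field names, and the unlabeled map `elevation@user` are assumptions.

```python
import json


def group_records(maps_with_labels):
    """Build one record per raster map; the label is None when unset or empty."""
    return [
        {"name": name, "semantic_label": label or None}
        for name, label in maps_with_labels
    ]


records = group_records(
    [
        ("lsat7_2000_10@user", "L8_1"),
        ("elevation@user", ""),  # hypothetical map without a semantic label
    ]
)
# json.dumps serializes None as null
print(json.dumps(records, indent=4))
```

A consumer can still build a label-keyed dictionary from such records when it knows that all maps are labeled.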