TRI-ML / dgp

ML Dataset Governance Policy for Autonomous Vehicle Datasets
https://tri-ml.github.io/dgp/
MIT License

Missing 'key_line_2d' in `ONTOLOGY_REGISTRY` #71

Closed (nehalmamgain closed this issue 2 years ago)

nehalmamgain commented 2 years ago

Because 'key_line_2d' is not defined in `ONTOLOGY_REGISTRY`, instantiating `FrameSceneDataset()` raises an exception:

FrameSceneDataset(
/usr/local/lib/python3.8/dist-packages/dgp/datasets/frame_dataset.py:211: in __init__
    dataset_metadata = DatasetMetadata.from_scene_containers(scenes, requested_annotations, requested_autolabels)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class 'dgp.datasets.base_dataset.DatasetMetadata'>
scene_containers = [SceneContainer[<path_to_scene>][Samples: 100], SceneContainer[<path_to_scene>][Samples: 100], SceneContainer[<path_to_scene>][Samples: 100], ...]
requested_annotations = ['key_line_2d'], requested_autolabels = []

    @classmethod
    def from_scene_containers(cls, scene_containers, requested_annotations=None, requested_autolabels=None):
        """Load DatasetMetadata from Scene Dataset JSON.

        Parameters
        ----------
        scene_containers: list of SceneContainer
            List of SceneContainer objects.

        requested_annotations: List(str)
            List of annotations, such as ['bounding_box_3d', 'bounding_box_2d']

        requested_autolabels: List(str)
            List of autolabels, such as['model_a/bounding_box_3d', 'model_a/bounding_box_2d']
        """
        assert len(scene_containers), 'SceneContainers is empty.'
        requested_annotations = [] if requested_annotations is None else requested_annotations
        requested_autolabels = [] if requested_autolabels is None else requested_autolabels

        if not requested_annotations and not requested_autolabels:
            # Return empty ontology table
            return cls(scene_containers, directory=os.path.dirname(scene_containers[0].directory), ontology_table={})
        # For each annotation type, we enforce a consistent ontology across the
        # dataset (i.e. 2 different `bounding_box_3d` ontologies are not
        # permitted). However, an autolabel may support a different ontology
        # for the same annotation type. For example, the following
        # ontology_table is valid:
        # {
        #   "bounding_box_3d": BoundingBoxOntology,
        #   "bounding_box_2d": BoundingBoxOntology,
        #   "my_autolabel_model/bounding_box_3d": BoundingBoxOntology
        # }
        dataset_ontology_table = {}
        logging.info('Building ontology table.')
        st = time.time()

        # Determine scenes with unique ontologies based on the ontology file basename.
        unique_scenes = {
            os.path.basename(f): scene_container
            for scene_container in scene_containers
            for _, _, filenames in os.walk(os.path.join(scene_container.directory, ONTOLOGY_FOLDER)) for f in filenames
        }
        # Parse through relevant scenes that have unique ontology keys.
        for _, scene_container in unique_scenes.items():
            for ontology_key, ontology_file in scene_container.ontology_files.items():
                # Keys in `ontology_files` may correspond to autolabels,
                # so we strip those prefixes when instantiating `Ontology` objects
                _autolabel_model, annotation_key = os.path.split(ontology_key)

                # Look up ontology for specific annotation type
                if annotation_key in ONTOLOGY_REGISTRY:

                    # Skip if we don't require this annotation/autolabel
                    if _autolabel_model:
                        if ontology_key not in requested_autolabels:
                            continue
                    else:
                        if annotation_key not in requested_annotations:
                            continue

                    ontology_spec = ONTOLOGY_REGISTRY[annotation_key]

                    # No need to add ontology-less tasks to the ontology table.
                    if ontology_spec is None:
                        continue

                    # If ontology and key have not been added to the table, add it.
                    if ontology_key not in dataset_ontology_table:
                        dataset_ontology_table[ontology_key] = ontology_spec.load(ontology_file)

                    # If we've already loaded an ontology for this annotation type, make sure other scenes have the same ontology
                    else:
                        assert dataset_ontology_table[ontology_key] == ontology_spec.load(
                            ontology_file
                        ), "Inconsistent ontology for key {}.".format(ontology_key)

                # In case an ontology type is not implemented yet
                else:
>                   raise Exception(f"Ontology for key {ontology_key} not found in registry!")
E                   Exception: Ontology for key key_line_2d not found in registry!

/usr/local/lib/python3.8/dist-packages/dgp/datasets/base_dataset.py:592: Exception

quincy-kh-chen commented 2 years ago

cc @sshusainTRI Let's add a `KeyLineOntology` class to dgp/annotations/ontology.py that derives directly from `BoundingBoxOntology`, and a `KeyLine2DAnnotationList` class in a new file, dgp/annotations/key_line_2d_annotation.py.
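
For illustration, here is a minimal, self-contained sketch of the idea. It uses stand-in classes rather than the real dgp ones: the `BoundingBoxOntology` stub, the dictionary used as `ONTOLOGY_REGISTRY`, and the `lookup_ontology` helper are all assumptions made for this example, and the actual change in the fix may differ.

```python
class BoundingBoxOntology:
    """Stand-in for dgp's BoundingBoxOntology (assumption: the real class does more)."""

    def __init__(self, name_to_id):
        self.name_to_id = dict(name_to_id)


class KeyLineOntology(BoundingBoxOntology):
    """Derives directly from BoundingBoxOntology, as proposed; adds no behavior."""


# Stand-in registry mapping annotation keys to ontology classes (hypothetical contents).
ONTOLOGY_REGISTRY = {
    "bounding_box_2d": BoundingBoxOntology,
    "bounding_box_3d": BoundingBoxOntology,
}


def lookup_ontology(annotation_key, registry):
    """Hypothetical helper mirroring the registry check shown in the traceback."""
    if annotation_key not in registry:
        raise Exception(f"Ontology for key {annotation_key} not found in registry!")
    return registry[annotation_key]


# Before registering the new key, the lookup fails as in the report.
try:
    lookup_ontology("key_line_2d", ONTOLOGY_REGISTRY)
    error_before = None
except Exception as err:
    error_before = str(err)

# After registering the key, the same lookup succeeds.
ONTOLOGY_REGISTRY["key_line_2d"] = KeyLineOntology
spec_after = lookup_ontology("key_line_2d", ONTOLOGY_REGISTRY)
```

Subclassing without overriding anything keeps key lines on the same ontology format as bounding boxes while giving them their own type to register.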

quincy-kh-chen commented 2 years ago

@sshusainTRI opened #76 for the fix. FYI @nehalmamgain

wadimkehl commented 2 years ago

@nehalmamgain Can we close this?

nehalmamgain commented 2 years ago

Solved with https://github.com/TRI-ML/dgp/pull/76