activeloopai / deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Mozilla Public License 2.0
7.88k stars, 607 forks

Added "allow_new_labels" flag for class_labels tensor #2830

Open nvoxland-al opened 2 months ago

nvoxland-al commented 2 months ago

🚀 🚀 Pull Request

Impact

Description

By default, tensors with htype="class_label" accept any new label value added to them. This PR adds an allow_new_labels setting to the tensor "info". When it is set to False, adding an unknown label raises an exception instead of being accepted.

The available labels are set in the info["class_names"] setting, either when the tensor is originally created:

ds.create_tensor(
    "labels",
    htype="class_label",
    class_names=["cat", "dog", "horse"],
    allow_new_labels=False,
)

or set/updated later:

ds.labels.info.update(allow_new_labels=False)
ds.labels.info.update(class_names=["cat", "dog", "horse"])
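
For reviewers who want to reason about the flag without a dataset at hand, the intended behavior can be modeled in plain Python. This is only a sketch and not the code in this PR: `resolve_label` is a hypothetical helper, the `ValueError` stands in for whatever exception the PR actually raises, and it assumes that text labels are stored as indices into class_names and that unknown labels are appended to that list when they are allowed.

```python
def resolve_label(label, class_names, allow_new_labels=True):
    """Return the integer id for `label`, modeling the allow_new_labels flag.

    Hypothetical helper for illustration; not part of Deep Lake.
    """
    if label in class_names:
        return class_names.index(label)
    if not allow_new_labels:
        # Strict mode: unknown labels are rejected.
        raise ValueError(f"Unknown label {label!r}; allowed: {class_names}")
    # Default mode: the new label is accepted and registered at the end.
    class_names.append(label)
    return len(class_names) - 1


names = ["cat", "dog", "horse"]
print(resolve_label("dog", names))   # 1
print(resolve_label("bird", names))  # 3, and "bird" is appended to names
try:
    resolve_label("fish", names, allow_new_labels=False)
except ValueError as exc:
    print(exc)
```

In this model, strict mode leaves class_names untouched when a label is rejected, which is the property the flag is meant to guarantee.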

Things to be aware of

If you update class_names so that existing labels are reordered or omitted, the label_id->text mapping will no longer match the stored data, and reading from the tensor will return incorrect labels.
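
The caveat above can be illustrated with plain lists. This is a sketch under the assumption that the tensor stores integer ids and looks up text in class_names at read time. `decode` is a hypothetical helper, and returning None for an out-of-range id is a choice made for this example, not necessarily what Deep Lake does.

```python
# Ids as they might already be stored in the tensor (assumption for this sketch).
stored_ids = [0, 1, 2, 1]

original_names = ["cat", "dog", "horse"]
reordered_names = ["dog", "cat", "horse"]  # same labels, different order
shortened_names = ["cat", "horse"]         # "dog" omitted


def decode(ids, class_names):
    """Map ids to text; ids past the end of the list have no name (None)."""
    return [class_names[i] if i < len(class_names) else None for i in ids]


print(decode(stored_ids, original_names))   # ['cat', 'dog', 'horse', 'dog']
print(decode(stored_ids, reordered_names))  # ['dog', 'cat', 'horse', 'cat']
print(decode(stored_ids, shortened_names))  # ['cat', 'horse', None, 'horse']
```

Appending new names to the end of the list keeps existing ids valid, so that is the safe way to extend class_names after data has been written.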

sonarcloud[bot] commented 2 months ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud