voxel51 / fiftyone

Refine high-quality datasets and visual AI models
https://fiftyone.ai
Apache License 2.0
8.9k stars, 565 forks

[BUG] Fields in embedded docs with mixed types crash field resolution in fiftyone.core.odm.utils._merge_embedded_doc_fields #4654

Open gabmis opened 3 months ago

gabmis commented 3 months ago

Describe the problem

The problem (a TypeError) seems to occur when an embedded document has fields with mixed types, e.g. detections with a user-defined field named "size" that holds both ints and floats. I don't know what the expected behavior is here, but the way the code fails seems unsatisfactory in any case; an informative error could be the solution.

The code responsible is the function mentioned in the title, in particular this line, which sets the field to None in the fields_dict when it sees a type mismatch. Then, when the field comes up again (e.g. when parsing the subsequent detection), this line will try to perform __getitem__("size") on a NoneType.
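To make the failure mode concrete, here is a toy sketch in plain Python. It is not the FiftyOne implementation: `merge_fields` and the shape of `fields_dict` are hypothetical stand-ins that only mimic the behavior described above (set the entry to None on a mismatch, then index into it on the next occurrence), so the exact lookup differs from the real code.

```python
def merge_fields(fields_dict, doc):
    """Toy stand-in for the merge step (hypothetical, NOT FiftyOne code)."""
    for name, value in doc.items():
        value_type = type(value)
        if name not in fields_dict:
            # First time the field is seen: record its type
            fields_dict[name] = {"type": value_type}
        elif fields_dict[name]["type"] is not value_type:
            # Type mismatch: the entry is replaced by None
            fields_dict[name] = None
    return fields_dict


def reproduce():
    """Merge three documents whose "size" field mixes int and float."""
    fields_dict = {}
    docs = [{"size": 1}, {"size": 2.5}, {"size": 3}]
    try:
        for doc in docs:
            merge_fields(fields_dict, doc)
    except TypeError as err:
        # The third document indexes into the None left by the second one
        return str(err)
    return None


message = reproduce()
print(message)
```

In this sketch the second document triggers the mismatch and the third one hits the None entry, so the error surfaces one document later than its cause and never names the offending field.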

Code to reproduce issue

I don't have the time right now but if you think it necessary I'll take the time later to provide a minimal example.

System information

Other info/logs

N/A

Willingness to contribute

The FiftyOne Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the FiftyOne codebase?

brimoor commented 3 months ago

@gabmis can you please provide the minimal example so we can see what you're trying to accomplish? Not yet sure how to proceed 😄

gabmis commented 2 months ago

@brimoor thanks for following up and sorry for the late response.

Unfortunately, I haven't found the time to create the example. I can describe it briefly though.

I had a field containing fo.Detections which themselves had a confidence field that took both int and float values across my dataset. This led to an error when I was adding my samples to my dataset, and the error was the one mentioned above.

Ideally, I would get an error message naming the specific field that causes the issue and explaining that the user is responsible for ensuring type consistency.
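As a rough illustration of the kind of message I mean, here is a sketch of a pre-check over plain dicts. Everything in it is an assumption made for illustration: `check_attribute_types`, the `ground_truth.detections` path, and the dict layout are hypothetical and not part of FiftyOne.

```python
def check_attribute_types(detections, path="ground_truth.detections"):
    """Raise a TypeError naming the first attribute whose type is inconsistent.

    `detections` is a list of plain dicts (hypothetical layout), one per detection.
    """
    seen = {}  # attribute name -> (type, index where it was first seen)
    for index, attrs in enumerate(detections):
        for name, value in attrs.items():
            if value is None:
                continue  # missing values do not count as a type
            value_type = type(value)
            first_type, first_index = seen.setdefault(name, (value_type, index))
            if first_type is not value_type:
                raise TypeError(
                    f"Field '{path}.{name}' has inconsistent types: "
                    f"{first_type.__name__} at index {first_index}, "
                    f"{value_type.__name__} at index {index}. "
                    "Please cast the values to a single type before adding samples."
                )


# Example: "confidence" is an int in one detection and a float in the next
detections = [
    {"label": "cat", "confidence": 1},
    {"label": "dog", "confidence": 0.87},
]

try:
    check_attribute_types(detections)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```

A check like this could also be run on the user side before adding samples, which is a workable stopgap if int and float values are cast to float first.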

Something similar is discussed in the Dynamic Attribute documentation, which I'm guessing is related.

Sorry I can't do more to help. Feel free to close this if the information I've provided is not actionable enough.