xtreme1-io / xtreme1

Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
https://www.basic.ai
Apache License 2.0
855 stars, 141 forks

data_annotation_object table doesn't have model id column. #235

Closed. chanyoung1998 closed 5 months ago

chanyoung1998 commented 5 months ago

There are two ways to create a BoundingBox in 3D LiDAR annotation, or in any annotation task: run a model or draw it manually.

My team needs to identify which model created which BoundingBox. Why does the data_annotation_object table not have a model_id column? If adding model_id to the table is acceptable, can I open a pull request for it?

jaggerwang commented 5 months ago

For rows with source_type "MODEL", the source_id in table data_annotation_object is the primary key id of table model_run_record, and from that model run record you can get the model_id.
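A minimal sketch of that lookup, written in Python with an in-memory SQLite database so it runs standalone. The table and column names come from this thread; the reduced column set, the sample rows, and the helper names build_demo_db and model_ids_by_object are assumptions for illustration, not the real Xtreme1 schema or code.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in schema (assumed columns only) with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE model_run_record (
            id INTEGER PRIMARY KEY,
            model_id INTEGER NOT NULL
        );
        CREATE TABLE data_annotation_object (
            id INTEGER PRIMARY KEY,
            source_type TEXT NOT NULL,
            source_id INTEGER
        );
        INSERT INTO model_run_record (id, model_id) VALUES (10, 3), (11, 7);
        INSERT INTO data_annotation_object (id, source_type, source_id) VALUES
            (1, 'MODEL', 10),
            (2, 'MODEL', 11),
            (3, 'DATA_FLOW', NULL);
        """
    )
    return conn


def model_ids_by_object(conn: sqlite3.Connection) -> dict:
    """Map each MODEL-sourced annotation object id to the model_id of its run."""
    rows = conn.execute(
        """
        SELECT o.id, r.model_id
        FROM data_annotation_object AS o
        JOIN model_run_record AS r ON r.id = o.source_id
        WHERE o.source_type = 'MODEL'
        ORDER BY o.id
        """
    ).fetchall()
    return dict(rows)


conn = build_demo_db()
print(model_ids_by_object(conn))  # {1: 3, 2: 7}
```

Objects whose source_type is not "MODEL" (object 3 in the sample data) do not appear in the result, because their source_id does not point at a model run record.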

chanyoung1998 commented 5 months ago

@jaggerwang

Thanks for your comment! More specifically, as far as I know, when I go to the Model tab and click the Run Model button for a specific model, a model_run_record row is inserted and the data_annotation_object rows are inserted with type "MODEL".

But the annotation interface also has a run-model button that looks like a "brain". When I click that button, a model_data_result row is inserted. When I then save or submit the annotation results, the model_data_result row is removed and the data_annotation_object rows are inserted as "DATA_FLOW". So there is no way to identify the model_id for this kind of data_annotation_object.

So I am wondering whether this is intended. If not, I would suggest that the data_annotation_object be inserted as "MODEL" and that either the model_data_result row be retained or a new model_run_record be inserted. Alternatively, instead of removing the model_data_result row, what about clearing only its model_result column and adding an is_deleted column?

What do you think of it?

jaggerwang commented 5 months ago

I'm sorry for the misunderstanding. You mean calling the model for a single data item inside the annotation tool, not running it in bulk from the model page. The two paths store their results differently. In a bulk run, the model results are written directly into the database. With an individual call in the tool, the annotator first reviews the results and may modify them before saving, so they are treated as manually annotated results rather than model-annotated results. The results of individual calls are still written to the database temporarily because the call is asynchronous: the model output needs a place to wait until the front end retrieves it by polling. Once the front end has fetched it, it is no longer needed.
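The individual-call flow can be sketched roughly as below. This is an illustration only, not Xtreme1's code: the two in-memory containers stand in for the model_data_result and data_annotation_object tables, every function name is hypothetical, and the temporary result is dropped at save time, following the behavior reported earlier in this thread.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-ins for the two tables discussed in this thread.
temp_results: Dict[int, List[dict]] = {}  # plays the role of model_data_result
saved_objects: List[dict] = []            # plays the role of data_annotation_object


def on_model_finished(data_id: int, boxes: List[dict]) -> None:
    """Asynchronous model call completes: park the result until the UI fetches it."""
    temp_results[data_id] = boxes


def poll_result(data_id: int) -> Optional[List[dict]]:
    """Front-end poll: returns None while the model is still running."""
    return temp_results.get(data_id)


def save_annotation(data_id: int, boxes: List[dict]) -> None:
    """Annotator saves reviewed boxes: stored as DATA_FLOW, temp result dropped."""
    for box in boxes:
        saved_objects.append(
            {"data_id": data_id, "source_type": "DATA_FLOW", "box": box}
        )
    temp_results.pop(data_id, None)


# Walk through one individual call for data item 42.
print(poll_result(42))                    # no result yet
on_model_finished(42, [{"label": "car", "size": [4.0, 1.8, 1.5]}])
boxes = poll_result(42)                   # the poll now returns the model output
boxes[0]["size"][0] = 4.2                 # the annotator adjusts the box
save_annotation(42, boxes)
print(saved_objects[0]["source_type"])    # DATA_FLOW
print(poll_result(42))                    # temporary result is gone
```

Because the saved row carries no link back to the model call, the model that proposed the box cannot be recovered afterwards, which matches the behavior described in the question.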

chanyoung1998 commented 5 months ago

Thank you for your kind opinion! I fully understand the difference between the two uses of "model run," and agree with this method of processing data_annotation_object.