pylabel-project / pylabel

Python library for computer vision labeling tasks. The core functionality is to translate bounding box annotations between different formats, for example from COCO to YOLO.
MIT License
317 stars 56 forks

AssertionError: Output shape does not match input shape. Data loss has occured. #140

Closed R-N closed 11 months ago

R-N commented 11 months ago

Code:

from pylabel import importer

dataset = importer.ImportYoloV5("labels", path_to_images="../images")
dataset.splitter.StratifiedGroupShuffleSplit(train_pct=.8, val_pct=.0, test_pct=.2, batch_size=1)

Environment:

Error:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[13], line 1
----> 1 dataset.splitter.StratifiedGroupShuffleSplit(train_pct=.8, val_pct=.0, test_pct=.2, batch_size=1)

File ~\AppData\Roaming\Python\Python310\site-packages\pylabel\splitter.py:223, in Split.StratifiedGroupShuffleSplit(self, train_pct, test_pct, val_pct, weight, group_col, cat_col, batch_size)
    218 df_val["split"] = "val"
    220 df = pd.concat([df_train, pd.concat([df_test, df_val])])
    222 assert (
--> 223     df.shape == df_main.shape
    224 ), "Output shape does not match input shape. Data loss has occured."
    226 self.dataset.df = df
    227 self.dataset.df = self.dataset.df.reset_index(drop=True)

AssertionError: Output shape does not match input shape. Data loss has occured.

I have no idea what that means and what I should (or shouldn't) do.

alexheat commented 11 months ago

@R-N I have resolved the issue in the latest version.
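
For context on what the assertion checks: judging only from the traceback, the splitter concatenates the train, test, and val frames and compares the resulting shape with the shape of the input frame. The snippet below is a minimal pandas sketch of that kind of check, not the actual pylabel code. The helper name `recombine_splits`, the toy data, and the assumption that the input frame already has a `split` column are all hypothetical.

```python
import pandas as pd


def recombine_splits(df_main, df_train, df_test, df_val):
    """Label each split, concatenate them, and verify that no rows were lost."""
    df_train = df_train.assign(split="train")
    df_test = df_test.assign(split="test")
    df_val = df_val.assign(split="val")
    df = pd.concat([df_train, df_test, df_val])
    # Same kind of check as in the traceback: rows and columns must match the input.
    assert df.shape == df_main.shape, "Output shape does not match input shape."
    return df.reset_index(drop=True)


# Toy annotation table (hypothetical data); assumed to already have a "split" column.
df_main = pd.DataFrame(
    {
        "img_filename": ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
        "cat_name": ["cat", "dog", "cat", "dog"],
        "split": "",
    }
)

# Every row lands in exactly one split, so the shapes match.
ok = recombine_splits(df_main, df_main.iloc[:3], df_main.iloc[3:], df_main.iloc[0:0])

# Row "c.jpg" is assigned to no split, so the assertion fires.
try:
    recombine_splits(df_main, df_main.iloc[:2], df_main.iloc[3:], df_main.iloc[0:0])
    lost_rows_detected = False
except AssertionError:
    lost_rows_detected = True
```

In other words, the error means that some annotation rows ended up in none of the splits (or in more than one), so the recombined table no longer has the same number of rows as the original.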

R-N commented 11 months ago

@alexheat Thank you. The error went away, but ShowClassSplits is empty.


R-N commented 11 months ago

I see the problem: dataset.df["cat_name"] is empty.

R-N commented 11 months ago

Okay, it seems to be empty from the start, right after loading, so it's a different issue.
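
A quick way to confirm this right after loading is sketched below. It assumes dataset.df is a plain pandas DataFrame with a cat_name column; the helper `summarize_categories` and the stand-in data are hypothetical. Note that YOLO label files store only numeric class ids, so the names have to be supplied from somewhere else.

```python
import pandas as pd


def summarize_categories(df, cat_col="cat_name"):
    """Return counts of non-empty category names and the number of empty ones."""
    names = df[cat_col].fillna("").astype(str).str.strip()
    empty_count = int((names == "").sum())
    counts = names[names != ""].value_counts().to_dict()
    return counts, empty_count


# Stand-in for dataset.df right after import (hypothetical values).
df = pd.DataFrame(
    {
        "img_filename": ["a.jpg", "b.jpg", "c.jpg"],
        "cat_id": ["0", "1", "0"],
        "cat_name": ["", "", ""],
    }
)

counts, empty_count = summarize_categories(df)
print("named categories:", counts)
print("rows with empty cat_name:", empty_count)
```

If every row is reported as empty, any per-class summary grouped by cat_name has nothing to show, which would match the empty ShowClassSplits output.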
