ocean-data-factory-sweden / kso

Notebooks to upload/download marine footage, connect to a citizen science project, train machine learning models and publish marine biological observations.
GNU General Public License v3.0
4 stars 12 forks

UnboundLocalError in Process classifications notebook #416

Closed Bergylta closed 5 days ago

Bergylta commented 6 days ago

🐛 Bug

To Reproduce (REQUIRED)

Choose location of output /cache/album/cache/kso-user/bucket/tmp_dir/Emil_testing

Input:

# Run the preparation script
mlp.prepare_dataset(
    agg_df=pp.aggregated_zoo_classifications,
    out_path=output_folder.selected,
    img_size=(720, 540),
    perc_test=percentage_test.value,
    out_format="yolo",
    track_frames=True,
)

Output:

UnboundLocalError                         Traceback (most recent call last)
File /cache/album/cache/kso-user/kso/kso_utils/project.py:1421, in MLProjectProcessor.prepare_dataset.<locals>.on_button_clicked(b)
   1419 self.species_of_interest = species_list.value
   1420 # code for prepare dataset for machine learning
-> 1421 self.modules["yolo_utils"].frame_aggregation(
   1422     project=self.project,
   1423     server_connection=self.server_connection,
   1424     db_connection=self.db_connection,
   1425     out_path=out_path,
   1426     perc_test=perc_test,
   1427     class_list=self.species_of_interest,
   1428     img_size=img_size,
   1429     remove_nulls=remove_nulls,
   1430     track_frames=track_frames,
   1431     n_tracked_frames=n_tracked_frames,
   1432     agg_df=agg_df,
   1433     out_format=out_format,
   1434 )

File /cache/album/cache/kso-user/kso/kso_utils/yolo_utils.py:842, in frame_aggregation(project, server_connection, db_connection, out_path, perc_test, class_list, img_size, out_format, remove_nulls, track_frames, n_tracked_frames, agg_df)
    839             f_group_fields = ["filename"]
    841 if out_format in ["yolo", "yolo-seg"]:
--> 842     col_list = list(full_rows.columns)
    843     fw_pos, fh_pos, speciesid_pos = (
    844         col_list.index("f_w"),
    845         col_list.index("f_h"),
    846         col_list.index("species_id"),
    847     )
    849     if out_format == "yolo":

UnboundLocalError: local variable 'full_rows' referenced before assignment
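
For readers hitting the same message: Python raises UnboundLocalError when a function reads a local name that is only assigned on a branch that did not run. The sketch below is a minimal, hypothetical reproduction of that pattern and one defensive variant. The function names and the `build` flag are made up for illustration and are not taken from `kso_utils`.

```python
def count_rows(records, build):
    """Hypothetical stand-in for illustration; not the kso code."""
    if build:
        full_rows = list(records)  # bound only when this branch runs
    # If build is False, full_rows was never assigned in this scope.
    return len(full_rows)


def count_rows_safe(records, build):
    """Same logic, but the name is always bound and the gap is reported clearly."""
    full_rows = None
    if build:
        full_rows = list(records)
    if full_rows is None:
        raise ValueError("no rows were built for the requested options")
    return len(full_rows)


try:
    count_rows([1, 2, 3], build=False)
except UnboundLocalError as err:
    error_name = type(err).__name__
    print(error_name)  # UnboundLocalError
```

Binding the name before the branches turns a confusing scoping error into an explicit message about which option combination produced no data.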

Additional context

I am getting the UnboundLocalError in the terminal; the notebook itself does not show any error. I think I might be messing up the location of the output here.

jannesgg commented 6 days ago

@Bergylta Try to do a git pull and retry this.

Bergylta commented 6 days ago

I think that fixed it. Unfortunately, there are of course other errors that pop up at the same location, @jannesgg:

AssertionError                            Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py:982, in _finalize_columns_and_data(content, columns, dtype)
    981 try:
--> 982     columns = _validate_or_indexify_columns(contents, columns)
    983 except AssertionError as err:
    984     # GH#26429 do not raise user-facing AssertionError

File ~/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py:1030, in _validate_or_indexify_columns(content, columns)
   1028 if not is_mi_list and len(columns) != len(content):  # pragma: no cover
   1029     # caller's responsibility to check for this...
-> 1030     raise AssertionError(
   1031         f"{len(columns)} columns passed, passed data had "
   1032         f"{len(content)} columns"
   1033     )
   1034 elif is_mi_list:
   1035 
   1036     # check if nested list column, length of each sub-list should be equal

AssertionError: 8 columns passed, passed data had 9 columns

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
File /cache/album/cache/kso-user/kso/kso_utils/project.py:1421, in MLProjectProcessor.prepare_dataset.<locals>.on_button_clicked(b)
   1419 self.species_of_interest = species_list.value
   1420 # code for prepare dataset for machine learning
-> 1421 self.modules["yolo_utils"].frame_aggregation(
   1422     project=self.project,
   1423     server_connection=self.server_connection,
   1424     db_connection=self.db_connection,
   1425     out_path=out_path,
   1426     perc_test=perc_test,
   1427     class_list=self.species_of_interest,
   1428     img_size=img_size,
   1429     remove_nulls=remove_nulls,
   1430     track_frames=track_frames,
   1431     n_tracked_frames=n_tracked_frames,
   1432     agg_df=agg_df,
   1433     out_format=out_format,
   1434 )

File /cache/album/cache/kso-user/kso/kso_utils/yolo_utils.py:817, in frame_aggregation(project, server_connection, db_connection, out_path, perc_test, class_list, img_size, out_format, remove_nulls, track_frames, n_tracked_frames, agg_df)
    815 if out_format == "yolo":
    816     column_names.extend(["x", "y", "w", "h"])
--> 817     full_rows = pd.DataFrame(new_rows, columns=column_names)
    818 elif out_format == "yolo-seg":
    819     # Determine the maximum number of (x, y) pairs in new_rows
    820     max_num_points = max(len(entry[4:]) for entry in new_rows)

File ~/.local/lib/python3.10/site-packages/pandas/core/frame.py:722, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    717     if columns is not None:
    718         # error: Argument 1 to "ensure_index" has incompatible type
    719         # "Collection[Any]"; expected "Union[Union[Union[ExtensionArray,
    720         # ndarray], Index, Series], Sequence[Any]]"
    721         columns = ensure_index(columns)  # type: ignore[arg-type]
--> 722     arrays, columns, index = nested_data_to_arrays(
    723         # error: Argument 3 to "nested_data_to_arrays" has incompatible
    724         # type "Optional[Collection[Any]]"; expected "Optional[Index]"
    725         data,
    726         columns,
    727         index,  # type: ignore[arg-type]
    728         dtype,
    729     )
    730     mgr = arrays_to_mgr(
    731         arrays,
    732         columns,
   (...)
    735         typ=manager,
    736     )
    737 else:

File ~/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py:519, in nested_data_to_arrays(data, columns, index, dtype)
    516 if is_named_tuple(data[0]) and columns is None:
    517     columns = ensure_index(data[0]._fields)
--> 519 arrays, columns = to_arrays(data, columns, dtype=dtype)
    520 columns = ensure_index(columns)
    522 if index is None:

File ~/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py:883, in to_arrays(data, columns, dtype)
    880     data = [tuple(x) for x in data]
    881     arr = _list_to_arrays(data)
--> 883 content, columns = _finalize_columns_and_data(arr, columns, dtype)
    884 return content, columns

File ~/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py:985, in _finalize_columns_and_data(content, columns, dtype)
    982     columns = _validate_or_indexify_columns(contents, columns)
    983 except AssertionError as err:
    984     # GH#26429 do not raise user-facing AssertionError
--> 985     raise ValueError(err) from err
    987 if len(contents) and contents[0].dtype == np.object_:
    988     contents = _convert_object_array(contents, dtype=dtype)

ValueError: 8 columns passed, passed data had 9 columns
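
The message says that the rows handed to `pd.DataFrame(new_rows, columns=column_names)` are one value wider than the list of column names. A quick way to narrow this down is to compare each row's length with the header before building the frame. The helper below is a hypothetical debugging sketch, not part of `kso_utils`. The column names reuse identifiers that appear in the tracebacks, but their order and completeness are assumptions, and the sample rows are invented.

```python
def find_width_mismatches(rows, column_names):
    """Return (row_index, row_width) for every row whose length differs from the header."""
    expected = len(column_names)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


# Assumed header: 8 names, matching the "8 columns passed" part of the message.
column_names = ["species_id", "filename", "f_w", "f_h", "x", "y", "w", "h"]

# Invented sample data: the second row carries one extra value.
new_rows = [
    [1, "frame_0.jpg", 720, 540, 0.5, 0.5, 0.1, 0.1],
    [1, "frame_1.jpg", 720, 540, 0.4, 0.6, 0.1, 0.1, 7],
]

mismatches = find_width_mismatches(new_rows, column_names)
print(mismatches)  # [(1, 9)]
```

Printing the offending rows shows which extra field is being appended, which in turn indicates whether the row builder or the column list needs updating.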