Notebooks to upload/download marine footage, connect to a citizen science project, train machine learning models and publish marine biological observations.
GNU General Public License v3.0
Notebook 8 issue with processing frames for ML learning #295
Before submitting a bug report, please be aware that your issue must be reproducible with all of the following, otherwise it is non-actionable, and we can not help you:
Current repo: run git fetch && git status -uno to check and git pull to update repo
If this is a custom dataset/training question you must include your train*.jpg, test*.jpg and results.png figures, or we can not help you. You can generate these with utils.plot_results().
🐛 Bug
A clear and concise description of what the bug is.
To Reproduce (REQUIRED)
Input:
Test proportion: 0.2
# Run the preparation script
mlp.prepare_dataset(
    agg_df=pp.aggregated_zoo_classifications,
    out_path=output_folder.selected,
    img_size=(720, 540),
    perc_test=percentage_test.value,
)
Species chosen: Protanthea simplex (497 annotations)
Output:
IndexError Traceback (most recent call last)
File /usr/src/app/kso-dev/kso_utils/project.py:1137, in MLProjectProcessor.prepare_dataset.<locals>.on_button_clicked(b)
1135 self.species_of_interest = species_list.value
1136 # code for prepare dataset for machine learning
-> 1137 self.modules["yolo_utils"].frame_aggregation(
1138 project=self.project,
1139 server_connection=self.server_connection,
1140 db_connection=self.db_connection,
1141 out_path=out_path,
1142 perc_test=perc_test,
1143 class_list=self.species_of_interest,
1144 img_size=img_size,
1145 remove_nulls=remove_nulls,
1146 track_frames=track_frames,
1147 n_tracked_frames=n_tracked_frames,
1148 agg_df=agg_df,
1149 )
File /usr/src/app/kso-dev/kso_utils/yolo_utils.py:458, in frame_aggregation(project, server_connection, db_connection, out_path, perc_test, class_list, img_size, out_format, remove_nulls, track_frames, n_tracked_frames, agg_df)
456 # Add species_id to train_rows
457 if "species_id" not in train_rows.columns:
--> 458 train_rows["species_id"] = train_rows["label"].apply(
459 lambda x: species_df[species_df.commonName == x].id.values[0]
460 if x != "empty"
461 else "empty",
462 1,
463 )
464 train_rows.drop(columns=["label"], axis=1, inplace=True)
466 sp_id2mod_id = {
467 species_df[species_df.clean_label == species_list[i]].id.values[0]: i
468 for i in range(len(species_list))
469 }
File /usr/local/lib/python3.8/dist-packages/pandas/core/series.py:4430, in Series.apply(self, func, convert_dtype, args, **kwargs)
4320 def apply(
4321 self,
4322 func: AggFuncType,
(...)
4325 **kwargs,
4326 ) -> DataFrame | Series:
4327 """
4328 Invoke function on values of Series.
4329
(...)
4428 dtype: float64
4429 """
-> 4430 return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
File /usr/local/lib/python3.8/dist-packages/pandas/core/apply.py:1082, in SeriesApply.apply(self)
1078 if isinstance(self.f, str):
1079 # if we are a string, try to dispatch
1080 return self.apply_str()
-> 1082 return self.apply_standard()
File /usr/local/lib/python3.8/dist-packages/pandas/core/apply.py:1137, in SeriesApply.apply_standard(self)
1131 values = obj.astype(object)._values
1132 # error: Argument 2 to "map_infer" has incompatible type
1133 # "Union[Callable[..., Any], str, List[Union[Callable[..., Any], str]],
1134 # Dict[Hashable, Union[Union[Callable[..., Any], str],
1135 # List[Union[Callable[..., Any], str]]]]]"; expected
1136 # "Callable[[Any], Any]"
-> 1137 mapped = lib.map_infer(
1138 values,
1139 f, # type: ignore[arg-type]
1140 convert=self.convert_dtype,
1141 )
1143 if len(mapped) and isinstance(mapped[0], ABCSeries):
1144 # GH#43986 Need to do list(mapped) in order to get treated as nested
1145 # See also GH#25959 regarding EA support
1146 return obj._constructor_expanddim(list(mapped), index=obj.index)
File /usr/local/lib/python3.8/dist-packages/pandas/_libs/lib.pyx:2870, in pandas._libs.lib.map_infer()
File /usr/src/app/kso-dev/kso_utils/yolo_utils.py:459, in frame_aggregation.<locals>.<lambda>(x)
456 # Add species_id to train_rows
457 if "species_id" not in train_rows.columns:
458 train_rows["species_id"] = train_rows["label"].apply(
--> 459 lambda x: species_df[species_df.commonName == x].id.values[0]
460 if x != "empty"
461 else "empty",
462 1,
463 )
464 train_rows.drop(columns=["label"], axis=1, inplace=True)
466 sp_id2mod_id = {
467 species_df[species_df.clean_label == species_list[i]].id.values[0]: i
468 for i in range(len(species_list))
469 }
IndexError: index 0 is out of bounds for axis 0 with size 0
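For reference, the final error can be reproduced in isolation. This is a minimal sketch, not the kso_utils code: the species table and its values are made up, and only the column names (id, commonName, clean_label) are taken from the traceback. It shows that filtering on a label with no matching commonName gives an empty selection, and that indexing `.values[0]` on that empty selection raises the same IndexError.

```python
import pandas as pd

# Hypothetical stand-in for species_df; values are invented for illustration,
# column names are the ones visible in the traceback.
species_df = pd.DataFrame(
    {
        "id": [1, 2],
        "commonName": ["Sea anemone", "Cod"],
        "clean_label": ["protanthea_simplex", "gadus_morhua"],
    }
)

# A label that does not appear in the commonName column of this toy table.
label = "Protanthea simplex"

# Boolean-mask selection: no row matches, so the result is empty.
matches = species_df[species_df.commonName == label]
print(len(matches))

# Indexing the first element of an empty array raises IndexError.
message = None
try:
    species_id = matches.id.values[0]
except IndexError as err:
    message = str(err)
    print(message)
```

If this is what happens in frame_aggregation, at least one value in train_rows["label"] has no exact match in species_df.commonName.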
Expected behavior
Additional context
Might it be the
File /usr/src/app/kso-dev/kso_utils/yolo_utils.py:459, in frame_aggregation.<locals>.<lambda>(x)
456 # Add species_id to train_rows
457 if "species_id" not in train_rows.columns:
458 train_rows["species_id"] = train_rows["label"].apply(
--> 459 lambda x: species_df[species_df.commonName == x].id.values[0]
460 if x != "empty"
461 else "empty",
462 1,
463 )
that is the issue here?
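One way to check this would be to list the labels that have no match in the species table before the lookup runs. The sketch below is only an illustration under assumptions: `find_unmatched_labels` and `lookup_species_id` are hypothetical helpers that do not exist in kso_utils, and the data is invented. Only the column names and the "empty" sentinel come from the traceback.

```python
import pandas as pd


def find_unmatched_labels(labels, species_df, column="commonName"):
    """Return the sorted labels (other than "empty") absent from species_df[column].

    Hypothetical helper, not part of kso_utils.
    """
    known = set(species_df[column])
    return sorted({lab for lab in labels if lab != "empty" and lab not in known})


def lookup_species_id(label, species_df, column="commonName"):
    """Look up a species id, raising a descriptive error when nothing matches.

    Hypothetical helper, not part of kso_utils.
    """
    if label == "empty":
        return "empty"
    matches = species_df.loc[species_df[column] == label, "id"]
    if matches.empty:
        raise ValueError(f"No species with {column} == {label!r}")
    return matches.iloc[0]


# Invented data for illustration only.
species_df = pd.DataFrame(
    {"id": [1, 2], "commonName": ["Sea anemone", "Cod"]}
)
train_labels = pd.Series(["Cod", "Protanthea simplex", "empty"])

unmatched = find_unmatched_labels(train_labels, species_df)
print(unmatched)
```

If the list is not empty, the labels in the aggregated classifications and the commonName values in the species table disagree, for example in spelling, capitalisation, or scientific versus common name, and the `.values[0]` lookup would fail on those rows.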