Closed ShrimpFather7 closed 11 months ago
@jannesgg
@ShrimpFather7 On my side, this was fixed 2 days ago as part of #327. Could you make sure you are on the dev branch and also do a git pull just to be safe?
Git pull removed that error. However, now I'm getting this one:
KeyError                                  Traceback (most recent call last)
File /usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py:3621, in Index.get_loc(self, key, method, tolerance)
   3620 try:
-> 3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
File /usr/local/lib/python3.8/dist-packages/pandas/_libs/index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()
File /usr/local/lib/python3.8/dist-packages/pandas/_libs/index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()
File pandas/_libs/hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()
File pandas/_libs/hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'frame_path'
The above exception was the direct cause of the following exception:
KeyError                                  Traceback (most recent call last)
File /usr/src/app/kso-dev/kso_utils/project.py:1022, in ProjectProcessor.generate_custom_frames.
File /usr/src/app/kso-dev/kso_utils/zooniverse_utils.py:1724, in modify_frames(project, frames_to_upload_df, species_i, modification_details)
   1721 mod_frames_folder = project.output_path + mod_frames_folder
   1723 # Specify the path of the modified frames
-> 1724 frames_to_upload_df["modif_frame_path"] = frames_to_upload_df["frame_path"].apply(
   1725     lambda x: str(Path(mod_frames_folder, Path(x).name)),
   1726 )
   1728 # Remove existing modified clips
   1729 if os.path.exists(mod_frames_folder):
File /usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:3506, in DataFrame.__getitem__(self, key)
   3504 if self.columns.nlevels > 1:
   3505     return self._getitem_multilevel(key)
-> 3506 indexer = self.columns.get_loc(key)
   3507 if is_integer(indexer):
   3508     indexer = [indexer]
File /usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.
   3628     self._check_indexing_error(key)
KeyError: 'frame_path'
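The final `KeyError: 'frame_path'` says that the DataFrame reaching `modify_frames` has no `frame_path` column. As a minimal sketch (not the kso_utils implementation), a guard like the one below would fail earlier with a message that lists the columns actually present. The helper name `add_modified_frame_paths` and the example paths are hypothetical; only the `frame_path` and `modif_frame_path` column names come from the traceback.

```python
from pathlib import Path

import pandas as pd


def add_modified_frame_paths(frames_df: pd.DataFrame, mod_frames_folder: str) -> pd.DataFrame:
    """Hypothetical helper: add 'modif_frame_path' after checking the input columns."""
    required = ["frame_path"]
    missing = [col for col in required if col not in frames_df.columns]
    if missing:
        # Fail with a readable message instead of a bare KeyError from pandas
        raise ValueError(
            f"Missing column(s) {missing}; available columns: {list(frames_df.columns)}"
        )

    out = frames_df.copy()
    # Same path logic as the line shown in the traceback
    out["modif_frame_path"] = out["frame_path"].apply(
        lambda x: str(Path(mod_frames_folder, Path(x).name))
    )
    return out


if __name__ == "__main__":
    df = pd.DataFrame({"frame_path": ["/tmp/frames/frame_1.jpg"]})
    print(add_modified_frame_paths(df, "/tmp/modified"))
```

If the guard triggers, the list of available columns usually shows whether frame generation produced an empty or differently named table.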
@ShrimpFather7 Could you try running it again? I think the files should have an mp4 extension, but for some reason they kept the original mpg extension. I have renamed them manually to test this. Please have a look and see if this helps.
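For anyone wanting to repeat the manual rename on their own files, here is a rough sketch using only `pathlib`. The function name `rename_mpg_to_mp4` and the folder are hypothetical. Note that renaming only changes the extension and does not transcode the video, so this assumes the file contents are already in the expected format.

```python
from pathlib import Path


def rename_mpg_to_mp4(folder, dry_run=True):
    """Hypothetical helper: plan (and optionally perform) .mpg -> .mp4 renames."""
    planned = []
    for src in sorted(Path(folder).glob("*.mpg")):
        dst = src.with_suffix(".mp4")
        if dst.exists():
            # Never overwrite an existing .mp4 file
            continue
        planned.append((src, dst))
        if not dry_run:
            src.rename(dst)
    return planned


if __name__ == "__main__":
    # Dry run first: prints what would be renamed without touching any file
    for src, dst in rename_mpg_to_mp4(".", dry_run=True):
        print(f"{src.name} -> {dst.name}")
```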
@jannesgg When running through the tutorial again, I've been running into issues with compressing the frames. I get the following error:
File
I'm not sure if this is related to the issue you were talking about or if it's a new one. But this has gotten in the way of me reaching the "Upload frames to zooniverse" cell. When I run that cell, however, I get this error:
IntCastingNaNError                        Traceback (most recent call last)
Cell In[18], line 1
----> 1 pp.upload_zoo_subjects("frame")
File /usr/src/app/kso-dev/kso_utils/project.py:781, in ProjectProcessor.upload_zoo_subjects(self, subject_type)
    778     logging.info(f"Clips temporarily stored locally has been removed")
    780 elif subject_type == "frame":
--> 781     upload_df = zoo_utils.set_zoo_frame_metadata(
    782         project=self.project,
    783         db_connection=self.db_connection,
    784         df=self.generated_frames,
    785         species_list=self.species_of_interest,
    786         csv_paths=self.csv_paths,
    787     )
    788     zoo_utils.upload_frames_to_zooniverse(
    789         project=self.project,
    790         upload_to_zoo=upload_df,
    791         species_list=self.species_of_interest,
    792     )
    794 else:
File /usr/src/app/kso-dev/kso_utils/zooniverse_utils.py:1826, in set_zoo_frame_metadata(project, db_connection, df, species_list, csv_paths)
   1824 # Set project-specific metadata
   1825 if project.Zooniverse_number == 9747:
-> 1826     df = add_db_info_to_df(
   1827         project, db_connection, csv_paths, df, "sites", "id, siteName"
   1828     )
   1829     upload_to_zoo = df[
   1830         [
   1831             "frame_path",
   (...)
   1837         ]
   1838     ]
   1840 elif project_name == "SGU":
File /usr/src/app/kso-dev/kso_utils/db_utils.py:515, in add_db_info_to_df(project, conn, csv_paths, df, table_name, cols_interest)
    513 # Ensure id columns that are going to be used to merge are int
    514 if "id" in left_on_col:
--> 515     df[left_on_col] = df[left_on_col].astype(float).astype(int)
    517 # Combine the original and sqldf dfs
    518 comb_df = pd.merge(
    519     df, sql_df, how="left", left_on=left_on_col, right_on=right_on_col
    520 )
File /usr/local/lib/python3.8/dist-packages/pandas/core/generic.py:5920, in NDFrame.astype(self, dtype, copy, errors)
   5913     results = [
   5914         self.iloc[:, i].astype(dtype, copy=copy)
   5915         for i in range(len(self.columns))
   5916     ]
   5918 else:
   5919     # else, only a single dtype is given
-> 5920     new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   5921     return self._constructor(new_data).__finalize__(self, method="astype")
   5923 # GH 33113: handle empty frame or series
File /usr/local/lib/python3.8/dist-packages/pandas/core/internals/managers.py:419, in BaseBlockManager.astype(self, dtype, copy, errors)
    418 def astype(self: T, dtype, copy: bool = False, errors: str = "raise") -> T:
--> 419     return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
File /usr/local/lib/python3.8/dist-packages/pandas/core/internals/managers.py:304, in BaseBlockManager.apply(self, f, align_keys, ignore_failures, **kwargs)
    302         applied = b.apply(f, **kwargs)
    303     else:
--> 304         applied = getattr(b, f)(**kwargs)
    305 except (TypeError, NotImplementedError):
    306     if not ignore_failures:
File /usr/local/lib/python3.8/dist-packages/pandas/core/internals/blocks.py:582, in Block.astype(self, dtype, copy, errors)
    564 """
    565 Coerce to the new dtype.
    566
   (...)
    578 Block
    579 """
    580 values = self.values
--> 582 new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
    584 new_values = maybe_coerce_values(new_values)
    585 newb = self.make_block(new_values)
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1292, in astype_array_safe(values, dtype, copy, errors)
   1289     dtype = dtype.numpy_dtype
   1291 try:
-> 1292     new_values = astype_array(values, dtype, copy=copy)
   1293 except (ValueError, TypeError):
   1294     # e.g. astype_nansafe can fail on object-dtype of strings
   1295     #  trying to convert to float
   1296     if errors == "ignore":
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1237, in astype_array(values, dtype, copy)
   1234     values = values.astype(dtype, copy=copy)
   1236 else:
-> 1237     values = astype_nansafe(values, dtype, copy=copy)
   1239 # in pandas we don't store numpy str dtypes, so convert to object
   1240 if isinstance(dtype, np.dtype) and issubclass(values.dtype.type, str):
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1148, in astype_nansafe(arr, dtype, copy, skipna)
   1145     raise TypeError(f"cannot astype a timedelta from [{arr.dtype}] to [{dtype}]")
   1147 elif np.issubdtype(arr.dtype, np.floating) and np.issubdtype(dtype, np.integer):
-> 1148     return astype_float_to_int_nansafe(arr, dtype, copy)
   1150 elif is_object_dtype(arr.dtype):
   1151
   1152     # work around NumPy brokenness, #1987
   1153     if np.issubdtype(dtype.type, np.integer):
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1193, in astype_float_to_int_nansafe(values, dtype, copy)
   1189 """
   1190 astype with a check preventing converting NaN to an meaningless integer value.
   1191 """
   1192 if not np.isfinite(values).all():
-> 1193     raise IntCastingNaNError(
   1194         "Cannot convert non-finite values (NA or inf) to integer"
   1195     )
   1196 return values.astype(dtype, copy=copy)
IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer
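The traceback ends in `df[left_on_col].astype(float).astype(int)`, which raises as soon as the id column holds a missing value, because NaN has no integer representation. A minimal sketch of one way to find the offending rows before casting is shown below. The column name `site_id`, the helper `split_castable_ids`, and the sample data are assumptions for illustration and are not taken from kso_utils.

```python
import numpy as np
import pandas as pd


def split_castable_ids(df: pd.DataFrame, id_col: str):
    """Hypothetical helper: return (rows with integer ids, rows whose id is missing)."""
    # Non-numeric entries become NaN instead of raising
    as_number = pd.to_numeric(df[id_col], errors="coerce")
    finite = np.isfinite(as_number)

    good = df.loc[finite].copy()
    # Safe now: every remaining value is finite
    good[id_col] = as_number[finite].astype(int)
    bad = df.loc[~finite]
    return good, bad


if __name__ == "__main__":
    frames = pd.DataFrame(
        {
            "site_id": [1.0, np.nan, "3"],  # assumed column name
            "frame_path": ["a.jpg", "b.jpg", "c.jpg"],
        }
    )
    good, bad = split_castable_ids(frames, "site_id")
    print("Rows that can be merged:")
    print(good)
    print("Rows with a missing id (worth checking the source table for these):")
    print(bad)
```

Inspecting the rows with a missing id is likely more useful than silently dropping them, since they point at frames that could not be matched to a site.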
@ShrimpFather7 This issue is probably caused by the quote character (the apostrophe) in the folder name "modified_Dead man's", which breaks the string creation. I will see if I can solve this and update here if I find a solution.
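The thread does not show where the string is built, so the following is only a sketch of two generic ways to make a name such as "modified_Dead man's" safe: replacing problematic characters before using it as a folder name, or quoting it if it is interpolated into a shell command. The helper `safe_folder_name` is hypothetical and not part of kso_utils.

```python
import re
import shlex


def safe_folder_name(name: str) -> str:
    """Hypothetical helper: keep letters, digits, '.', '_' and '-'; replace the rest."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", name)
    return cleaned.strip("_")


if __name__ == "__main__":
    folder = "modified_Dead man's"

    # Option 1: sanitize the name so no quoting is needed later
    print(safe_folder_name(folder))  # modified_Dead_man_s

    # Option 2: keep the name but quote it when building a shell command
    print("ls " + shlex.quote(folder))
```

Sanitizing changes the folder name on disk, while quoting preserves it; which one fits depends on whether other code expects the original species name in the path.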
@jannesgg The first error was resolved by changing the species name, thanks! Hope it goes well with the other issue :)
@ShrimpFather7 I have made a quick fix now, please try git pull and let me know how it goes.
Tried a git pull and ran in kso-dev. Still seems like the same error.
IntCastingNaNError                        Traceback (most recent call last)
Cell In[13], line 1
----> 1 pp.upload_zoo_subjects("frame")
File /usr/src/app/kso-dev/kso_utils/project.py:781, in ProjectProcessor.upload_zoo_subjects(self, subject_type)
    778     logging.info(f"Clips temporarily stored locally has been removed")
    780 elif subject_type == "frame":
--> 781     upload_df = zoo_utils.set_zoo_frame_metadata(
    782         project=self.project,
    783         db_connection=self.db_connection,
    784         df=self.generated_frames,
    785         species_list=self.species_of_interest,
    786         csv_paths=self.csv_paths,
    787     )
    788     zoo_utils.upload_frames_to_zooniverse(
    789         project=self.project,
    790         upload_to_zoo=upload_df,
    791         species_list=self.species_of_interest,
    792     )
    794 else:
File /usr/src/app/kso-dev/kso_utils/zooniverse_utils.py:1826, in set_zoo_frame_metadata(project, db_connection, df, species_list, csv_paths)
   1824 # Set project-specific metadata
   1825 if project.Zooniverse_number == 9747:
-> 1826     df = add_db_info_to_df(
   1827         project, db_connection, csv_paths, df, "sites", "id, siteName"
   1828     )
   1829     upload_to_zoo = df[
   1830         [
   1831             "frame_path",
   (...)
   1837         ]
   1838     ]
   1840 elif project_name == "SGU":
File /usr/src/app/kso-dev/kso_utils/db_utils.py:515, in add_db_info_to_df(project, conn, csv_paths, df, table_name, cols_interest)
    513 # Ensure id columns that are going to be used to merge are int
    514 if "id" in left_on_col:
--> 515     df[left_on_col] = df[left_on_col].astype(float).astype(int)
    517 # Combine the original and sqldf dfs
    518 comb_df = pd.merge(
    519     df, sql_df, how="left", left_on=left_on_col, right_on=right_on_col
    520 )
File /usr/local/lib/python3.8/dist-packages/pandas/core/generic.py:5920, in NDFrame.astype(self, dtype, copy, errors)
   5913     results = [
   5914         self.iloc[:, i].astype(dtype, copy=copy)
   5915         for i in range(len(self.columns))
   5916     ]
   5918 else:
   5919     # else, only a single dtype is given
-> 5920     new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   5921     return self._constructor(new_data).__finalize__(self, method="astype")
   5923 # GH 33113: handle empty frame or series
File /usr/local/lib/python3.8/dist-packages/pandas/core/internals/managers.py:419, in BaseBlockManager.astype(self, dtype, copy, errors)
    418 def astype(self: T, dtype, copy: bool = False, errors: str = "raise") -> T:
--> 419     return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
File /usr/local/lib/python3.8/dist-packages/pandas/core/internals/managers.py:304, in BaseBlockManager.apply(self, f, align_keys, ignore_failures, **kwargs)
    302         applied = b.apply(f, **kwargs)
    303     else:
--> 304         applied = getattr(b, f)(**kwargs)
    305 except (TypeError, NotImplementedError):
    306     if not ignore_failures:
File /usr/local/lib/python3.8/dist-packages/pandas/core/internals/blocks.py:582, in Block.astype(self, dtype, copy, errors)
    564 """
    565 Coerce to the new dtype.
    566
   (...)
    578 Block
    579 """
    580 values = self.values
--> 582 new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
    584 new_values = maybe_coerce_values(new_values)
    585 newb = self.make_block(new_values)
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1292, in astype_array_safe(values, dtype, copy, errors)
   1289     dtype = dtype.numpy_dtype
   1291 try:
-> 1292     new_values = astype_array(values, dtype, copy=copy)
   1293 except (ValueError, TypeError):
   1294     # e.g. astype_nansafe can fail on object-dtype of strings
   1295     #  trying to convert to float
   1296     if errors == "ignore":
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1237, in astype_array(values, dtype, copy)
   1234     values = values.astype(dtype, copy=copy)
   1236 else:
-> 1237     values = astype_nansafe(values, dtype, copy=copy)
   1239 # in pandas we don't store numpy str dtypes, so convert to object
   1240 if isinstance(dtype, np.dtype) and issubclass(values.dtype.type, str):
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1148, in astype_nansafe(arr, dtype, copy, skipna)
   1145     raise TypeError(f"cannot astype a timedelta from [{arr.dtype}] to [{dtype}]")
   1147 elif np.issubdtype(arr.dtype, np.floating) and np.issubdtype(dtype, np.integer):
-> 1148     return astype_float_to_int_nansafe(arr, dtype, copy)
   1150 elif is_object_dtype(arr.dtype):
   1151
   1152     # work around NumPy brokenness, #1987
   1153     if np.issubdtype(dtype.type, np.integer):
File /usr/local/lib/python3.8/dist-packages/pandas/core/dtypes/cast.py:1193, in astype_float_to_int_nansafe(values, dtype, copy)
   1189 """
   1190 astype with a check preventing converting NaN to an meaningless integer value.
   1191 """
   1192 if not np.isfinite(values).all():
-> 1193     raise IntCastingNaNError(
   1194         "Cannot convert non-finite values (NA or inf) to integer"
   1195     )
   1196 return values.astype(dtype, copy=copy)
IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer
See e-mail. Closing issue for now. Re-open if necessary.