ocean-data-factory-sweden / kso

Notebooks to upload/download marine footage, connect to a citizen science project, train machine learning models and publish marine biological observations.
GNU General Public License v3.0

Tut 3 upload clip issues #293

Closed Bergylta closed 9 months ago

Bergylta commented 9 months ago

🐛 Bug

Another day, another set of bugs. This time:

1. The tutorial seems to ignore the clip modifications: the clips stay the same for every compression setting from none to high, and also when "blur sensitive info" is selected.
2. The created clips end up in the project movie folder (in a sub-folder called tmp_dir) instead of in tmp_dir itself.
3. The upload fails with an error message (more on that below).

To Reproduce (REQUIRED)

Input:

pp.upload_zoo_subjects("clip")

Output:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[49], line 1
----> 1 pp.upload_zoo_subjects("clip")

File /usr/src/app/kso-dev/kso_utils/project.py:650, in ProjectProcessor.upload_zoo_subjects(self, subject_type)
    641 """
    642 This function uploads clips or frames to Zooniverse, depending on the subject_type argument
    643 
   (...)
    646 :type subject_type: str
    647 """
    648 if subject_type == "clip":
    649     # Add declaration to avoid pylint error
--> 650     upload_df, sitename, created_on = zoo_utils.set_zoo_clip_metadata(
    651         project=self.project,
    652         generated_clipsdf=self.generated_clips,
    653         sitesdf=self.local_sites_csv,
    654         moviesdf=self.local_movies_csv,
    655     )
    656     zoo_utils.upload_clips_to_zooniverse(
    657         project=self.project,
    658         upload_to_zoo=upload_df,
    659         sitename=sitename,
    660         created_on=created_on,
    661     )
    662     # Clean up subjects after upload

File /usr/src/app/kso-dev/kso_utils/zooniverse_utils.py:1387, in set_zoo_clip_metadata(project, generated_clipsdf, sitesdf, moviesdf)
   1377 sitesdf = sitesdf.rename(
   1378     columns={
   1379         "decimalLatitude": "#decimalLatitude",
   (...)
   1383     }
   1384 )
   1386 # Select only relevant columns
-> 1387 sitesdf = sitesdf[
   1388     [
   1389         "siteName",
   1390         "#decimalLatitude",
   1391         "#decimalLongitude",
   1392         "#geodeticDatum",
   1393         "#countryCode",
   1394     ]
   1395 ]
   1397 # Include site info to the df
   1398 upload_to_zoo = upload_to_zoo.merge(
   1399     sitesdf, left_on="#siteName", right_on="siteName"
   1400 )

File /usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:3512, in DataFrame.__getitem__(self, key)
   3510     if is_iterator(key):
   3511         key = list(key)
-> 3512     indexer = self.columns._get_indexer_strict(key, "columns")[1]
   3514 # take() does not accept boolean indexers
   3515 if getattr(indexer, "dtype", None) == bool:

File /usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py:5782, in Index._get_indexer_strict(self, key, axis_name)
   5779 else:
   5780     keyarr, indexer, new_indexer = self._reindex_non_unique(keyarr)
-> 5782 self._raise_if_missing(keyarr, indexer, axis_name)
   5784 keyarr = self.take(indexer)
   5785 if isinstance(key, Index):
   5786     # GH 42790 - Preserve name from an Index

File /usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py:5845, in Index._raise_if_missing(self, key, indexer, axis_name)
   5842     raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   5844 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
-> 5845 raise KeyError(f"{not_found} not in index")

KeyError: "['siteName'] not in index"
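
The traceback shows that the column selection in `set_zoo_clip_metadata` fails because `sitesdf` has no `siteName` column at that point; why the column is missing is not visible from the output above. A minimal sketch of how one might check a sites table for the expected columns before selecting them is given below. The required column names are taken from the traceback; the helper `missing_site_columns`, the example table, and its `site_name` header are hypothetical and only illustrate the failure mode, not the actual kso_utils code.

```python
import pandas as pd

# Column names as they appear in the traceback's selection step
REQUIRED_SITE_COLUMNS = [
    "siteName",
    "#decimalLatitude",
    "#decimalLongitude",
    "#geodeticDatum",
    "#countryCode",
]


def missing_site_columns(sitesdf, required=REQUIRED_SITE_COLUMNS):
    """Return the required column names that are absent from sitesdf.

    Hypothetical helper, not part of kso_utils.
    """
    return [col for col in required if col not in sitesdf.columns]


# Hypothetical sites table whose site-name column uses a different header
sitesdf = pd.DataFrame(
    {
        "site_name": ["site_a"],
        "#decimalLatitude": [58.0],
        "#decimalLongitude": [11.0],
        "#geodeticDatum": ["WGS84"],
        "#countryCode": ["SE"],
    }
)

# Report which expected columns are missing before attempting the selection
missing = missing_site_columns(sitesdf)
print("Missing columns:", missing)

# Selecting a list that contains a missing label raises KeyError,
# which is the same kind of failure as in the traceback
try:
    sitesdf[REQUIRED_SITE_COLUMNS]
    selection_failed = False
except KeyError as err:
    selection_failed = True
    print("Selection failed:", err)
```

Running a check like this on the project's sites CSV (after whatever renaming the tutorial applies) would show whether the site-name column is absent or simply carries a different header.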

Expected behavior

Environment

Additional context

How the clips end up: [image]