LorenFrankLab / spyglass

Neuroscience data analysis framework for reproducible research built by Loren Frank Lab at UCSF
https://lorenfranklab.github.io/spyglass/
MIT License
92 stars 42 forks source link

SQL syntax issue with "pose_eval_result" in DLC pipeline #1146

Closed emreybroyles closed 2 weeks ago

emreybroyles commented 2 weeks ago

Describe the bug from chat gpt: The query syntax is incorrect at the section where it's handling the 'pose_eval_result' field, which contains a Python dictionary-like structure ({\'greenLED\': 11.925503154100332, \'redLED_C\': 2.5800206935683057}). SQL does not directly support dictionary structures like this, so it's likely that the issue stems from attempting to pass this kind of complex data type into an SQL query without proper handling or conversion.

To Reproduce Steps to reproduce the behavior:

sgp.DLCPosVideo().populate(dlc_key) where dlc_key = {'nwb_file_name': 'SC100220231007_.nwb', 'epoch': 11, 'video_file_num': 16, 'project_name': 'sideSleep_LED', 'dlc_model_name': 'sideSleep_LED_tutorial_00', 'dlc_model_params_name': 'default', 'dlc_centroid_params_name': 'default', 'dlc_si_cohort_centroid': 'green_red_led', 'dlc_si_cohort_orientation': 'green_red_led', 'dlc_orientation_params_name': 'default'}

  1. This error is in Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/21_DLC_sideSleep_LED.ipynb
  2. Scroll down to sgp.DLCPosVideo().populate(dlc_key)
  3. See error in cell 58
Error Stack ```python --------------------------------------------------------------------------- QuerySyntaxError Traceback (most recent call last) Cell In[58], line 1 ----> 1 sgp.DLCPosVideo().populate(dlc_key) File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/utils/dj_mixin.py:676](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/utils/dj_mixin.py#line=675), in SpyglassMixin.populate(self, *restrictions, **kwargs) 674 processes = kwargs.pop("processes", 1) 675 if processes == 1 or not self._parallel_make: --> 676 return super().populate(*restrictions, **kwargs) 678 # If parallel in both make and populate, use non-daemon processes 679 # Get keys to populate 680 keys = (self._jobs_to_do(restrictions) - self.target).fetch( 681 "KEY", limit=kwargs.get("limit", None) 682 ) File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/autopopulate.py:241](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/autopopulate.py#line=240), in AutoPopulate.populate(self, suppress_errors, return_exception_objects, reserve_jobs, order, limit, max_calls, display_progress, processes, make_kwargs, *restrictions) 237 if processes == 1: 238 for key in ( 239 tqdm(keys, desc=self.__class__.__name__) if display_progress else keys 240 ): --> 241 error = self._populate1(key, jobs, **populate_kwargs) 242 if error is not None: 243 error_list.append(error) File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/autopopulate.py:292](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/autopopulate.py#line=291), in AutoPopulate._populate1(self, key, jobs, suppress_errors, return_exception_objects, make_kwargs) 290 self.__class__._allow_insert = True 291 try: --> 292 make(dict(key), **(make_kwargs or {})) 293 except (KeyboardInterrupt, SystemExit, Exception) as error: 294 try: File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/position/v1/position_dlc_selection.py:336](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/position/v1/position_dlc_selection.py#line=335), in DLCPosVideo.make(self, key) 334 crop = pose_estimation_params["cropping"] 335 print("Loading position data...") --> 336 position_info_df = ( 337 DLCPosV1() 338 & { 339 "nwb_file_name": key["nwb_file_name"], 340 "epoch": epoch, 341 "dlc_si_cohort_centroid": key["dlc_si_cohort_centroid"], 342 "dlc_centroid_params_name": key["dlc_centroid_params_name"], 343 "dlc_si_cohort_orientation": key["dlc_si_cohort_orientation"], 344 "dlc_orientation_params_name": key[ 345 "dlc_orientation_params_name" 346 ], 347 } 348 ).fetch1_dataframe() 349 pose_estimation_df = pd.concat( 350 { 351 bodypart: ( (...) 361 axis=1, 362 ) 363 assert len(pose_estimation_df) == len(position_info_df), ( 364 f"length of pose_estimation_df: {len(pose_estimation_df)} " 365 f"does not match the length of position_info_df: {len(position_info_df)}." 366 ) File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/position/v1/position_dlc_selection.py:146](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/position/v1/position_dlc_selection.py#line=145), in DLCPosV1.fetch1_dataframe(self) 145 def fetch1_dataframe(self): --> 146 nwb_data = self.fetch_nwb()[0] 147 index = pd.Index( 148 np.asarray(nwb_data["position"].get_spatial_series().timestamps), 149 name="time", 150 ) 151 COLUMNS = [ 152 "video_frame_ind", 153 "position_x", (...) 158 "speed", 159 ] File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/utils/dj_mixin.py:210](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/utils/dj_mixin.py#line=209), in SpyglassMixin.fetch_nwb(self, *attrs, **kwargs) 194 self._export_table.File.insert( 195 [ 196 {"export_id": self.export_id, tbl_pk: fname} (...) 199 skip_duplicates=True, 200 ) 201 self._export_table.Table.insert1( 202 dict( 203 export_id=self.export_id, (...) 207 skip_duplicates=True, 208 ) --> 210 return fetch_nwb(self, self._nwb_table_tuple, *attrs, **kwargs) File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/utils/dj_helper_fn.py:265](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/spyglass/utils/dj_helper_fn.py#line=264), in fetch_nwb(query_expression, nwb_master, *attrs, **kwargs) 263 rec_dict["nwb2load_filepath"] = file_path 264 continue --> 265 rec_dict["nwb2load_filepath"] = (query_table & rec_dict).fetch1( 266 "nwb2load_filepath" 267 ) 269 if not rec_dicts or not np.any( 270 ["object_id" in key for key in rec_dicts[0]] 271 ): 272 return rec_dicts File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/fetch.py:348](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/fetch.py#line=347), in Fetch1.__call__(self, squeeze, download_path, *attrs) 346 else: # fetch some attributes, return as tuple 347 attributes = [a for a in attrs if not is_key(a)] --> 348 result = self._expression.proj(*attributes).fetch( 349 squeeze=squeeze, download_path=download_path, format="array" 350 ) 351 if len(result) != 1: 352 raise DataJointError( 353 "fetch1 should only return one tuple. %d tuples found" % len(result) 354 ) File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/fetch.py:258](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/fetch.py#line=257), in Fetch.__call__(self, offset, limit, order_by, format, as_dict, squeeze, download_path, *attrs) 256 ret = return_values[0] if len(attrs) == 1 else return_values 257 else: # fetch all attributes as a numpy.record_array or pandas.DataFrame --> 258 cur = self._expression.cursor( 259 as_dict=as_dict, limit=limit, offset=offset, order_by=order_by 260 ) 261 heading = self._expression.heading 262 if as_dict: File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/expression.py:633](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/expression.py#line=632), in QueryExpression.cursor(self, offset, limit, order_by, as_dict) 631 sql += " LIMIT %d" % limit + (" OFFSET %d" % offset if offset else "") 632 logger.debug(sql) --> 633 return self.connection.query(sql, as_dict=as_dict) File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/connection.py:340](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/connection.py#line=339), in Connection.query(self, query, args, as_dict, suppress_warnings, reconnect) 338 cursor = self._conn.cursor(cursor=cursor_class) 339 try: --> 340 self._execute_query(cursor, query, args, suppress_warnings) 341 except errors.LostConnectionError: 342 if not reconnect: File [~/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/connection.py:296](http://localhost:8886/lab/workspaces/auto-1/tree/Src/spyglass/notebooks/eb_notebooks/DLCNotebooks/anaconda3/envs/spyglass-dlc/lib/python3.9/site-packages/datajoint/connection.py#line=295), in Connection._execute_query(cursor, query, args, suppress_warnings) 294 cursor.execute(query, args) 295 except client.err.Error as err: --> 296 raise translate_query_error(err, query) QuerySyntaxError: ("You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''greenLED': 11.925503154100332, 'redLED_C': 2.5800206935683057}))' at line 1", 'SELECT `dlc_si_cohort_centroid`,`nwb_file_name`,`epoch`,`video_file_num`,`project_name`,`dlc_model_name`,`dlc_model_params_name`,`dlc_centroid_params_name`,`dlc_si_cohort_orientation`,`dlc_orientation_params_name`,`analysis_file_name`,`nwb2load_filepath` FROM (SELECT `dlc_si_cohort_centroid`,`nwb_file_name`,`epoch`,`video_file_num`,`project_name`,`dlc_model_name`,`dlc_model_params_name`,`dlc_centroid_params_name`,`dlc_si_cohort_orientation`,`dlc_orientation_params_name`,`analysis_file_name`,`position_object_id`,`orientation_object_id`,`velocity_object_id`,`pose_eval_result` FROM `position_v1_dlc_selection`.`__d_l_c_pos_v1` WHERE ( (`nwb_file_name`="SC100220231007_.nwb")AND(`epoch`=11)AND(`dlc_si_cohort_centroid`="green_red_led")AND(`dlc_centroid_params_name`="default")AND(`dlc_si_cohort_orientation`="green_red_led")AND(`dlc_orientation_params_name`="default"))) as `$f2` NATURAL JOIN (SELECT `analysis_file_name`,`analysis_file_abs_path` as `nwb2load_filepath` FROM `common_nwbfile`.`analysis_nwbfile`) as `$f3` WHERE ( (`dlc_si_cohort_centroid`="green_red_led")AND(`nwb_file_name`="SC100220231007_.nwb")AND(`epoch`=11)AND(`video_file_num`=16)AND(`project_name`="sideSleep_LED")AND(`dlc_model_name`="sideSleep_LED_tutorial_00")AND(`dlc_model_params_name`="default")AND(`dlc_centroid_params_name`="default")AND(`dlc_si_cohort_orientation`="green_red_led")AND(`dlc_orientation_params_name`="default")AND(`analysis_file_name`="SC100220231007_G8M0YFO4YW.nwb")AND(`position_object_id`="778db57d-45c3-427f-b967-cd1806b07e59")AND(`orientation_object_id`="984b4e5b-6e92-40b9-9cf2-da11955aa1fd")AND(`velocity_object_id`="47426775-d9d3-47c3-befc-c3aebbd240bf")AND(`pose_eval_result`={\'greenLED\': 11.925503154100332, \'redLED_C\': 2.5800206935683057}))') ```

I'd like this commands to work so I can create a video with my centroid and orientation overlaid for the bodyparts red_LED_C and green_LED as a model quality assurance step.

CBroz1 commented 2 weeks ago

I ran the populate above with the required key, and hit a different error related to cv2. I made the edit shown in #1150 and was able to process this key. Please take a look at the new entries and let me know if they look right. If so, you might need to update spyglass or dependecies