LorenFrankLab / rec_to_nwb

Data Migration REC -> NWB 2.0 Service

Duplicate timestamps in pos_online file seem to be a problem for generating nwb file #34

Closed jguides closed 2 years ago

jguides commented 2 years ago

I get a dimension mismatch error when trying to generate an nwb file with a recording that had a disconnect (J1620210609_.nwb; recording 19_h2 has the disconnect).

I believe this error occurs when position_tracking (length 202873) is indexed with a boolean mask that has the same length as ptp_systime (length 76007) in position_originator.py:

    position_tracking = (
        position_tracking
        .iloc[ptp_systime > pause_mid_time]
        .set_index(ptp_timestamps))
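
For reference, the same failure can be reproduced in miniature. This is only a sketch with made-up toy values (the frame contents, the three-element `ptp_systime`, and `pause_mid_time = 15` are all hypothetical); it just shows that pandas rejects a boolean mask whose length differs from the frame being indexed:

```python
import numpy as np
import pandas as pd

# Hypothetical toy stand-ins: 5 position rows but only 3 system times.
position_tracking = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0, 4.0]})
ptp_systime = np.array([10, 20, 30])
pause_mid_time = 15  # made-up threshold

# The mask has the length of ptp_systime (3), not of position_tracking (5).
mask = ptp_systime > pause_mid_time

try:
    position_tracking.iloc[mask]
    error_message = None
except IndexError as err:
    # pandas raises here because the mask and the frame differ in length.
    error_message = str(err)

print(error_message)
```

The printed message should have the same form as the one at the end of the traceback, with the toy lengths in place of 76007 and 202873.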

It looks like position_tracking corresponds to 20210609_J16_19_h2.1.pos_online.dat, and ptp_systime has data from .pos_cameraHWFrameCount.dat (via video_info). On line 107 of position_originator.py, video_info is subset to the unique indices of position_tracking, and ptp_systime is then derived from that subset:

video_info = video_info.loc[position_tracking.index.unique()]

It looks like there are duplicate indices in position_tracking in the latter half of the recording (perhaps related to the disconnect?), so ptp_systime ends up shorter than position_tracking.
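
A small sketch of how duplicate indices would produce this length difference, using invented toy frames (the column names `x` and `HWTimestamp` and all values are hypothetical, not taken from the real files). The last step, dropping repeated index values, is only one possible workaround and an assumption on my part, not a confirmed fix:

```python
import pandas as pd

# Hypothetical toy data: index value 3 appears twice in position_tracking.
position_tracking = pd.DataFrame(
    {"x": [0.0, 1.0, 2.0, 2.5, 3.0]},
    index=pd.Index([1, 2, 3, 3, 4], name="time"),
)
video_info = pd.DataFrame(
    {"HWTimestamp": [100, 200, 300, 400]},
    index=pd.Index([1, 2, 3, 4], name="time"),
)

# Same selection pattern as quoted above: one row per *unique* index value.
video_info = video_info.loc[position_tracking.index.unique()]

# Diagnostic: count index values that repeat an earlier one.
n_duplicates = int(position_tracking.index.duplicated().sum())
print(len(position_tracking), len(video_info), n_duplicates)  # 5 4 1

# Possible workaround (assumption): keep only the first row per index value
# so that both frames have the same length again.
deduplicated = position_tracking[
    ~position_tracking.index.duplicated(keep="first")]
print(len(deduplicated))  # 4
```

Whether keeping the first row of each duplicate is the right choice for a recording with a disconnect is a separate question; the sketch only shows where the mismatch comes from.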

@lfrank or @edeno, do you have thoughts about how to address this? Thanks in advance.

Full traceback:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_378752/3133151736.py in <module>
     46                               trodes_rec_export_args=trodes_rec_export_args)
     47 
---> 48     content = builder.build_nwb()
     49     print(content)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    230             process_mda_invalid_time=process_mda_invalid_time,
    231             process_pos_valid_time=process_pos_valid_time,
--> 232             process_pos_invalid_time=process_pos_invalid_time)
    233 
    234         logger.info('Done...\n')

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    245             logger.info('Date: {}'.format(date))
    246             nwb_builder = self.get_nwb_builder(date)
--> 247             content = nwb_builder.build()
    248             nwb_builder.write(content)
    249             if self.is_old_dataset:

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
    369             self.associated_files_originator.make(nwb_content)
    370 
--> 371         self.position_originator.make(nwb_content)
    372 
    373         valid_map_dict = self.__build_corrupted_data_manager()

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
     50             first_timestamps = []
     51             for series_id, (conversion, position_tracking_path) in enumerate(
---> 52                     zip(meters_per_pixels, position_tracking_paths)):
     53                 position_df = self.get_position_with_corrected_timestamps(
     54                     position_tracking_path)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_position_with_corrected_timestamps(position_tracking_path)
    124             ptp_timestamps = pd.Index(
    125                 ptp_systime[ptp_systime > pause_mid_time] /
--> 126                 NANOSECONDS_PER_SECOND,
    127                 name='time')
    128             position_tracking = (

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
    929 
    930             maybe_callable = com.apply_if_callable(key, self.obj)
--> 931             return self._getitem_axis(maybe_callable, axis=axis)
    932 
    933     def _is_scalar_access(self, key: tuple):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1551         if com.is_bool_indexer(key):
   1552             self._validate_key(key, axis)
-> 1553             return self._getbool_axis(key, axis=axis)
   1554 
   1555         # a list of integers

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getbool_axis(self, key, axis)
    946         # caller is responsible for ensuring non-None axis
    947         labels = self.obj._get_axis(axis)
--> 948         key = check_bool_indexer(labels, key)
    949         inds = key.nonzero()[0]
    950         return self.obj._take_with_is_copy(inds, axis=axis)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in check_bool_indexer(index, key)
   2399         # key may contain nan elements, check_array_indexer needs bool array
   2400         result = pd_array(result, dtype=bool)
-> 2401     return check_array_indexer(index, result)
   2402 
   2403 

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexers.py in check_array_indexer(array, indexer)
    560         if len(indexer) != len(array):
    561             raise IndexError(
--> 562                 f"Boolean index has wrong length: "
    563                 f"{len(indexer)} instead of {len(array)}"
    564             )

IndexError: Boolean index has wrong length: 76007 instead of 202873