YttriLab / B-SOID

Behavioral segmentation of open field in DeepLabCut, or B-SOID ("B-side"), is a pipeline that pairs unsupervised pattern recognition with supervised classification to achieve fast predictions of behaviors that are not predefined by users.
GNU General Public License v3.0

UMAP+HDBSCAN model: IndexError when preprocessing csv during training #10

Closed gongziyida closed 4 years ago

gongziyida commented 4 years ago

The following error occurred in the middle of training, while preprocessing one of the csv files. According to the print-out, 1 of 6 files had been preprocessed before the error occurred. Do you have any direction on how I should fix it? Thanks!

Traceback (most recent call last):
  File "run.py", line 6, in <module>
    bsoid_umap.main.build(TRAIN_FOLDERS)
  File "/ihome/nurban/zig9/B-SOID/bsoid_umap/main.py", line 28, in build
    nn_assignments = bsoid_umap.train.main(train_folders)
  File "/ihome/nurban/zig9/B-SOID/bsoid_umap/train.py", line 231, in main
    filenames, training_data, perc_rect = bsoid_umap.utils.likelihoodprocessing.main(train_folders)
  File "/ihome/nurban/zig9/B-SOID/bsoid_umap/utils/likelihoodprocessing.py", line 139, in main
    filenames, data, perc_rect = import_folders(folders)
  File "/ihome/nurban/zig9/B-SOID/bsoid_umap/utils/likelihoodprocessing.py", line 72, in import_folders
    curr_df_filt, perc_rect = adp_filt(curr_df)
  File "/ihome/nurban/zig9/B-SOID/bsoid_umap/utils/likelihoodprocessing.py", line 116, in adp_filt
    if rise_a[0][0] > 1:
IndexError: index 0 is out of bounds for axis 0 with size 0

runninghsus commented 4 years ago

Hi @gongziyida

Are you on Slack? Issues with individual data files may be easier to sort out there.

https://join.slack.com/t/b-soid/shared_invite/zt-dksalgqu-Eix8ZVYYFVVFULUhMJfvlw

runninghsus commented 4 years ago

This is a high-pass filter problem caused by uniformly low likelihood in that file (likelihood counts: array([162, 144, 101, 39, 31, 17, 12, 6, 2, 1])). The counts never rise from one bin to the next, so the threshold cannot be determined. Solved by changing lines 72-76 in likelihoodprocessing.py to:

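            try:
                curr_df_filt, perc_rect = adp_filt(curr_df)
            except IndexError:
                # adp_filt found no rise in the likelihood counts, so no threshold
                # could be set; skip this file instead of aborting the whole run.
                logging.warning('Skipped file {}, folder {}: no likelihood threshold.'.format(j + 1, i + 1))
            else:
                logging.info('Done preprocessing (x,y) from file {}, folder {}.'.format(j + 1, i + 1))
                rawdata_li.append(curr_df)
                perc_rect_li.append(perc_rect)
                data_li.append(curr_df_filt)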
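
For anyone hitting the same traceback, here is a minimal sketch of why the index lookup comes back empty for that file. It assumes, based only on the `rise_a[0][0]` line in the traceback and the explanation above, that the filter looks for the first bin where the likelihood counts stop decreasing; the `np.diff`/`np.where` step and the `first_rise` helper are illustrative and not taken from the B-SOID source.

```python
import numpy as np

# Likelihood counts quoted in the comment above: every bin is smaller than the last.
counts = np.array([162, 144, 101, 39, 31, 17, 12, 6, 2, 1])

# Assumed reconstruction of the failing step: indices where the count does not drop.
rise_a = np.where(np.diff(counts) >= 0)
print(rise_a[0].size)  # 0, so rise_a[0][0] raises IndexError


def first_rise(counts):
    """Hypothetical helper: index of the first non-decreasing step, or None."""
    idx = np.flatnonzero(np.diff(counts) >= 0)
    return int(idx[0]) if idx.size else None
```

A guard of this kind would let the caller tell "no threshold could be found" apart from other failures, which is the same case the try/except above skips over.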