AllenInstitute / visual_behavior_analysis

Python package for analyzing behavioral data for Brain Observatory: Visual Behavior

File not found during Circle CI tests #522

Closed: nickponvert closed this issue 4 years ago

nickponvert commented 5 years ago

When running tests for PR #521, I found that most (if not all) of the test failures trace back to a single file that can't be found, roi_metrics.h5:

______________ ERROR at setup of test_analysis_folder[<lambda>0] _______________

ophys_data_dir = '/tmp/pytest-of-circleci/pytest-0/ophys_dataset_010'

    @pytest.fixture
    def ophys_dataset(ophys_data_dir):
>       return VisualBehaviorOphysDataset(12345678, ophys_data_dir)

tests/ophys/dataset/test_visual_behavior_ophys_dataset.py:190: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
visual_behavior/ophys/dataset/visual_behavior_ophys_dataset.py:59: in __init__
    self.roi_metrics = self.get_roi_metrics()
visual_behavior/ophys/dataset/visual_behavior_ophys_dataset.py:258: in get_roi_metrics
    self._roi_metrics = pd.read_hdf(os.path.join(self.analysis_dir, 'roi_metrics.h5'), key='df')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

path_or_buf = '/tmp/pytest-of-circleci/pytest-0/ophys_dataset_010/12345678_analysis/roi_metrics.h5'
key = 'df', mode = 'r', kwargs = {}, exists = False

    def read_hdf(path_or_buf, key=None, mode='r', **kwargs):
        """
        Read from the store, close it if we opened it.

        Retrieve pandas object stored in file, optionally based on where
        criteria

        Parameters
        ----------
        path_or_buf : string, buffer or path object
            Path to the file to open, or an open :class:`pandas.HDFStore` object.
            Supports any object implementing the ``__fspath__`` protocol.
            This includes :class:`pathlib.Path` and py._path.local.LocalPath
            objects.

            .. versionadded:: 0.19.0 support for pathlib, py.path.
            .. versionadded:: 0.21.0 support for __fspath__ proptocol.

        key : object, optional
            The group identifier in the store. Can be omitted if the HDF file
            contains a single pandas object.
        mode : {'r', 'r+', 'a'}, optional
            Mode to use when opening the file. Ignored if path_or_buf is a
            :class:`pandas.HDFStore`. Default is 'r'.
        where : list, optional
            A list of Term (or convertible) objects.
        start : int, optional
            Row number to start selection.
        stop  : int, optional
            Row number to stop selection.
        columns : list, optional
            A list of columns names to return.
        iterator : bool, optional
            Return an iterator object.
        chunksize : int, optional
            Number of rows to include in an iteration when using an iterator.
        errors : str, default 'strict'
            Specifies how encoding and decoding errors are to be handled.
            See the errors argument for :func:`open` for a full list
            of options.
        **kwargs
            Additional keyword arguments passed to HDFStore.

        Returns
        -------
        item : object
            The selected object. Return type depends on the object stored.

        See Also
        --------
        pandas.DataFrame.to_hdf : write a HDF file from a DataFrame
        pandas.HDFStore : low-level access to HDF files

        Examples
        --------
        >>> df = pd.DataFrame([[1, 1.0, 'a']], columns=['x', 'y', 'z'])
        >>> df.to_hdf('./store.h5', 'data')
        >>> reread = pd.read_hdf('./store.h5')
        """

        if mode not in ['r', 'r+', 'a']:
            raise ValueError('mode {0} is not allowed while performing a read. '
                             'Allowed modes are r, r+ and a.'.format(mode))
        # grab the scope
        if 'where' in kwargs:
            kwargs['where'] = _ensure_term(kwargs['where'], scope_level=1)

        if isinstance(path_or_buf, HDFStore):
            if not path_or_buf.is_open:
                raise IOError('The HDFStore must be open for reading.')

            store = path_or_buf
            auto_close = False
        else:
            path_or_buf = _stringify_path(path_or_buf)
            if not isinstance(path_or_buf, string_types):
                raise NotImplementedError('Support for generic buffers has not '
                                          'been implemented.')
            try:
                exists = os.path.exists(path_or_buf)

            # if filepath is too long
            except (TypeError, ValueError):
                exists = False

            if not exists:
                raise compat.FileNotFoundError(
>                   'File %s does not exist' % path_or_buf)
E               FileNotFoundError: File /tmp/pytest-of-circleci/pytest-0/ophys_dataset_010/12345678_analysis/roi_metrics.h5 does not exist

.tox/py36/lib/python3.6/site-packages/pandas/io/pytables.py:371: FileNotFoundError

nickponvert commented 5 years ago

Trying to follow the thread here.

Here is the first line that fails: https://github.com/AllenInstitute/visual_behavior_analysis/blob/3a9994aa4bcf8cac6834e24e4cb6193d9edaa39f/tests/ophys/dataset/test_visual_behavior_ophys_dataset.py#L190

The ophys_data_dir is a test fixture that we generate here, saving a bunch of stuff to h5 files: https://github.com/AllenInstitute/visual_behavior_analysis/blob/3a9994aa4bcf8cac6834e24e4cb6193d9edaa39f/tests/ophys/dataset/test_visual_behavior_ophys_dataset.py#L125

When we initialize a VisualBehaviorOphysDataset object, it now requires a file called roi_metrics.h5 (the file named in the traceback above):

https://github.com/AllenInstitute/visual_behavior_analysis/blob/3a9994aa4bcf8cac6834e24e4cb6193d9edaa39f/visual_behavior/ophys/dataset/visual_behavior_ophys_dataset.py#L59

https://github.com/AllenInstitute/visual_behavior_analysis/blob/3a9994aa4bcf8cac6834e24e4cb6193d9edaa39f/visual_behavior/ophys/dataset/visual_behavior_ophys_dataset.py#L257

We likely need to generate the roi_metrics.h5 file in the ophys_data_dir test fixture to solve this.
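
A minimal sketch of what generating that file could look like. The `<experiment_id>_analysis` folder name, the `roi_metrics.h5` file name and the `'df'` key are taken from the traceback above. The helper names `analysis_dir_for` and `write_roi_metrics` are hypothetical, and the real roi_metrics columns are not shown in this issue, so the caller supplies the DataFrame.

```python
import os


def analysis_dir_for(data_dir, experiment_id):
    """Build the '<experiment_id>_analysis' folder path seen in the traceback."""
    return os.path.join(str(data_dir), "{}_analysis".format(experiment_id))


def write_roi_metrics(data_dir, experiment_id, roi_metrics):
    """Write roi_metrics.h5 under key 'df' and return the path to the file.

    roi_metrics is expected to be a pandas DataFrame; its columns are up to
    the caller because the real schema is not shown in this issue.
    """
    analysis_dir = analysis_dir_for(data_dir, experiment_id)
    os.makedirs(analysis_dir, exist_ok=True)
    path = os.path.join(analysis_dir, "roi_metrics.h5")
    # DataFrame.to_hdf needs PyTables installed, as pd.read_hdf does on the reading side.
    roi_metrics.to_hdf(path, key="df", mode="w")
    return path
```

Something like this would be called from wherever the ophys_data_dir fixture writes its other h5 files, for example `write_roi_metrics(data_dir, 12345678, some_dataframe)`.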

nickponvert commented 5 years ago

Actually, I think I'm thinking about this wrong. The failing test shows that this version of the VBOD (VisualBehaviorOphysDataset) object is not compatible with data that does not include roi_metrics.h5. If early data did not include that file, and we want to maintain backwards compatibility, we should tell VBOD what to do when it isn't found.
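
One possible shape for that fallback, as a sketch only: check for the file before reading and return a default when it is missing. The helper name `read_hdf_if_present` and the choice of `None` as the default are assumptions, not what the package currently does.

```python
import os
import warnings


def read_hdf_if_present(path, key="df", default=None):
    """Return the object stored at `path`, or `default` if the file is missing."""
    if not os.path.exists(path):
        # Older datasets may predate this file, so warn instead of raising.
        warnings.warn("{} not found; continuing without it".format(path))
        return default
    # Imported here so callers that only hit the missing-file branch
    # do not pay for the pandas import.
    import pandas as pd
    return pd.read_hdf(path, key=key)
```

get_roi_metrics could then call `read_hdf_if_present(os.path.join(self.analysis_dir, 'roi_metrics.h5'))`, and any code that uses `roi_metrics` would need to handle the `None` case.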