vocalpy / crowsetta

A tool to work with any format for annotating vocalizations
https://crowsetta.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

BUG: tests in test_formats/test_seq/test_generic.py failed #218

Closed rhine3 closed 1 year ago

rhine3 commented 1 year ago


Describe the bug: While running the tests with pytest, I found that 6 of the tests in test_formats/test_seq/test_generic.py failed on my machine. The short test summary info is below; the longer test output is copied at the end of this issue.

FAILED test_formats/test_seq/test_generic.py::TestAnnot2DfFunction::test_annot2df_onset_offset_s_only - AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'
FAILED test_formats/test_seq/test_generic.py::TestAnnot2DfFunction::test_annot2df_onset_offset_s_and_ind - AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'
FAILED test_formats/test_seq/test_generic.py::TestAnnot2DfFunction::test_annot2df_onset_offset_sample_only - AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'
FAILED test_formats/test_seq/test_generic.py::TestGenericSeqClass::test_annot2df_onset_offset_s_only - AttributeError: 'GenericSeq' object has no attribute 'to_df'
FAILED test_formats/test_seq/test_generic.py::TestGenericSeqClass::test_to_df_onset_offset_s_and_ind - AttributeError: 'GenericSeq' object has no attribute 'to_df'
FAILED test_formats/test_seq/test_generic.py::TestGenericSeqClass::test_to_df_onset_offset_sample_only - AttributeError: 'GenericSeq' object has no attribute 'to_df'

To Reproduce: I cloned the GitHub repository to my local machine, installed crowsetta in a Python 3.10.9 Anaconda environment, and installed pytest with conda. I then navigated to the tests directory and ran the command "pytest *".

Expected behavior: I expected the tests to pass! :)

Screenshots: Errors are reproduced as text below instead.

Additional context: I created this issue while performing the pyOpenSci review of this package. The full test output is reproduced below.

=================================================================================================================================== FAILURES ===================================================================================================================================
____________________________________________________________________________________________________________ TestAnnot2DfFunction.test_annot2df_onset_offset_s_only ____________________________________________________________________________________________________________

self = <tests.test_formats.test_seq.test_generic.TestAnnot2DfFunction object at 0x7f8049738bb0>
notmat_paths = [PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/cbins/gy6or6/032312/gy6...iew/crowsetta/tests/fixtures/../data_for_tests/cbins/gy6or6/032312/gy6or6_baseline_230312_0816.179.cbin.not.mat'), ...]
notmat_as_generic_seq_csv = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/csv/notmat_gy6or6_032312.csv')

    def test_annot2df_onset_offset_s_only(self,
                                          notmat_paths,
                                          notmat_as_generic_seq_csv):
        """test whether `annot2df` works when
        the annotations have onsets and offsets specified in seconds only.
        To test this we use the 'notmat' format.
        """
        annot_list = [crowsetta.formats.seq.NotMat.from_file(notmat_path).to_annot()
                      for notmat_path in notmat_paths]
        # below, set basename to True so we can easily run tests on any system without
        # worrying about where audio files are relative to root of directory tree
>       df_created = crowsetta.formats.seq.generic.annot2df(annot_list,
                                                            basename=True)
E       AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'

test_formats/test_seq/test_generic.py:101: AttributeError
__________________________________________________________________________________________________________ TestAnnot2DfFunction.test_annot2df_onset_offset_s_and_ind ___________________________________________________________________________________________________________

self = <tests.test_formats.test_seq.test_generic.TestAnnot2DfFunction object at 0x7f8049738ac0>, birdsong_rec_xml_file = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/birdsongrec/Bird0/Annotation.xml')
birdsong_rec_wav_path = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/birdsongrec/Bird0/Wave')
birdsongrec_as_generic_seq_csv = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/csv/birdsongrec_Bird0_Annotation.csv')

    def test_annot2df_onset_offset_s_and_ind(self,
                                             birdsong_rec_xml_file,
                                             birdsong_rec_wav_path,
                                             birdsongrec_as_generic_seq_csv):
        """test whether `annot2df` works when
        the annotations have onsets and offsets specified in seconds and in samples.
        To test this we use the 'birdsong-recognition-dataset' format.
        """
        birdsongrec = crowsetta.formats.seq.BirdsongRec.from_file(annot_path=birdsong_rec_xml_file,
                                                                  wav_path=birdsong_rec_wav_path,
                                                                  concat_seqs_into_songs=True)
        annot_list = birdsongrec.to_annot()
        # below, set basename to True so we can easily run tests on any system without
        # worrying about where audio files are relative to root of directory tree
>       df_created = crowsetta.formats.seq.generic.annot2df(annot_list,
                                                            basename=True)
E       AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'

test_formats/test_seq/test_generic.py:123: AttributeError
_________________________________________________________________________________________________________ TestAnnot2DfFunction.test_annot2df_onset_offset_sample_only __________________________________________________________________________________________________________

self = <tests.test_formats.test_seq.test_generic.TestAnnot2DfFunction object at 0x7f8049739f30>
kaggle_phn_paths = [PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/timit_kaggle/dr1-fvmh0/...'/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/timit_kaggle/dr1-fvmh0/si836.phn')]
timit_phn_as_generic_seq_csv = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/csv/timit-dr1-fvmh0-phn.csv')

    def test_annot2df_onset_offset_sample_only(self,
                                               kaggle_phn_paths,
                                               timit_phn_as_generic_seq_csv):
        """test whether `annot2df` works when
        the annotations have onsets and offsets specified in seconds only.
        To test this we use the 'timit' format."""
        annot_list = [crowsetta.formats.seq.Timit.from_file(phn_path).to_annot()
                      for phn_path in kaggle_phn_paths]

        # below, set basename to True so we can easily run tests on any system without
        # worrying about where audio files are relative to root of directory tree
>       df_created = crowsetta.formats.seq.generic.annot2df(annot_list,
                                                            basename=True)
E       AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'

test_formats/test_seq/test_generic.py:144: AttributeError
____________________________________________________________________________________________________________ TestGenericSeqClass.test_annot2df_onset_offset_s_only _____________________________________________________________________________________________________________

self = <tests.test_formats.test_seq.test_generic.TestGenericSeqClass object at 0x7f8049e0aa10>
notmat_paths = [PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/cbins/gy6or6/032312/gy6...iew/crowsetta/tests/fixtures/../data_for_tests/cbins/gy6or6/032312/gy6or6_baseline_230312_0816.179.cbin.not.mat'), ...]
notmat_as_generic_seq_csv = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/csv/notmat_gy6or6_032312.csv')

    def test_annot2df_onset_offset_s_only(self,
                                          notmat_paths,
                                          notmat_as_generic_seq_csv):
        """test whether `annot2df` works when
        the annotations have onsets and offsets specified in seconds only.
        To test this we use the 'notmat' format.
        """
        annot_list = [crowsetta.formats.seq.NotMat.from_file(notmat_path).to_annot()
                      for notmat_path in notmat_paths]
        generic_seq = crowsetta.formats.seq.GenericSeq(annots=annot_list)
        # below, set basename to True so we can easily run tests on any system without
        # worrying about where audio files are relative to root of directory tree
>       df_created = generic_seq.to_df(basename=True)
E       AttributeError: 'GenericSeq' object has no attribute 'to_df'

test_formats/test_seq/test_generic.py:479: AttributeError
____________________________________________________________________________________________________________ TestGenericSeqClass.test_to_df_onset_offset_s_and_ind _____________________________________________________________________________________________________________

self = <tests.test_formats.test_seq.test_generic.TestGenericSeqClass object at 0x7f8049e0a890>, birdsong_rec_xml_file = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/birdsongrec/Bird0/Annotation.xml')
birdsong_rec_wav_path = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/birdsongrec/Bird0/Wave')
birdsongrec_as_generic_seq_csv = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/csv/birdsongrec_Bird0_Annotation.csv')

    def test_to_df_onset_offset_s_and_ind(self,
                                          birdsong_rec_xml_file,
                                          birdsong_rec_wav_path,
                                          birdsongrec_as_generic_seq_csv):
        """test whether `GenericSeq.to_df` works when
        the annotations have onsets and offsets specified in seconds and in samples.
        To test this we use the 'birdsong-recognition-dataset' format.
        """
        birdsongrec = crowsetta.formats.seq.BirdsongRec.from_file(annot_path=birdsong_rec_xml_file,
                                                                  wav_path=birdsong_rec_wav_path,
                                                                  concat_seqs_into_songs=True)
        annot_list = birdsongrec.to_annot()
        generic_seq = crowsetta.formats.seq.GenericSeq(annots=annot_list)
        # below, set basename to True so we can easily run tests on any system without
        # worrying about where audio files are relative to root of directory tree
>       df_created = generic_seq.to_df(basename=True)
E       AttributeError: 'GenericSeq' object has no attribute 'to_df'

test_formats/test_seq/test_generic.py:501: AttributeError
___________________________________________________________________________________________________________ TestGenericSeqClass.test_to_df_onset_offset_sample_only ____________________________________________________________________________________________________________

self = <tests.test_formats.test_seq.test_generic.TestGenericSeqClass object at 0x7f8049e0a6b0>
kaggle_phn_paths = [PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/timit_kaggle/dr1-fvmh0/...'/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/timit_kaggle/dr1-fvmh0/si836.phn')]
timit_phn_as_generic_seq_csv = PosixPath('/Users/tessa/Downloads/crowsetta_review/crowsetta/tests/fixtures/../data_for_tests/csv/timit-dr1-fvmh0-phn.csv')

    def test_to_df_onset_offset_sample_only(self,
                                            kaggle_phn_paths,
                                            timit_phn_as_generic_seq_csv):
        """test whether `annot2df` works when
        the annotations have onsets and offsets specified in seconds only.
        To test this we use the 'timit' format."""
        annot_list = [crowsetta.formats.seq.Timit.from_file(phn_path).to_annot()
                      for phn_path in kaggle_phn_paths]
        generic_seq = crowsetta.formats.seq.GenericSeq(annots=annot_list)
        # below, set basename to True so we can easily run tests on any system without
        # worrying about where audio files are relative to root of directory tree
>       df_created = generic_seq.to_df(basename=True)
E       AttributeError: 'GenericSeq' object has no attribute 'to_df'

test_formats/test_seq/test_generic.py:521: AttributeError
=========================================================================================================================== short test summary info ============================================================================================================================
FAILED test_formats/test_seq/test_generic.py::TestAnnot2DfFunction::test_annot2df_onset_offset_s_only - AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'
FAILED test_formats/test_seq/test_generic.py::TestAnnot2DfFunction::test_annot2df_onset_offset_s_and_ind - AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'
FAILED test_formats/test_seq/test_generic.py::TestAnnot2DfFunction::test_annot2df_onset_offset_sample_only - AttributeError: module 'crowsetta.formats.seq.generic' has no attribute 'annot2df'
FAILED test_formats/test_seq/test_generic.py::TestGenericSeqClass::test_annot2df_onset_offset_s_only - AttributeError: 'GenericSeq' object has no attribute 'to_df'
FAILED test_formats/test_seq/test_generic.py::TestGenericSeqClass::test_to_df_onset_offset_s_and_ind - AttributeError: 'GenericSeq' object has no attribute 'to_df'
FAILED test_formats/test_seq/test_generic.py::TestGenericSeqClass::test_to_df_onset_offset_sample_only - AttributeError: 'GenericSeq' object has no attribute 'to_df'
================================================================================================================= 6 failed, 10483 passed in 135.09s (0:02:15) ==================================================================================================================

NickleDave commented 1 year ago

Hi @rhine3 thank you for catching this.

Could you say a little more about how you installed crowsetta?
Did you run pip install crowsetta or conda install crowsetta -c conda-forge?

Or did you install the development code directly into the environment, e.g. with pip install -e .?

If you installed from a package manager, then I think what might be going on is that you have a previous release that does not include changes I made on the main branch.

Could you please try setting up a development environment as described here and then try running the tests with nox? (nox -s tests) Please let me know if you still get the same error then.

I'm not sure what else could cause it besides installing from a package manager. I can see the CI is passing on main and I do not get this error when I run tests locally.

you may want to update the sample issue text here to say "You can determine this in Python by importing crowsetta and running crowsetta.__version__"

Thank you, this is a good idea. I will add it to the issue template.
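
One way to check for a release/main mismatch like the one suspected above is to look up the installed version and test whether the attributes the tests call actually exist. This is only a sketch: `installed_version` and `has_attr` are hypothetical helper names, not part of crowsetta, and they use nothing beyond the standard library's `importlib`.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_attr(module_name, attr_name):
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # The names below come from the failing tests in this issue.
    print("crowsetta version:", installed_version("crowsetta"))
    print("generic.annot2df present:",
          has_attr("crowsetta.formats.seq.generic", "annot2df"))
```

If the second line prints False while the tests from the main branch call `annot2df`, the installed package is probably older than the test suite being run.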

rhine3 commented 1 year ago

Hi @NickleDave!

Long story short, you're right that I hadn't tried installing the development environment. I was just testing crowsetta directly from within the tests folder within an Anaconda environment that had crowsetta installed.

(I can't remember if the environment I was working with had crowsetta installed via pip or conda; I tried both of them in separate environments to verify they both worked.)
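
To tell whether the tests are exercising an installed release or the cloned source tree, it can help to see which file Python would import a package from. The following is a minimal sketch, assuming a development install resolves to the clone's src directory (as the paths in the warning below suggest); `module_origin` and `is_under` are hypothetical helpers, not crowsetta functions.

```python
import importlib.util
from pathlib import Path


def module_origin(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def is_under(path, root):
    """Return True if path lies somewhere below the directory root."""
    return path is not None and Path(root).resolve() in path.parents


if __name__ == "__main__":
    origin = module_origin("crowsetta")
    print("crowsetta would be imported from:", origin)
    # Replace "src" with the path to the src directory of your own clone.
    print("imported from the clone:", is_under(origin, "src"))
```

An origin inside site-packages would point to a released package rather than the development code.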

Once I actually got the development environment installed, all tests succeeded. One note: the command I had to run to activate the environment was . ./.venv/bin/activate (the documentation says to use the command . ./.venv/activate).

I did get one warning when running the tests:

../src/crowsetta/_vendor/textgrid/textgrid.py:562
  /Users/tessa/Downloads/crowsetta_review/crowsetta/src/crowsetta/_vendor/textgrid/textgrid.py:562: DeprecationWarning: invalid escape sequence \w
    m = re.match('File type = "([\w ]+)"', header)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

NickleDave commented 1 year ago

Fixed the snippet that shows how to print crowsetta.__version__ in 58bb6f1

NickleDave commented 1 year ago

Made the regex strings raw to fix the DeprecationWarning, in aca85f1
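
For readers who have not hit this warning before, here is a small illustration of why a raw string helps. It reuses the pattern shown in the warning above, but it is not the vendored textgrid code: `parse_file_type` and `escape_warnings` are hypothetical names, and the header line in the usage note is an assumed example input.

```python
import re
import warnings

# Raw string: the backslash in \w reaches the regex engine unchanged,
# so Python never tries to interpret it as a string escape.
HEADER_PATTERN = r'File type = "([\w ]+)"'


def parse_file_type(header):
    """Return the file type named in a header line, or None if it does not match."""
    m = re.match(HEADER_PATTERN, header)
    return m.group(1) if m else None


def escape_warnings(source):
    """Compile source code and return the names of any warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]
```

Calling `escape_warnings("s = r'\\w'")` returns an empty list, while the non-raw form `escape_warnings("s = '\\w'")` reports a warning (DeprecationWarning on the Python version used in this issue, SyntaxWarning on newer ones). `parse_file_type('File type = "ooTextFile"')` returns the captured text between the quotes.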

NickleDave commented 1 year ago

@all-contributors please add @rhine3 for doc and bug

allcontributors[bot] commented 1 year ago

@NickleDave

I've put up a pull request to add @rhine3! :tada:

NickleDave commented 1 year ago

Closing this

NickleDave commented 1 year ago

@all-contributors please also add @rhine3 for userTesting ideas

allcontributors[bot] commented 1 year ago

@NickleDave

I've put up a pull request to add @rhine3! :tada: