Imageomics / cautious-robot

Simple images from CSV downloader that runs and records checksums on downloaded image folder.
MIT License
2 stars 0 forks source link

Add more test cases for cautious-robot #20

Closed zoeduan closed 3 months ago

zoeduan commented 3 months ago

Added two test cases: test_download_images.py and test_buddycheck.py

zoeduan commented 3 months ago

Hello @egrace479 @thompsonmj, I have updated the test cases, and now all the test cases run clear. Thanks for review the code!

thompsonmj commented 3 months ago

In test_download_images.py, I'd add wait=0.1 to the test_success_after_retries and test_failure_after_retries tests to make tests go more quickly.

thompsonmj commented 3 months ago

I also receive 1 error:

FAILED tests/test_buddycheck.py::TestBuddyCheck::test_check_alignment_empty_img_df - ValueError: Unable to fill values because RangeIndex cannot contain NA

Details:

=================================================================== FAILURES ====================================================================
_______________________________________________ TestBuddyCheck.test_check_alignment_empty_img_df ________________________________________________

self = <test_buddycheck.TestBuddyCheck testMethod=test_check_alignment_empty_img_df>

    def test_check_alignment_empty_img_df(self):
        source_df = pd.DataFrame(columns=['filename', 'checksum'])
        checksum_df = pd.read_csv(self.checksum_source_file.name)
>       merged_df = self.buddy_check.merge_on_filename_checksum(source_df, checksum_df, 'filename', 'checksum')

tests/test_buddycheck.py:136: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/cautiousrobot/buddy_check.py:34: in merge_on_filename_checksum
    merged_df = pd.merge(source_df,
../../.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py:156: in merge
    return op.get_result(copy=copy)
../../.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py:813: in get_result
    self._maybe_add_join_keys(result, left_indexer, right_indexer)
../../.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py:986: in _maybe_add_join_keys
    rvals = algos.take_nd(taker, right_indexer, fill_value=rfill)
../../.local/lib/python3.8/site-packages/pandas/core/array_algos/take.py:110: in take_nd
    return arr.take(
../../.local/lib/python3.8/site-packages/pandas/core/indexes/base.py:1084: in take
    allow_fill = self._maybe_disallow_fill(allow_fill, fill_value, indices)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = RangeIndex(start=0, stop=3, step=1), allow_fill = True, fill_value = 0, indices = array([], dtype=int64)

    @final
    def _maybe_disallow_fill(self, allow_fill: bool, fill_value, indices) -> bool:
        """
        We only use pandas-style take when allow_fill is True _and_
        fill_value is not None.
        """
        if allow_fill and fill_value is not None:
            # only fill if we are passing a non-None fill_value
            if self._can_hold_na:
                if (indices < -1).any():
                    raise ValueError(
                        "When allow_fill=True and fill_value is not None, "
                        "all indices must be >= -1"
                    )
            else:
                cls_name = type(self).__name__
>               raise ValueError(
                    f"Unable to fill values because {cls_name} cannot contain NA"
                )
E               ValueError: Unable to fill values because RangeIndex cannot contain NA

../../.local/lib/python3.8/site-packages/pandas/core/indexes/base.py:1117: ValueError
zoeduan commented 3 months ago

In test_download_images.py, I'd add wait=0.1 to the test_success_after_retries and test_failure_after_retries tests to make tests go more quickly.

Thank you! I updated test_download_images.py, it runs much faster!! Will commit my changes later.

egrace479 commented 3 months ago

I also receive 1 error:

FAILED tests/test_buddycheck.py::TestBuddyCheck::test_check_alignment_empty_img_df - ValueError: Unable to fill values because RangeIndex cannot contain NA

Details: ...

Strange, @thompsonmj, I see everything passing.

I very much like your wait = .1 suggestion for the retry tests!

zoeduan commented 3 months ago

I also receive 1 error:

FAILED tests/test_buddycheck.py::TestBuddyCheck::test_check_alignment_empty_img_df - ValueError: Unable to fill values because RangeIndex cannot contain NA

Details: ...

Strange, @thompsonmj, I see everything passing.

I very much like your wait = .1 suggestion for the retry tests!

Same with me.

thompsonmj commented 3 months ago

On the issue of the test_check_alignment_empty_img_df test, when I execute that test with pytest, it raises the ValueError, but the test passes when run with unittest, which seems odd to me.

Regardless though, since the merge_on_filename_checksum method is only called in validate_download, we could add an assertion that neither input DataFrame for that method is empty since it would mean the user is doing something that doesn't match their expectations. Then we could remove the tests for empty inputs to merge_on_filename_checksum and add a test for validate_download with empty inputs, asserting that a custom exception is raised.

How does that sound?