nwiltsie commented 5 months ago

Description

This PR rewrites most of the internals of NFTestAssert in a way that is (mostly) invisible to the outside world. This is the key interface change:

@@ -109,12 +108,9 @@ class NFTestCase:
             self._logger.error(" [ failed ]")
             return False

-        for ass in self.asserts:
+        for assertion in self.asserts:
             try:
-                ass.identify_assertion_files()
-                ass.assert_exists()
-                ass.assert_updated()
-                ass.assert_expected()
+                assertion.perform_assertions()
             except Exception as error:
                 self._logger.error(error.args)
                 self._logger.error(" [ failed ]")

The new perform_assertions() method hides all of the details of identify_assertion_files(), assert_exists(), assert_updated(), and assert_expected(). Decoupling the interface from the implementation details like that makes it easier to test and maintain this code.

In that spirit, I've also re-written all of the tests to test the functionality of the class rather than the implementation details. There are a variety of test fixtures that create an NFTestAssert instance pointing to reference and updated files in a temporary directory. The fixtures are parameterized to test the following situations:

Usage of a custom script
The actual file was/wasn't updated after the NFTestAssert was created
There are 0, 1, or 2 files matching the actual glob
There are 0, 1, or 2 files matching the expect glob
The actual and expect files do/don't have the same content

Those parameters are matched with appropriate xfail marks to confirm that the tests do/don't pass in the expected situations. (Aside: by default xfail has strict=False, meaning that tests that unexpectedly pass do not raise errors. That is bonkers.)

Checklist

[x] This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
Disclosing PHI is a major problem[^1] - Even a small leak can be costly[^2].
[x] This PR does NOT contain germline genetic data[^3], RNA-Seq, DNA methylation, microbiome or other molecular data[^4].

[^1]: UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records [^2]: The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records. [^3]: Genetic information is considered PHI. Forensic assays can identify patients with as few as 21 SNPs [^4]: RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.

[x] This PR does NOT contain other non-plain text files, such as: compressed files, images (e.g. .png, .jpeg), .pdf, .RData, .xlsx, .doc, .ppt, or other output files.

To automatically exclude such files using a .gitignore file, see here for example.

[x] I have read the code review guidelines and the code review best practice on GitHub check-list.
[x] I have set up or verified the main branch protection rule following the github standards before opening this pull request.
[x] The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
[x] I have added the major changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

nwiltsie commented 5 months ago

Argh, that's what I get for only running my new tests.

nwiltsie commented 5 months ago

Hmm, I'm not sure why this isn't working in the cloud - it works perfectly for me locally:

$ pytest
======================================== test session starts ========================================
platform darwin -- Python 3.11.4, pytest-7.4.0, pluggy-1.0.0
rootdir: /Users/nwiltsie/src/tool-NFTest
collected 162 items

test/unit/test_NFTestAssert.py xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx..xxxxxxxxxxxxxxxxxxxxxxxxxxxx [ 38%]
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx..xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx            [ 88%]
test/unit/test_NFTestCase.py ..                                                               [ 90%]
test/unit/test_NFTestEnv.py ..                                                                [ 91%]
test/unit/test_NFTestRunner.py .                                                              [ 91%]
test/unit/test_common.py .........                                                            [ 97%]
test/unit/test_syslog.py ....                                                                 [100%]

================================== 22 passed, 140 xfailed in 1.96s ==================================

nwiltsie commented 5 months ago

Well, that's irritating. These log lines from this test from 4e71965d0cdb45255c872c1c7c10c57ac75b62a2 show that the filetimes are all wonky in GitHub Actions:

DEBUG    NFTest:NFTestAssert.py:44 Actual path `/tmp/pytest-of-runner/pytest-0/test_nftest_assert_True_md5_1_11/*.actual` resolved to `/tmp/pytest-of-runner/pytest-0/test_nftest_assert_True_md5_1_11/0.actual`
DEBUG    NFTest:NFTestAssert.py:48 Expected path `/tmp/pytest-of-runner/pytest-0/test_nftest_assert_True_md5_1_11/*.expect` resolved to `/tmp/pytest-of-runner/pytest-0/test_nftest_assert_True_md5_1_11/0.expect`
DEBUG    NFTest:NFTestAssert.py:57 Test creation time: 2024-06-24 23:30:14.720138+00:00
DEBUG    NFTest:NFTestAssert.py:58 Actual mod time:    2024-06-24 23:30:14.716686+00:00

The only fix I can find is to insert time.sleep(0.01) calls in the cases the file needs to be updated, even in the case where the file is explicitly created after the NFTestAssert.

nwiltsie commented 5 months ago

Clearly there's something else going on under the hood. I'll dig in and try again later.

uclahs-cds / tool-NFTest

Encapsulate NFTestAssert class, rewrite tests #67

Description

Checklist