ucam-department-of-psychiatry / crate

Create and use de-identified research databases. Preprocess, extract text, anonymise/de-identify, link, apply natural language processing, query for research, manage consent for contact.
GNU General Public License v3.0

Alcohol units NLP #117

Closed martinburchell closed 1 year ago

martinburchell commented 1 year ago

Separated from #111

@RudolfCardinal This looks fine to me.

The pytest way of running the same test for several parameters is pytest.mark.parametrize. I couldn't see a simple example in the pytest docs but you'd do something like:

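import pytest


def get_row():
    # Each row pairs an input string with the result expected for it.
    data = [
        ("Alcohol", no_results),
        ("He used to drink like a fish", no_results),
        ("[e.g. insulin] currently 6 units per week", no_results),
        ...
    ]
    for row in data:
        yield row


# pytest calls the test once per row yielded by get_row().
@pytest.mark.parametrize("row", get_row())
def test_for_all_rows(row):
    ...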

I'm not suggesting that we change our tests now, but it might be something to consider in the future.
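
For reference, a self-contained sketch of the same idea that can be run as-is. It is not CRATE code: find_alcohol_units, its regex, NO_RESULTS, CASES and the final positive test case are all hypothetical stand-ins made up for illustration. It also unpacks each row into two named arguments instead of passing the whole tuple.

```python
import re

import pytest

# Hypothetical stand-in for the real NLP parser (assumption, not CRATE code):
# it only recognises the literal phrase "<number> unit(s) of alcohol".
_UNITS_RE = re.compile(r"(\d+(?:\.\d+)?)\s*units?\s+of\s+alcohol", re.IGNORECASE)


def find_alcohol_units(text):
    """Return every '<number> units of alcohol' quantity in text, as floats."""
    return [float(m.group(1)) for m in _UNITS_RE.finditer(text)]


NO_RESULTS = []  # hypothetical name for "the parser found nothing"

# (input text, expected result) pairs. The first three inputs come from the
# comment above; the last one is an invented positive case.
CASES = [
    ("Alcohol", NO_RESULTS),
    ("He used to drink like a fish", NO_RESULTS),
    ("[e.g. insulin] currently 6 units per week", NO_RESULTS),
    ("Drinks 14 units of alcohol per week", [14.0]),
]


# A plain list is enough here; each tuple is unpacked into the two
# named arguments and run as a separate test.
@pytest.mark.parametrize("text, expected", CASES)
def test_find_alcohol_units(text, expected):
    assert find_alcohol_units(text) == expected
```

Running pytest -v on a file containing this reports each tuple as its own test, so one failing input does not hide the results for the others.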