Add an option (`--edit-distance`) that specifies the name edit distance as a tuple: (edit distance, random number generator seed, number of picks).
Only a subset of the possible edits is picked, since otherwise thousands of edits could be checked, which would slow the process down considerably.
The random seed for the picks can be specified to avoid run-to-run variance.
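To illustrate the idea, here is a minimal sketch of how a value such as `"(1,42,2)"` could be parsed and used to pick a reproducible subset of edits. It is not the code from this PR: the helper names (`parse_edit_distance`, `single_edits`, `pick_edits`), the lowercase ASCII alphabet, and the choice of delete/substitute/insert operations are all assumptions made for the example.

```python
import ast
import random
import string
from typing import List, Tuple


def parse_edit_distance(arg: str) -> Tuple[int, int, int]:
    """Parse a string such as "(1,42,2)" into (distance, seed, picks)."""
    value = ast.literal_eval(arg)
    if not (isinstance(value, tuple) and len(value) == 3
            and all(isinstance(v, int) for v in value)):
        raise ValueError(f"Expected a tuple of 3 integers, got: {arg!r}")
    distance, seed, picks = value
    return distance, seed, picks


def single_edits(name: str,
                 alphabet: str = string.ascii_lowercase) -> List[str]:
    """All distinct strings one delete, substitute, or insert away from name.

    The alphabet is an assumption for this sketch.
    """
    edits = set()
    for i in range(len(name) + 1):
        head, tail = name[:i], name[i:]
        if tail:
            edits.add(head + tail[1:])  # delete the character at position i
            # substitute the character at position i
            edits.update(head + c + tail[1:] for c in alphabet)
        # insert a character before position i
        edits.update(head + c + tail for c in alphabet)
    edits.discard(name)  # a substitution can reproduce the original
    return sorted(edits)  # sorted so that sampling does not depend on set order


def pick_edits(name: str, distance: int, seed: int, picks: int) -> List[str]:
    """Pick up to `picks` edited names reachable in `distance` edit steps."""
    current = {name}
    for _ in range(distance):
        # The candidate set grows quickly with each step.
        current = {edit for n in current for edit in single_edits(n)}
    current.discard(name)
    candidates = sorted(current)
    # A dedicated, seeded generator keeps the picks identical between runs.
    rng = random.Random(seed)
    return rng.sample(candidates, min(picks, len(candidates)))


if __name__ == "__main__":
    distance, seed, picks = parse_edit_distance("(1,42,2)")
    print(pick_edits("fever", distance, seed, picks))
```

Using a local `random.Random(seed)` instead of the module-level generator means the picks for a given name do not depend on any other random calls made during the run.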
Examples
Example without edit distance:
```
$ python -m medcat.utils.regression.regression_checker models/20230227__kch_gstt_trained_model_494c3717f637bb89.zip --example-strictness None
Loading RegressionChecker from yaml: configs/default_regression_tests.yml
Loading model pack from file: models/20230227__kch_gstt_trained_model_494c3717f637bb89.zip
Checking the current status
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [02:39<00:00, 1.26it/s]
A total of 2 parts were kept track of within the group "default_regression_tests.yml".
And a total of 4469 (sub)cases were checked.
At the strictness level of Strictness.NORMAL (allowing ['FOUND_ANY_CHILD', 'BIGGER_SPAN_BOTH', 'BIGGER_SPAN_RIGHT', 'BIGGER_SPAN_LEFT', 'PARTIAL_OVERLAP', 'SMALLER_SPAN', 'IDENTICAL', 'FOUND_CHILD_PARTIAL']):
The number of total successful (sub) cases: 4289 (95.97%)
The number of total failing (sub) cases : 180 ( 4.03%)
IDENTICAL : 4234 (94.74%)
SMALLER_SPAN : 1 ( 0.02%)
FOUND_DIR_PARENT : 27 ( 0.60%)
FOUND_ANY_CHILD : 54 ( 1.21%)
FOUND_OTHER : 132 ( 2.95%)
FAIL : 21 ( 0.47%)
Tested 'test-case-1' for a total of 665 cases:
IDENTICAL : 649 (97.59%)
SMALLER_SPAN : 1 ( 0.15%)
FOUND_ANY_CHILD : 7 ( 1.05%)
FOUND_OTHER : 2 ( 0.30%)
FAIL : 6 ( 0.90%)
Tested 'test-case-2' for a total of 3804 cases:
IDENTICAL : 3585 (94.24%)
FOUND_DIR_PARENT : 27 ( 0.71%)
FOUND_ANY_CHILD : 47 ( 1.24%)
FOUND_OTHER : 130 ( 3.42%)
FAIL : 15 ( 0.39%)
```
Example with edit distance:
```
$ python -m medcat.utils.regression.regression_checker models/20230227__kch_gstt_trained_model_494c3717f637bb89.zip --example-strictness None --edit-distance "(1,42,2)"
Loading RegressionChecker from yaml: configs/default_regression_tests.yml
Loading model pack from file: models/20230227__kch_gstt_trained_model_494c3717f637bb89.zip
Checking the current status
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [06:30<00:00, 1.95s/it]
A total of 2 parts were kept track of within the group "default_regression_tests.yml".
And a total of 8928 (sub)cases were checked.
At the strictness level of Strictness.NORMAL (allowing ['SMALLER_SPAN', 'BIGGER_SPAN_RIGHT', 'FOUND_ANY_CHILD', 'FOUND_CHILD_PARTIAL', 'BIGGER_SPAN_LEFT', 'BIGGER_SPAN_BOTH', 'IDENTICAL', 'PARTIAL_OVERLAP']):
The number of total successful (sub) cases: 7419 (83.10%)
The number of total failing (sub) cases : 1509 (16.90%)
IDENTICAL : 6772 (75.85%)
SMALLER_SPAN : 546 ( 6.12%)
FOUND_DIR_PARENT : 25 ( 0.28%)
FOUND_ANY_CHILD : 88 ( 0.99%)
FOUND_CHILD_PARTIAL : 13 ( 0.15%)
FOUND_OTHER : 264 ( 2.96%)
FAIL : 1220 (13.66%)
Tested 'test-case-1' for a total of 1329 cases:
IDENTICAL : 976 (73.44%)
SMALLER_SPAN : 95 ( 7.15%)
FOUND_ANY_CHILD : 11 ( 0.83%)
FOUND_CHILD_PARTIAL : 3 ( 0.23%)
FOUND_OTHER : 4 ( 0.30%)
FAIL : 240 (18.06%)
Tested 'test-case-2' for a total of 7599 cases:
IDENTICAL : 5796 (76.27%)
SMALLER_SPAN : 451 ( 5.93%)
FOUND_DIR_PARENT : 25 ( 0.33%)
FOUND_ANY_CHILD : 77 ( 1.01%)
FOUND_CHILD_PARTIAL : 10 ( 0.13%)
FOUND_OTHER : 260 ( 3.42%)
FAIL : 980 (12.90%)
```
Or with the 2024-06 Snomed model (trained only on GSTT data).
Without edit distance:
```
$ python -m medcat.utils.regression.regression_checker models/Snomed2024-06-gstt-trained_ae5b08e0fb5310b2.zip --example-strictness None
Loading RegressionChecker from yaml: configs/default_regression_tests.yml
Loading model pack from file: models/Snomed2024-06-gstt-trained_ae5b08e0fb5310b2.zip
Checking the current status
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:47<00:00, 4.21it/s]
A total of 2 parts were kept track of within the group "default_regression_tests.yml".
And a total of 4680 (sub)cases were checked.
At the strictness level of Strictness.NORMAL (allowing ['FOUND_ANY_CHILD', 'FOUND_CHILD_PARTIAL', 'BIGGER_SPAN_RIGHT', 'IDENTICAL', 'BIGGER_SPAN_BOTH', 'BIGGER_SPAN_LEFT', 'PARTIAL_OVERLAP', 'SMALLER_SPAN']):
The number of total successful (sub) cases: 4311 (92.12%)
The number of total failing (sub) cases : 369 ( 7.88%)
IDENTICAL : 4161 (88.91%)
SMALLER_SPAN : 2 ( 0.04%)
FOUND_DIR_PARENT : 191 ( 4.08%)
FOUND_DIR_GRANDPARENT : 18 ( 0.38%)
FOUND_ANY_CHILD : 148 ( 3.16%)
FOUND_OTHER : 137 ( 2.93%)
FAIL : 23 ( 0.49%)
Tested 'test-case-1' for a total of 756 cases:
IDENTICAL : 730 (96.56%)
SMALLER_SPAN : 2 ( 0.26%)
FOUND_ANY_CHILD : 5 ( 0.66%)
FOUND_OTHER : 18 ( 2.38%)
FAIL : 1 ( 0.13%)
Tested 'test-case-2' for a total of 3924 cases:
IDENTICAL : 3431 (87.44%)
FOUND_DIR_PARENT : 191 ( 4.87%)
FOUND_DIR_GRANDPARENT : 18 ( 0.46%)
FOUND_ANY_CHILD : 143 ( 3.64%)
FOUND_OTHER : 119 ( 3.03%)
FAIL : 22 ( 0.56%)
```
With edit distance:
```
$ python -m medcat.utils.regression.regression_checker models/Snomed2024-06-gstt-trained_ae5b08e0fb5310b2.zip --example-strictness None --edit-distance "(1,42,2)"
Loading RegressionChecker from yaml: configs/default_regression_tests.yml
Loading model pack from file: models/Snomed2024-06-gstt-trained_ae5b08e0fb5310b2.zip
Checking the current status
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [02:00<00:00, 1.66it/s]
A total of 2 parts were kept track of within the group "default_regression_tests.yml".
And a total of 9354 (sub)cases were checked.
At the strictness level of Strictness.NORMAL (allowing ['PARTIAL_OVERLAP', 'BIGGER_SPAN_LEFT', 'BIGGER_SPAN_BOTH', 'FOUND_CHILD_PARTIAL', 'BIGGER_SPAN_RIGHT', 'SMALLER_SPAN', 'FOUND_ANY_CHILD', 'IDENTICAL']):
The number of total successful (sub) cases: 6149 (65.74%)
The number of total failing (sub) cases : 3205 (34.26%)
IDENTICAL : 5415 (57.89%)
SMALLER_SPAN : 564 ( 6.03%)
FOUND_DIR_PARENT : 241 ( 2.58%)
FOUND_DIR_GRANDPARENT : 20 ( 0.21%)
FOUND_ANY_CHILD : 151 ( 1.61%)
FOUND_CHILD_PARTIAL : 19 ( 0.20%)
FOUND_OTHER : 141 ( 1.51%)
FAIL : 2803 (29.97%)
Tested 'test-case-1' for a total of 1512 cases:
IDENTICAL : 948 (62.70%)
SMALLER_SPAN : 94 ( 6.22%)
FOUND_CHILD_PARTIAL : 3 ( 0.20%)
FOUND_OTHER : 17 ( 1.12%)
FAIL : 450 (29.76%)
Tested 'test-case-2' for a total of 7842 cases:
IDENTICAL : 4467 (56.96%)
SMALLER_SPAN : 470 ( 5.99%)
FOUND_DIR_PARENT : 241 ( 3.07%)
FOUND_DIR_GRANDPARENT : 20 ( 0.26%)
FOUND_ANY_CHILD : 151 ( 1.93%)
FOUND_CHILD_PARTIAL : 16 ( 0.20%)
FOUND_OTHER : 124 ( 1.58%)
FAIL : 2353 (30.01%)
```