CogStack / MedCAT

Medical Concept Annotation Tool

CU-869574kvp update snomed preprocessing naming #469

Closed mart-r closed 1 month ago

mart-r commented 4 months ago

As per #467

This PR uses a regular expression to capture the release from the folder name.

Upon init, this check is non-strict (i.e. no exception is raised if the release cannot be found), since the path could refer to a parent folder.

When parsing through the subfolders, the check is strict (i.e. an exception is raised if no release is found), since those are expected to be Snomed release folders (with SnomedCT in their names).

Added some relevant tests for working and failing folder names, covering both bare base names and longer paths, in both strict and non-strict mode.
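
For illustration, the strict / non-strict behaviour described above could look roughly like the sketch below. This is not the code from this PR: the names `RELEASE_FOLDER_PATTERN` and `determine_release`, the exact pattern, the `ValueError`, and the `None` return value in non-strict mode are all assumptions made for the example. The pattern assumes folder names of the form `SnomedCT_<edition>_<status>_<YYYYMMDD>T<hhmmss>Z`.

```python
import os
import re
from typing import Optional

# Hypothetical pattern (an assumption, not taken from the PR):
# SnomedCT_<edition>_<status>_<YYYYMMDD>T<hhmmss>Z
RELEASE_FOLDER_PATTERN = re.compile(
    r"SnomedCT_[A-Za-z0-9]+_[A-Za-z0-9]+_(\d{8})T\d{6}Z"
)


def determine_release(folder_path: str, strict: bool = True) -> Optional[str]:
    """Return the release date (YYYYMMDD) captured from the folder name.

    In strict mode a non-matching name raises ValueError; in non-strict
    mode None is returned instead (e.g. when given a parent folder).
    """
    # Only the last path component is checked, so long paths and
    # trailing separators behave the same as a bare folder name.
    folder_name = os.path.basename(os.path.normpath(folder_path))
    match = RELEASE_FOLDER_PATTERN.fullmatch(folder_name)
    if match is None:
        if strict:
            raise ValueError(
                f"No Snomed release found in folder name: {folder_name!r}"
            )
        return None
    return match.group(1)
```

With this sketch, a non-strict call on a parent folder such as `determine_release("/data/snomed", strict=False)` returns `None`, while the same call in strict mode raises. A release subfolder yields its date whether it is passed as a base name or as a longer path.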

tomolopolis commented 4 months ago

Task linked: CU-869574kvp Improve Snomed preprocessing folder detection

mart-r commented 4 months ago

This might need some additions:

mart-r commented 2 months ago

PS: Reworked the PR. Would need a new review.