Closed mart-r closed 4 months ago
Env snapshotting happens as part of model pack creation, right? It should be fine to include the transitive-dependency computation, as hash computation is the most time-consuming step.
Yes, that's when it happens. Not much harm, really. Even if the faster hashing is used (which is nearly instant), saving the CDB to disk will take far longer than the fraction of a second that finding the transitive dependencies would take (though I haven't looked at the exact timings).
Does this change integrate with the 'notification' of loading a model pack that is not the 'same' environment?
I don't think there's currently anything in MedCAT that does that.
spaCy does warn when loading an older version of their model that's technically not (guaranteed to be) compatible with the current spaCy version. But I don't recall us checking the environment in any meaningful way.
Add all transitive dependencies to the environmental snapshot.
We added transitive dependencies to MedCAT in v1.12 (#438). However, that only included direct dependencies of MedCAT. In principle, the compatible versions of transitive dependencies should be handled by the package that defines them. However, these things can change over time, and there may be a situation where the previously set requirement ranges no longer work properly. As such, including all the transitive dependencies (and their versions) in the environmental snapshot could help us more confidently recreate an environment in which we'd expect a model to work.
There are a few decisions I made that could very well be open to change:
The change to the default environmental snapshot as a result of the PR is as follows:
NOTE: The default JSON format would be different (more concise) in either case.