Closed proycon closed 2 months ago
@tenzin3 @ngawangtrinley This has now been implemented (in stam-rust 0.15.0, stam-tools 0.8.0, stam-python 0.9.0). The @include mechanism is extended to allow including other annotation stores as dependencies. The STAM JSON specification here explains how this works.
The implementation should still be considered a bit experimental though, as it hasn't yet been thoroughly tested in real use cases like yours, so it's possible that some bugs may still surface or that some more API methods are desired.
When creating multiple annotation stores that have dependencies, it is recommended to first create the dependencies independently, and then include them from the 'superstore' using the new add_substore method (https://stam-python.readthedocs.io/en/latest/autoapi/stam/index.html#stam.AnnotationStore.add_substore).
@proycon Thank you for implementing this feature so quickly. We will test this out and get back to you if we run into any issues.
Currently an annotation store in STAM JSON can reference annotation datasets and resources in separate stand-off files. What is not yet possible, however, is to reference annotations defined in other STAM JSON annotation stores.
This use case was raised in #21 by @tenzin3; see the lead-up discussion there.
In such a case, an annotation in store_a.store.stam.json makes reference (via an annotation selector) to an annotation defined in store_b.store.stam.json. That is currently not possible. I do think it is a fair use case, and more flexibility in using stand-off files fits nicely with STAM's stand-off philosophy.
This issue proposes to expand the STAM model to allow this:
The @include mechanism in STAM JSON would be extended to allow including other annotation stores. In effect, an annotation store can then depend on another by importing it; these includes are executed before any of its own annotations are loaded. Recursive includes would be allowed (enabling more complex dependency chains), but cyclic includes would be explicitly forbidden. Includes may (and are in fact encouraged to) reference the same stand-off resources and annotation data sets.
Possible syntax for this:
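As a minimal sketch of the include semantics described here (dependencies first, recursion allowed, cycles rejected), the following Python resolves a load order over in-memory stand-ins for the store files. The top-level "@include" list, the abbreviated store contents, and the names `FILES`, `load_order` and `CyclicIncludeError` are assumptions made for illustration; they are not the specified STAM JSON syntax or any library's API.

```python
import json

# Hypothetical in-memory "files": filename -> JSON text.
# The top-level "@include" list is an assumed syntax, and the stores are
# abbreviated (no resources, data sets or annotation data).
FILES = {
    "store_b.store.stam.json": json.dumps({
        "@type": "AnnotationStore",
        "@id": "store_b",
        "annotations": [{"@type": "Annotation", "@id": "b1"}],
    }),
    "store_a.store.stam.json": json.dumps({
        "@type": "AnnotationStore",
        "@id": "store_a",
        "@include": ["store_b.store.stam.json"],
        "annotations": [{
            "@type": "Annotation",
            "@id": "a1",
            # a1 points at an annotation that lives in store_b
            "target": {"@type": "AnnotationSelector", "annotation": "b1"},
        }],
    }),
}


class CyclicIncludeError(Exception):
    """Raised when a store directly or indirectly includes itself."""


def load_order(filename, files, active=None, done=None):
    """Return the filenames in loading order, dependencies first."""
    active = [] if active is None else active  # include chain being resolved
    done = [] if done is None else done        # files already scheduled
    if filename in active:
        raise CyclicIncludeError(" -> ".join(active + [filename]))
    if filename in done:
        return done  # shared dependency: load it only once
    active.append(filename)
    store = json.loads(files[filename])
    for dependency in store.get("@include", []):
        load_order(dependency, files, active, done)  # includes run first
    active.pop()
    done.append(filename)  # the store itself comes after its includes
    return done


order = load_order("store_a.store.stam.json", FILES)
print(order)
```

Because store_b is scheduled before store_a, the annotation selector in `a1` can be resolved by the time store_a's own annotations are read. Tracking the active include chain separately from the finished files is what lets shared dependencies pass while true cycles are refused.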
At the implementation level, there would still be a single AnnotationStore instance to work with at any given time. This would, however, serialize to multiple files. This requires some extra bookkeeping, because for each annotation we need to know which annotation store it should go to. The implementation might define 'substores' and keep a map from filenames to lists of annotation handles. This new bookkeeping would at the same time make splitting stores easier than it is in the current implementation (where splitting is basically a fairly expensive deletion action). Merging and splitting would become more reversible.
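The bookkeeping could look roughly like the sketch below: one index that maps filenames to lists of annotation handles, plus the reverse lookup. The class `SubstoreIndex`, its methods, and the use of plain integers as handles are hypothetical and only illustrate the idea; this is not how stam-rust or stam-python implement it.

```python
from collections import defaultdict


class SubstoreIndex:
    """Tracks which file each annotation handle should be serialized to."""

    def __init__(self):
        self._by_file = defaultdict(list)  # filename -> [annotation handles]
        self._file_of = {}                 # annotation handle -> filename

    def assign(self, handle, filename):
        """Assign an annotation to a file, moving it if already assigned."""
        previous = self._file_of.get(handle)
        if previous is not None:
            self._by_file[previous].remove(handle)
        self._by_file[filename].append(handle)
        self._file_of[handle] = filename

    def file_of(self, handle):
        """Return the file a given annotation handle belongs to."""
        return self._file_of[handle]

    def partition(self):
        """Return {filename: [handles]} for serializing to multiple files."""
        return {name: list(handles)
                for name, handles in self._by_file.items() if handles}


index = SubstoreIndex()
index.assign(0, "store_b.store.stam.json")
index.assign(1, "store_a.store.stam.json")
index.assign(2, "store_a.store.stam.json")
# Splitting is a reassignment rather than a deletion, so it is easy to undo.
index.assign(2, "store_b.store.stam.json")
print(index.partition())
```

With such an index, moving an annotation between files only touches the bookkeeping and leaves the annotation itself in the single in-memory store, which is why merging and splitting become cheap to reverse.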