Open MartinKl opened 2 months ago
When running in disk mode, this line sometimes yields true and sometimes false for the node holding the lemma annotation: https://github.com/korpling/annatto/blob/78b4d0471f13a97f2b6c3ce4efd403dc22977693/src/exporter/xlsx.rs#L80
In contrast, in memory mode it always returns false.
After importing the graphml file with the annis cli, the query tok _ident_ auto_lemma returns 0 matches.
Taken together, this points to a storage that has not been completely updated, probably the storage of the Coverage component, which influences the result of is_token.
We figured out that this happens because Coverage components can get unloaded in workflow steps that use a CorpusStorage. Even though AQL queries can now be executed on graphs directly, which would avoid unloading, removing CorpusStorage is not an option right now, since a lot of graph_ops rely on the correct order of results, which only CorpusStorage provides.
If run in memory, the CorpusStorage either does not unload the Coverage component or is simply fast enough in reloading it, so the bug does not occur.
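To illustrate why an unloaded Coverage component could flip the result of is_token, here is a minimal toy sketch. It does not use the graphannis API: ToyGraph, its fields and example_graph are hypothetical names. It assumes that a node counts as a token when it carries a token value and has no outgoing Coverage edges, and that the node holding the lemma carries a token value and covers a base token.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a graph whose Coverage component may be unloaded.
/// This is a toy model, not the graphannis data structure.
struct ToyGraph {
    /// Nodes that carry a token value (assumption: the lemma node does too).
    tok_nodes: HashSet<u32>,
    /// Outgoing Coverage edges per node; `None` models an unloaded component.
    coverage: Option<HashMap<u32, Vec<u32>>>,
}

impl ToyGraph {
    /// Assumed rule: a token has a token value and covers no other node.
    fn is_token(&self, node: u32) -> bool {
        if !self.tok_nodes.contains(&node) {
            return false;
        }
        match &self.coverage {
            // Component loaded: the node is a token only without outgoing edges.
            Some(edges) => edges.get(&node).map_or(true, |targets| targets.is_empty()),
            // Component unloaded: no outgoing edges are visible at all.
            None => true,
        }
    }
}

/// Node 1 is a base token; node 2 holds the lemma and covers node 1.
fn example_graph(coverage_loaded: bool) -> ToyGraph {
    let tok_nodes = HashSet::from([1, 2]);
    let mut edges = HashMap::new();
    edges.insert(2, vec![1]);
    ToyGraph {
        tok_nodes,
        coverage: if coverage_loaded { Some(edges) } else { None },
    }
}
```

In this model, example_graph(true).is_token(2) is false, while example_graph(false).is_token(2) is true. That would match the observation that the result in disk mode depends on whether the component happens to be loaded at the time of the check.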
When running annatto in disk mode, annotation layers seem to get lost in a setting with lots of manipulations (it's possible that export starts before the annotation storage is ready). Running in memory mode, though, everything passes.
The following workflow failed (it's not about the details, but about the complexity) to export annotation norm::auto_lemma when run on disk, but succeeded in memory. Note that only the export to xlsx showed this behaviour; the graphml file always contained the lemma layer.