inception-project / inception

INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.
https://inception-project.github.io
Apache License 2.0

Per project or installation recommender #2266

Open antonyscerri opened 3 years ago

antonyscerri commented 3 years ago

Is your feature request related to a problem? Please describe. It is not documented, but from observation the recommenders (at least the built-in ones) appear to provide a per-user instance for the given project. This gives a very personalised aid to annotation tasks; however, for collaboration (and reuse) it would be nice to have the option to share a recommender across users on the same task or across projects.

Describe the solution you'd like The ability to configure recommenders which will share the backend resources (dictionary, model) across multiple users to allow for collaborative development and reuse. This can be further extended to allow reusing a recommender between projects.

Describe alternatives you've considered This was based on trial-and-error testing of INCEpTION, as the documentation did not state the current behaviour. Currently it's hard to simulate any of this without severely compromising the way you would use the tool.


jcklie commented 3 years ago

Thank you for the report. Where would you have expected to read about this limitation in the documentation? You can use an external recommender for more complicated recommender needs, e.g. via https://github.com/inception-project/inception-external-recommender . There you can build it the way you want.

reckart commented 3 years ago

If you upload a gazetteer into a string matching recommender, it is shared between annotators.

The internally learned recommendations are per user to avoid users influencing and thereby potentially biasing each other. Also, they might create contradicting training data which could confuse the model.

antonyscerri commented 3 years ago

I agree there is the potential for bias/influence when mixing, but under known conditions this can be of benefit for collaboration, so having such an option would be useful.

I think the recommenders section of the basic user guide could simply mention that any learnt information is per individual user.

I've been looking at the external recommender option, and yes, you would then be able to pick and mix and reuse between projects of sorts. However, managing document identifiers, dealing with conflicting annotations from multiple users, and preventing further bias (training over the same document multiple times, etc.) will become more of an issue, so more work could be required to fully utilise such an option. With external recommenders, another option to consider might be sending only modified documents on the train call rather than the complete set (which is what I've observed when trying things out), as you may end up with a lot of data being resent with a larger corpus. Of course, the recommender would need to expect such behaviour as well.
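As a rough illustration of the "only modified documents" idea, here is a minimal Python sketch of bookkeeping an external recommender could do on its own side. It is not part of INCEpTION or of the inception-external-recommender package; the `DocumentCache` class, its method names, and the `(project, user, document)` key layout are all hypothetical and chosen only for this example.

```python
import hashlib


class DocumentCache:
    """Hypothetical helper: remembers a fingerprint per document key."""

    def __init__(self):
        # Maps an arbitrary hashable key, e.g. (project, user, document id),
        # to the SHA-256 digest of the content seen last time.
        self._fingerprints = {}

    @staticmethod
    def _fingerprint(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def changed_documents(self, documents: dict) -> dict:
        """Return only new or modified documents and record their fingerprints."""
        changed = {}
        for key, content in documents.items():
            digest = self._fingerprint(content)
            if self._fingerprints.get(key) != digest:
                changed[key] = content
                self._fingerprints[key] = digest
        return changed
```

Keying on project, user, and document together is one way to keep identifiers from clashing when the same recommender is reused across projects. A sender could apply the same fingerprinting to decide what to transmit, but that would require both sides to agree on the behaviour.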

I've also seen the recent addition that makes it possible to export information from built-in recommenders, and that will help in certain scenarios too.

jcklie commented 3 years ago

I am currently working on a next generation of the external recommender that will be more efficient with document transmissions, but it will take a while as I am busy with research. If you want to build learning from multiple annotators into INCEpTION, then with our current models you would more or less need to curate automatically, I think. There is lots of research on learning from multiple annotators, so it is possible to set up an external recommender that can do that. EACL 2021 had a nice tutorial, see https://sites.google.com/view/alma-tutorial .
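One very simple form of automatic curation is a majority vote over the annotators' labels before training. The Python sketch below only illustrates that idea; it is not how INCEpTION curates and is far simpler than the methods covered in the tutorial. The function name, the `(begin, end) -> label` data layout, and the `min_agreement` threshold are assumptions made for this example.

```python
from collections import Counter


def majority_vote(annotations_by_user, min_agreement=0.5):
    """Merge span labels from several annotators by majority vote.

    annotations_by_user: dict mapping user name -> dict of (begin, end) -> label.
    A span is kept only if its most frequent label was chosen by more than
    `min_agreement` of all annotators (hypothetical threshold).
    """
    votes = {}
    for spans in annotations_by_user.values():
        for span, label in spans.items():
            votes.setdefault(span, Counter())[label] += 1

    n_users = len(annotations_by_user)
    merged = {}
    for span, counter in votes.items():
        label, count = counter.most_common(1)[0]
        if count / n_users > min_agreement:
            merged[span] = label
    return merged
```

Spans on which annotators contradict each other without a clear majority are dropped rather than passed to the model, which addresses the concern about contradicting training data at the cost of discarding some annotations.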