microsoft / spacy-ann-linker

spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking
https://microsoft.github.io/spacy-ann-linker
MIT License

Support for spacy v3.0 #3

Open fcggamou opened 3 years ago

fcggamou commented 3 years ago

Hi,

Thanks a lot for the great work. Would this work with the upcoming spaCy 3.0 release?

Thanks!

svlandeg commented 3 years ago

I would be happy to help support v3 compatibility. In the spaCy code, there are two relevant places: https://github.com/explosion/spaCy/blob/develop/spacy/pipeline/entity_linker.py#L51 defines the current default candidate generator:

"get_candidates": {"@misc": "spacy.CandidateGenerator.v1"}

and set_kb() lets you load a custom KB from a given vocab: https://github.com/explosion/spaCy/blob/develop/spacy/pipeline/entity_linker.py#L146. This function is also called from entity_linker.initialize().

In the config, those settings can be defined as follows:

[components.entity_linker]
factory = "entity_linker"
entity_vector_length = 64
get_candidates = {"@misc":"spacy.CandidateGenerator.v1"}
incl_context = true
incl_prior = true
labels_discard = []

[components.entity_linker.model]
@architectures = "spacy.EntityLinker.v1"
nO = null

[components.entity_linker.model.tok2vec]
@architectures = "spacy.HashEmbedCNN.v1"
...

[initialize.components.entity_linker.kb_loader]
@misc = "spacy.KBFromFile.v1"
kb_path = ${paths.kb}

So basically, you'd need to somehow implement those custom functions, then register them with spaCy, and they should become available in the config.
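As a rough illustration of the "implement, then register" step, here is a minimal sketch of a custom candidate generator. It assumes the `(kb, span) -> candidates` signature used by `spacy.CandidateGenerator.v1` and a KB that exposes `get_alias_candidates`, as the v3.0 `KnowledgeBase` does. The lowercase fallback, the function names, and the registry name `my_project.FallbackCandidateGenerator.v1` are hypothetical and are not part of spaCy or spacy-ann-linker.

```python
from typing import Any, Callable, Iterable

try:
    # spaCy v3 exposes its function registry here.
    from spacy.util import registry
except ImportError:
    # Lets the sketch run without spaCy; nothing is registered in that case.
    registry = None


def fallback_get_candidates(kb: Any, span: Any) -> Iterable[Any]:
    """Look up the mention text as an alias, retrying in lowercase if nothing matches."""
    candidates = kb.get_alias_candidates(span.text)
    if not candidates:
        candidates = kb.get_alias_candidates(span.text.lower())
    return candidates


def create_fallback_candidates() -> Callable[[Any, Any], Iterable[Any]]:
    """Factory that the config resolves; it returns the candidate generator itself."""
    return fallback_get_candidates


if registry is not None:
    # Hypothetical registry name, chosen for this example only.
    registry.misc("my_project.FallbackCandidateGenerator.v1")(create_fallback_candidates)
```

With that module imported (for example through the `--code` argument of `spacy train`), the config could then point at it with get_candidates = {"@misc": "my_project.FallbackCandidateGenerator.v1"}. A custom kb_loader would presumably be registered the same way, as a function that takes a vocab and returns the KB.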

kabirkhan commented 3 years ago

Sorry for the late reply on this. spaCy v3 support is in progress now, and I'm hoping to have it ready in the next couple of weeks. @svlandeg thanks for the offer to help. I'd actually love to do the implementation myself to get into the details of spaCy v3 internals, but I will absolutely reach out if I have questions.

Ibrokhimsadikov commented 2 years ago

Hello @kabirkhan,

Is spaCy v3 supported now?

XBeg9 commented 2 years ago

Any updates here? I would love to help.