wikit-ai / olaf


Check and test Spacy pipeline components registration #19

Closed ba-talibe closed 5 months ago

ba-talibe commented 5 months ago

When I re-run the code that registers the TokenSelectorComponent component after it has already been run once:

@spacy.language.Language.factory("token_selector")
class TokenSelectorComponent:
    ...

I get an error: TypeError: <class '__main__.TokenSelectorComponent'> is a built-in class

I am not sure how spaCy handles registering code as pipeline components. I suspect the component is now registered in my virtual environment, so we should test this in a fresh virtual environment. We might also need to define a specific process to ensure that all defined pipeline components are registered when running the app.
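One possible shape for such a process, as a minimal sketch rather than a confirmed fix: check whether the factory name is already registered before registering it, so that re-running the code does not register it a second time. The helper name `register_token_selector` and the pass-through component body are hypothetical, since the real component code is elided above. It is also an assumption that the error comes from running the registration twice in the same interactive session.

```python
import spacy
from spacy.language import Language


class TokenSelectorComponent:
    """Placeholder body; the real component logic is elided in this issue."""

    def __init__(self, nlp, name):
        # spaCy expects component factories to accept `nlp` and `name`.
        self.name = name

    def __call__(self, doc):
        # Pass-through: return the Doc unchanged.
        return doc


def register_token_selector():
    """Hypothetical helper: register the factory only if the name is free."""
    if not Language.has_factory("token_selector"):
        Language.factory("token_selector", func=TokenSelectorComponent)


register_token_selector()
register_token_selector()  # the second call skips registration

# Check that the registered name can be resolved in a blank pipeline.
nlp = spacy.blank("en")
nlp.add_pipe("token_selector")
doc = nlp("a short test sentence")
```

Running the same script in a fresh virtual environment would show whether the registration is tied to the environment or only to the running Python process.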

My limited understanding is that spaCy registers the component's name and links it to the code location, so that the component can be used later and the pipeline can be serialized. Some resources that may help us understand this better: