JulesBelveze / concepcy

💫 SpaCy wrapper for ConceptNet 💫
https://julesbelveze.github.io/concepcy/
MIT License

How to use name entity? #20

Open FYYFU opened 1 year ago

FYYFU commented 1 year ago

Hi, after reading the docs, I found that they mention:

One can notice that some words are missing. Indeed, some words might not be related to any other node from the ConceptNet base. Also, to diminish noise we have filtered out stop words, punctuation and named entities from being enriched with semantical information.

However, I do not want to filter out named entities, so I tried to install concepcy from source. But I got an error:

ValueError: [E964] The pipeline component factory for 'concepcy' needs to have the following named arguments, which are passed in by spaCy:
- nlp: receives the current nlp object and lets you access the vocab
- name: the name of the component instance, can be used to identify the component, output losses etc.

As I'm not familiar with spaCy, I wonder if there is any way to achieve my goal (not filtering out named entities)?

Thanks!

JulesBelveze commented 1 year ago

Hey @FYYFU, thanks for your interest! You should be able to build the package locally by just cloning the repo and running poetry install. Let me know if this doesn't work.

Then one temporary solution is to remove the token.ent_type != 0 check here. I will change this behaviour and make the named entity filtering a configuration parameter.
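For illustration, the configurable filtering described above might look roughly like the sketch below. This is not concepcy's actual code: `FakeToken`, `tokens_to_enrich`, and the `filter_named_entities` flag are hypothetical names, and `FakeToken` only mimics the spaCy token attributes `is_stop`, `is_punct`, and `ent_type` (where 0 means the token is not part of a named entity).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeToken:
    """Hypothetical stand-in for a spaCy Token; only the attributes used here."""
    text: str
    is_stop: bool = False
    is_punct: bool = False
    ent_type: int = 0  # 0 means "not part of a named entity", as in spaCy


def tokens_to_enrich(
    tokens: List[FakeToken], filter_named_entities: bool = True
) -> List[FakeToken]:
    """Return the tokens that would be looked up in ConceptNet."""
    kept = []
    for token in tokens:
        # Stop words and punctuation are always skipped.
        if token.is_stop or token.is_punct:
            continue
        # Named entities are skipped only when the (hypothetical) flag is on.
        if filter_named_entities and token.ent_type != 0:
            continue
        kept.append(token)
    return kept


tokens = [
    FakeToken("Paris", ent_type=384),  # arbitrary non-zero entity id
    FakeToken("is", is_stop=True),
    FakeToken("beautiful"),
    FakeToken(".", is_punct=True),
]
default_texts = [t.text for t in tokens_to_enrich(tokens)]
with_entities = [t.text for t in tokens_to_enrich(tokens, filter_named_entities=False)]
```

With the flag left at its default, "Paris" is dropped; with `filter_named_entities=False`, it is kept alongside "beautiful".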

FYYFU commented 1 year ago

Thanks for your reply!

I can build the package by cloning the repo and running poetry install.
However, when I use concepcy like this:

import spacy
import concepcy

nlp = spacy.load('en_core_web_sm')
nlp.add_pipe("concepcy")

The reported error is:

ValueError: [E964] The pipeline component factory for 'concepcy' needs to have the following named arguments, which are passed in by spaCy:
- nlp: receives the current nlp object and lets you access the vocab
- name: the name of the component instance, can be used to identify the component, output losses etc.

JulesBelveze commented 1 year ago

Oh alright! My plate is a bit full these days, but I'll try to have a look at it on Wednesday or Thursday and get back to you 😸

FYYFU commented 1 year ago

Perhaps the ConcepCyComponent in __init__.py needs another parameter, like this:

def __init__(
    self,
    nlp: Language,
    name: str,
    url: str,
    relations_of_interest: List[str],
    as_dict: bool,
    filter_edge_weight: Optional[int] = None,
    filter_missing_text: Optional[bool] = None,
):

After adding this parameter, I can build the project from source and use it.

I got this information from "Component factories and stateful components".

Is this the right approach? :)
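To see why adding the parameter makes the E964 error go away, here is a minimal sketch of a signature check of the kind the error message describes. It is a simplified illustration, not spaCy's actual validation code: `missing_factory_args` and both component classes are hypothetical, and it assumes `name` was the argument missing from the original `__init__`.

```python
import inspect
from typing import Callable, List

# The two arguments named in the E964 message.
REQUIRED_ARGS = ("nlp", "name")


def missing_factory_args(factory: Callable) -> List[str]:
    """List the required argument names absent from the factory's signature."""
    params = inspect.signature(factory).parameters
    return [arg for arg in REQUIRED_ARGS if arg not in params]


class ComponentWithoutName:
    """Hypothetical component whose __init__ lacks `name` (assumed original state)."""

    def __init__(self, nlp, url: str, as_dict: bool = False):
        self.nlp = nlp


class ComponentWithName:
    """Hypothetical component after adding `name`, as proposed above."""

    def __init__(self, nlp, name: str, url: str, as_dict: bool = False):
        self.nlp = nlp
        self.name = name


missing_before = missing_factory_args(ComponentWithoutName)
missing_after = missing_factory_args(ComponentWithName)
```

Under these assumptions, the first class is reported as missing `name`, while the second passes the check.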