explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

How to add the parser for specific documents? #2114

Closed meshiguge closed 6 years ago

meshiguge commented 6 years ago

Since match_condition only needs the tokenizer and tagger, I disabled the parser when loading the pipeline used with nlp.pipe. How can I run the parser on specific documents only?

import spacy
from spacy.pipeline import DependencyParser

nlp = spacy.load('en', disable=['parser', 'ner', 'textcat'])
lines = (line.strip() for line in open(file))
docs = nlp.pipe(lines, batch_size=...)
parser = DependencyParser(nlp.vocab)

for doc in docs:
    if match_condition:
        doc_parser = parser(doc)  # this line raises the error
        do_something(doc_tag)

This causes the following error:

    doc_parser = parser(doc)
  File "nn_parser.pyx", line 340, in spacy.syntax.nn_parser.Parser.__call__
  File "nn_parser.pyx", line 403, in spacy.syntax.nn_parser.Parser.parse_batch
  File "nn_parser.pyx", line 723, in spacy.syntax.nn_parser.Parser.get_batch_model
TypeError: 'bool' object is not iterable


honnibal commented 6 years ago

@meshiguge The problem is that the parser model isn't loaded, because you disabled the parser during load. You could call parser.from_disk() with the parser's subdirectory in the model directory, but the following should be easier:

nlp = spacy.load('en')
parser = nlp.parser
disabled = nlp.disable_pipes('parser', 'ner', 'textcat')

Btw, you might still want to use pipe with the parser, like so:

lines = (line.strip() for line in open(file))
docs = nlp.pipe(lines, batch_size=...)
match_docs = (doc for doc in docs if match_condition)
for doc in parser.pipe(match_docs):
    do_something(doc_tag)
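
The pattern in this reply (a cheap stage over every line, then the expensive stage only over the documents that match) can be sketched without spaCy to show how the chained generators behave. This is only an illustration under stated assumptions: cheap_stage, expensive_stage, the dict-based "docs" and the example match_condition are hypothetical stand-ins for nlp.pipe, parser.pipe, Doc objects and your own filter. None of them are spaCy API.

```python
# Stand-ins only: cheap_stage plays the role of nlp.pipe with the parser
# disabled, expensive_stage plays the role of parser.pipe. Neither is spaCy API.

def cheap_stage(lines):
    """Lazily 'tag' every line (here: just split it into tokens)."""
    for line in lines:
        yield {"text": line, "tokens": line.split(), "parsed": False}


def expensive_stage(docs, seen):
    """Lazily 'parse' each doc it receives and record that it ran."""
    for doc in docs:
        seen.append(doc["text"])  # track which docs paid the expensive cost
        doc["parsed"] = True
        yield doc


def match_condition(doc):
    """Hypothetical filter: keep docs that contain the token 'parse'."""
    return "parse" in doc["tokens"]


lines = ["please parse me", "skip this one", "parse this too"]
seen = []

docs = cheap_stage(lines)  # generator: nothing has run yet
match_docs = (doc for doc in docs if match_condition(doc))
# Consuming the last generator drives all earlier stages, one doc at a time.
results = list(expensive_stage(match_docs, seen))

print([doc["text"] for doc in results])
```

Because every stage is a generator, the filter has to be passed on to the expensive stage (match_docs, not docs). Otherwise every document would be parsed and the filtering would have no effect.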
