mrdbourke / tensorflow-deep-learning

All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
https://dbourke.link/ZTMTFcourse
MIT License

The spaCy code in the SkimLit exercise is giving an error #438

Open arghanath007 opened 2 years ago

arghanath007 commented 2 years ago

The code giving the error:

from spacy.lang.en import English
nlp = English() 
sentencizer = nlp.create_pipe("sentencizer") 
nlp.add_pipe(sentencizer) 
doc = nlp(example_abstracts[0]["abstract"]) 
abstract_lines = [str(sent) for sent in list(doc.sents)]
abstract_lines

This code is from the notebook: https://github.com/mrdbourke/tensorflow-deep-learning/blob/main/09_SkimLit_nlp_milestone_project_2.ipynb

The error occurs when trying to make predictions on custom RCT data.

The Error


ValueError                                Traceback (most recent call last)

<ipython-input-285-931098a41489> in <module>
      3 nlp = English() # setup English sentence parser
      4 sentencizer = nlp.create_pipe("sentencizer") # create sentence splitting pipeline object
----> 5 nlp.add_pipe(sentencizer) # add sentence splitting pipeline object to sentence parser
      6 doc = nlp(example_abstracts[0]["abstract"]) # create "doc" of parsed sequences, change index for a different abstract
      7 abstract_lines = [str(sent) for sent in list(doc.sents)] # return detected sentences from doc in string type (not spaCy token type)

/usr/local/lib/python3.7/dist-packages/spacy/language.py in add_pipe(self, factory_name, name, before, after, first, last, source, config, raw_config, validate)
    771             bad_val = repr(factory_name)
    772             err = Errors.E966.format(component=bad_val, name=name)
--> 773             raise ValueError(err)
    774         name = name if name is not None else factory_name
    775         if name in self.component_names:

ValueError: [E966] `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy.pipeline.sentencizer.Sentencizer object at 0x7fe326b3ba00> (name: 'None').

- If you created your component with `nlp.create_pipe('name')`: remove nlp.create_pipe and call `nlp.add_pipe('name')` instead.

- If you passed in a component like `TextCategorizer()`: call `nlp.add_pipe` with the string name instead, e.g. `nlp.add_pipe('textcat')`.

- If you're using a custom component: Add the decorator `@Language.component` (for function components) or `@Language.factory` (for class components / factories) to your custom component and assign it a name, e.g. `@Language.component('your_name')`. You can then run `nlp.add_pipe('your_name')` to add it to the pipeline.

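The code I wrote, which works for predicting on custom RCT data:

from spacy.lang.en import English

nlp = English()  # set up a blank English pipeline
nlp.add_pipe("sentencizer")  # pass the string name of the built-in sentence splitter, as E966 suggests
doc = nlp(example_abstracts[0]["abstract"])  # parse the abstract; change the index for a different abstract
abstract_lines = [str(sent) for sent in doc.sents]  # detected sentences as plain strings (not spaCy spans)
abstract_lines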
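
If the notebook might run under either an older or a newer spaCy install, the two call styles from the snippets above can be wrapped in one small helper. This is only a sketch, not something from the course notebook: the name `add_sentencizer` is hypothetical, and it assumes that the older `add_pipe` raises a `ValueError` when handed a string, so the fallback to `create_pipe` only triggers there.

```python
def add_sentencizer(nlp):
    """Add spaCy's built-in sentencizer to `nlp` (hypothetical helper).

    `nlp` is expected to be a spaCy Language object such as `English()`.
    """
    # Skip the work if a sentencizer is already in the pipeline.
    if "sentencizer" in nlp.pipe_names:
        return nlp
    try:
        # Newer style (the one E966 asks for): pass the factory's string name.
        nlp.add_pipe("sentencizer")
    except ValueError:
        # Assumption: older releases reject a string here, so build the
        # component object first and add that instead.
        nlp.add_pipe(nlp.create_pipe("sentencizer"))
    return nlp
```

With spaCy installed, it would be used as `nlp = add_sentencizer(English())`, after which `nlp(example_abstracts[0]["abstract"]).sents` gives the sentences as before.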
