Open mauuuuu5 opened 1 year ago
Hello, thanks for writing. Yes, I put only selected code in the repo, not everything from the book. Could you send a screenshot or just copy and paste the output? This is a small code fragment, so it should work without much trouble.
Hi, thank you for the reply. This is an image of the output.
Cheers
Hi everyone, I am copying the code from the book that processes the NY Times article, but I cannot reproduce the book's output. Also, the code is not in chapter 03.
from bs4 import BeautifulSoup
import requests
import spacy

def url_text(url_string):
    # Download the page and parse it (the 'html5lib' parser must be installed)
    res = requests.get(url_string)
    html = res.text
    soup = BeautifulSoup(html, 'html5lib')
    # Drop script, style, and aside elements so only the visible text remains
    for script in soup(["script", "style", 'aside']):
        script.extract()
    text = soup.get_text()
    # Collapse all runs of whitespace into single spaces
    return " ".join(text.split())

ny_art = url_text("https://www.nytimes.com/2021/01/12/opinion/trump-america-allies.html")
nlp = spacy.load("en_core_web_md")
doc = nlp(ny_art)
len(doc.ents)

# Count how often each entity label occurs
from collections import Counter
labels = [ent.label_ for ent in doc.ents]
Counter(labels)
Thank you
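For anyone comparing their numbers with the book's, it may help to look at what the fetch actually returned before blaming the NLP step. Sites can serve different HTML to scripts than to browsers, and entity counts can shift between spaCy model versions, so both are worth ruling out. Below is a minimal sketch using only the standard library. The helper names `fetch_report` and `label_table` are hypothetical (they are not from the book or the repo), and the values passed to them here are made-up placeholders, not real output.

```python
from collections import Counter


def fetch_report(status_code, text, preview_chars=200):
    """Summarize what a page fetch returned, before any NLP runs on it.

    Hypothetical helper for this thread, not part of the book's code.
    """
    return {
        "status_code": status_code,       # e.g. res.status_code from requests
        "n_chars": len(text),             # a very short text hints at a blocked page
        "n_words": len(text.split()),
        "preview": text[:preview_chars],  # first characters, to eyeball the content
    }


def label_table(labels):
    """Return (label, count) pairs, most frequent first, easy to paste into a reply."""
    return Counter(labels).most_common()


# Placeholder inputs for illustration only, not real article text or spaCy output.
report = fetch_report(200, "Example page text that stands in for the scraped article.")
print(report)
print(label_table(["PERSON", "GPE", "PERSON", "ORG", "GPE", "PERSON"]))
```

With the script posted above, this would mean passing `res.status_code` and the cleaned text to `fetch_report`, and the `labels` list to `label_table`. Pasting both results as text, together with `spacy.__version__`, makes it easier to compare against the book than a screenshot does.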