ptwobrussell / Mining-the-Social-Web-2nd-Edition

The official online compendium for Mining the Social Web, 2nd Edition (O'Reilly, 2013)
http://bit.ly/135dHfs

Ch9 Ex24, getting error #202

Open richsiemers opened 10 years ago

richsiemers commented 10 years ago

I received an error when running Ch. 9, Ex. 24. Any thoughts? The error is listed below. Thanks.


LookupError Traceback (most recent call last)

in ()
    116
    117 sample_url = 'http://radar.oreilly.com/2013/06/phishing-in-facebooks-pond.html'
--> 118 summary = summarize(url=sample_url)
    119
    120 # Alternatively, you can pass in HTML if you have it. Sometimes this approach may be

in summarize(url, html, n, cluster_threshold, top_sentences)
     80     txt = extractor.getText()
     81
---> 82     sentences = [s for s in nltk.tokenize.sent_tokenize(txt)]
     83     normalized_sentences = [s.lower() for s in sentences]
     84

/usr/local/lib/python2.7/dist-packages/nltk/tokenize/__init__.pyc in sent_tokenize(text)
     73     (currently :class:`.PunktSentenceTokenizer`).
     74     """
---> 75     tokenizer = load('tokenizers/punkt/english.pickle')
     76     return tokenizer.tokenize(text)
     77

/usr/local/lib/python2.7/dist-packages/nltk/data.pyc in load(resource_url, format, cache, verbose, logic_parser, fstruct_parser)
    603     # Load the resource.
    604     if format == 'pickle':
--> 605         resource_val = pickle.load(_open(resource_url))
    606     elif format == 'yaml':
    607         import yaml

/usr/local/lib/python2.7/dist-packages/nltk/data.pyc in _open(resource_url)
    684
    685     if protocol is None or protocol.lower() == 'nltk':
--> 686         return find(path).open()
    687     elif protocol.lower() == 'file':
    688         # urllib might not use mode='rb', so handle this one ourselves:

/usr/local/lib/python2.7/dist-packages/nltk/data.pyc in find(resource_name)
    465     sep = '*'*70
    466     resource_not_found = '\n%s\n%s\n%s' % (sep, msg, sep)
--> 467     raise LookupError(resource_not_found)
    468
    469 def retrieve(resource_url, filename=None, verbose=True):

LookupError:
---
  Resource 'tokenizers/punkt/english.pickle' not found. Please use the NLTK Downloader to obtain the resource: >>> nltk.download()
  Searched in:
    - '/root/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
---
ptwobrussell commented 10 years ago

Are you running the code in the VM that's provided or on your machine in a Python interpreter?

In either case, can you try running nltk.download()? (If you are in an IPython Notebook, do it in a new cell.)
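For anyone who would rather fetch only the missing tokenizer than open the interactive downloader, here is a minimal sketch. It assumes the package identifier `punkt` corresponds to the `tokenizers/punkt` path named in the traceback; that identifier may differ in other NLTK versions. The helper name `ensure_nltk_resource` and its `find`/`download` parameters are hypothetical and exist only so the logic can be tried without NLTK installed.

```python
def ensure_nltk_resource(resource_path, package, find=None, download=None):
    """Download `package` only if `resource_path` cannot be found.

    `find` and `download` default to nltk.data.find and nltk.download.
    They are parameters purely for illustration and testing.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be defined without NLTK
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)   # raises LookupError when the resource is absent
        return False          # already installed, nothing downloaded
    except LookupError:
        download(package)     # fetch the missing package
        return True           # a download was attempted


# Usage (needs NLTK installed and network access):
# ensure_nltk_resource('tokenizers/punkt', 'punkt')
```

Checking first avoids a network round trip on every run, which may matter if the notebook is re-executed often.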

richsiemers commented 10 years ago

Running through the notebook. I ran the other examples in Ch. 9 without issue.

Will try the download as you suggest


ptwobrussell commented 10 years ago

If you are working in ipynb, a notebook from a previous chapter (5 or 6) should have a non-numbered cell that you can execute to trigger the download of various NLTK dependencies. It is a minor omission that the cell isn't replicated in Chapter 9 for folks who haven't worked through the book chapter by chapter. I'll make a note to add that in.
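As a rough illustration of what such a setup cell could look like (not the actual cell from the earlier notebooks), the sketch below downloads a list of packages and reports which ones succeeded. The list is an assumption: only `punkt` is confirmed by the traceback in this thread, and the names `REQUIRED_PACKAGES` and `download_all` are hypothetical.

```python
# Hypothetical package list: only 'punkt' is confirmed by the traceback
# in this thread. Extend it with whatever the chapter's examples need.
REQUIRED_PACKAGES = ['punkt']


def download_all(packages, download=None):
    """Try to download each package and return {package: succeeded}."""
    if download is None:
        import nltk  # imported lazily; needed only for the real download
        download = nltk.download
    results = {}
    for package in packages:
        # nltk.download returns True on success and False on failure
        results[package] = bool(download(package))
    return results


# In a notebook setup cell (needs NLTK installed and network access):
# print(download_all(REQUIRED_PACKAGES))
```

Rerunning a cell like this should be harmless, since the NLTK downloader reports packages that are already up to date instead of fetching them again.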


richsiemers commented 10 years ago

I did the download as suggested and the example worked. Thanks for the input.

Regards


petermc129 commented 10 years ago

I encountered the same problem since I went from Chapter 1 to Chapter 9. I should have checked the forum a few hours ago. A note in the instructions to use nltk.download() would have been helpful.