ptwobrussell / Mining-the-Social-Web

The official online compendium for Mining the Social Web (O'Reilly, 2011)
http://bit.ly/135dHfs

the_tweet__count_entities_in_tweets.py does not work (example 5-4). #63

Closed dagsonpatrick closed 11 years ago

dagsonpatrick commented 11 years ago

Hello everyone. I'm getting an error on line 80, with the stack trace shown below. Could it be that a module is missing from my Canopy installation?


ServerError                               Traceback (most recent call last)
C:\Users\Dagson\AppData\Local\Enthought\Canopy32\App\appdata\canopy-1.0.3.1262.win-x86\lib\site-packages\IPython\utils\py3compat.pyc in execfile(fname, glob, loc)
    174             else:
    175                 filename = fname
--> 176             exec compile(scripttext, filename, 'exec') in glob, loc
    177     else:
    178         def execfile(fname, *where):

C:\Users\Dagson\Desktop\TCC-Pratica\the_tweet__count_entities_in_tweets.py in <module>()
     78
     79 entities_freqs = sorted([(row.key, row.value) for row in
---> 80     db.view('index/entity_count_by_doc', group=True)],    <-- error here
     81     key=lambda x: x[1], reverse=True)
     82

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.pyc in __iter__(self)
   1155
   1156     def __iter__(self):
-> 1157         return iter(self.rows)
   1158
   1159     def __len__(self):

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.pyc in rows(self)
   1174         """
   1175         if self._rows is None:
-> 1176             self._fetch()
   1177         return self._rows
   1178

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.pyc in _fetch(self)
   1161
   1162     def _fetch(self):
-> 1163         data = self.view._exec(self.options)
   1164         wrapper = self.view.wrapper or Row
   1165         self._rows = [wrapper(row) for row in data['rows']]

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.pyc in _exec(self, options)
   1027
   1028     def _exec(self, options):
-> 1029         _, _, data = _call_viewlike(self.resource, options)
   1030         return data
   1031

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.pyc in _call_viewlike(resource, options)
   1086         return resource.post_json(body=keys, **_encode_view_options(options))
   1087     else:
-> 1088         return resource.get_json(**_encode_view_options(options))
   1089
   1090

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.pyc in get_json(self, path, headers, **params)
    521
    522     def get_json(self, path=None, headers=None, **params):
--> 523         return self._request_json('GET', path, headers=headers, **params)
    524
    525     def post_json(self, path=None, body=None, headers=None, **params):

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.pyc in _request_json(self, method, path, body, headers, **params)
    544     def _request_json(self, method, path=None, body=None, headers=None, **params):
    545         status, headers, data = self._request(method, path, body=body,
--> 546                                               headers=headers, **params)
    547         if 'application/json' in headers.get('content-type'):
    548             data = json.decode(data.read())

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.pyc in _request(self, method, path, body, headers, **params)
    540         return self.session.request(method, url, body=body,
    541                                     headers=all_headers,
--> 542                                     credentials=self.credentials)
    543
    544     def _request_json(self, method, path=None, body=None, headers=None, **params):

C:\Users\Dagson\AppData\Local\Enthought\Canopy32\User\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.pyc in request(self, method, url, body, headers, credentials, num_redirects)
    396                 raise PreconditionFailed(error)
    397             else:
--> 398                 raise ServerError((status, error))
    399
    400         # Store cachable responses

ServerError: (500, (u'EXIT', u'{{badmatch,[]},\n [{couch_query_servers,new_process,3},\n {couch_query_servers,lang_proc,3},\n {couch_query_servers,handle_call,3},\n {gen_server,handle_msg,5},\n {proc_lib,init_p_do_apply,3}]}'))

dagsonpatrick commented 11 years ago

I bought the book Mining the Social Web to help me collect data for my final course project. I'm Brazilian and an Information Systems student.

Thanks

ptwobrussell commented 11 years ago

@dagsonpatrick - I'd be glad to try and help you in any way that I can, although I don't have an environment set up with Canopy on it and have never used it myself.

My guess from looking at your stack trace is that you most likely have a configuration problem with CouchDB trying to execute a Python view. Did you modify the CouchDB configuration file to register "couchpy" as explained on/around page 53 (of the English translation, if that is the one you are using)? The gist is to add a [query_servers] section containing the line python = /path/to/couchpy to your CouchDB configuration file. I'm not sure about Canopy, but I think ActivePython on Windows puts the couchpy script at C:\PythonXY\Scripts (where XY is the version of Python you are using).
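As a rough way to check that entry, here is a minimal Python sketch that reads the text of a CouchDB .ini file and reports which command, if any, is registered for Python views. The function name find_python_query_server and the sample file contents are illustrative assumptions, not something from the book, and the sketch only reads the file; it does not confirm that CouchDB has actually picked up the setting.

```python
import configparser


def find_python_query_server(ini_text):
    """Return the command under [query_servers] python, or None if absent.

    Hypothetical helper for illustration only.
    """
    # interpolation=None keeps '%' and backslashes in Windows paths untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("query_servers"):
        return None
    return parser.get("query_servers", "python", fallback=None)


# Made-up sample resembling a local.ini with the Python query server registered.
sample = r"""
[httpd]
port = 5984

[query_servers]
python = C:\Python27\Scripts\couchpy
"""

print(find_python_query_server(sample))
```

To inspect a real installation, you would pass in the contents of your own local.ini (for example, read with open(path).read()) in place of the sample string.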

ptwobrussell commented 11 years ago

Also, you might benefit from just knowing that a 2nd Edition of the book is just around the corner and a lot of work has gone into making it (and its code) better than ever. You might enjoy browsing its source code repository to check it out and see if you'd prefer working with it in addition to the 1st Edition's code:

https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition

dagsonpatrick commented 11 years ago

Thanks ptwobrussell

I have now switched to ActivePython 2.7.

I ran easy_install couchdb in the terminal and it completed without errors.

My CouchDB installation has two configuration files, default.ini and local.ini, located in the directory C:\Program Files (x86)\Apache Software Foundation\CouchDB\etc\couchdb.

I inserted the following lines in both files, as the book explains:

[query_servers]
python = /C:/Python27/Scripts/couchpy

It still gives the error below:

Traceback (most recent call last):
  File "C:\Users\Dagson\Desktop\TCC-Pratica\the_tweet__count_entities_in_tweets.py", line 80, in <module>
    db.view('index/entity_count_by_doc', group=True)],
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.py", line 1157, in __iter__
    return iter(self.rows)
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.py", line 1176, in rows
    self._fetch()
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.py", line 1163, in _fetch
    data = self.view._exec(self.options)
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.py", line 1029, in _exec
    _, _, data = _call_viewlike(self.resource, options)
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\client.py", line 1088, in _call_viewlike
    return resource.get_json(**_encode_view_options(options))
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.py", line 523, in get_json
    return self._request_json('GET', path, headers=headers, **params)
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.py", line 546, in _request_json
    headers=headers, **params)
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.py", line 542, in _request
    credentials=self.credentials)
  File "C:\Python27\lib\site-packages\couchdb-0.9-py2.7.egg\couchdb\http.py", line 398, in request
    raise ServerError((status, error))
ServerError: (500, (u'EXIT', u'{{badmatch,[]},\n [{couch_query_servers,new_process,3},\n {couch_query_servers,lang_proc,3},\n {couch_query_servers,handle_call,3},\n {gen_server,handle_msg,5},\n {proc_lib,init_p_do_apply,3}]}'))

ptwobrussell commented 11 years ago

This is not an easy thing to debug through back and forth messages, but let's try to work through it. Let me first just ask you a few simple diagnostic questions:

I really think your issue has to do with your configuration and we'll narrow it down. I'm sorry that you are having troubles with this. I know it's a pain and will do whatever I can to help you through it.

dagsonpatrick commented 11 years ago

Answering your diagnostic questions:

• I can open http://localhost:5984/_utils in my browser without problems and see the CouchDB web interface, with the databases already created and documents stored in them.

• The argument normally passed through sys.argv[1] was set as in the lines below:

DB = 'tweets-user-timeline-dagsonmg'
server = couchdb.Server('http://localhost:5984')
db = server[DB]
FREQ_THRESHOLD = 3

Ref.: https://github.com/ptwobrussell/Mining-the-Social-Web/blob/master/python_code/the_tweet__count_entities_in_tweets.py

• I confirm that the database exists and has data that were generated by the previous examples.

• I ran the previous examples with CouchDB, and this is the only one that has had a problem.

• I changed the value in local.ini and default.ini to:

[query_servers]
python = C:\Python27\Scripts\couchpy

• When I type couchpy.exe at my command prompt, the window title changes to couchpy.exe and it waits for input. If I press Enter without typing anything, it displays the message: No handlers could be found for logger "couchdb.view". Could that indicate a mistake?

I appreciate your understanding and help so far. I will carry on with the exercises in the book.

ptwobrussell commented 11 years ago

So to be clear, you have successfully run Example 5-3 and other examples that have employed a ViewDefinition that allows you to use Python to query CouchDB? That's the main thing I am interested in knowing. It sounds like the answer is that you have.

Assuming that's the case, your configuration should be OK, and the problem is likely that the code in the entityCountMapper is somehow triggering an error that CouchDB does not handle well, hence the arcane message. At this point I would recommend a couple of things: 1) Try wrapping some of the logic in the entityCountMapper in a try/except block, so that if any error happens you can figure out what it is. Check the CouchDB logs to see if you are able to log anything by writing to sys.stdout or sys.stderr. 2) You could also isolate a very small portion of the data that you are querying (as few as a couple of tweets' worth of data versus hundreds or thousands of them) and try to run the example. If it runs OK, the problem is probably data related.
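To make both suggestions concrete, here is a minimal sketch under stated assumptions. The function entity_count_mapper_safe is a hypothetical, simplified stand-in for the book's entityCountMapper (it skips the fallback that extracts entities from the tweet text), and the sample documents are made up. It writes diagnostics to sys.stderr only; whether that output ends up in the CouchDB logs depends on your setup. The small loop at the end runs the mapper locally over a couple of documents, with no CouchDB involved.

```python
def entity_count_mapper_safe(doc):
    # Hypothetical, simplified variant of the book's entityCountMapper.
    # The import sits inside the function so the mapper stays self-contained.
    import sys

    try:
        entities = doc.get('entities') or {}
        for mention in entities.get('user_mentions', []):
            yield ('@' + mention['screen_name'].lower(), [doc['_id'], doc['id']])
        for hashtag in entities.get('hashtags', []):
            yield ('#' + hashtag['text'], [doc['_id'], doc['id']])
    except Exception as exc:
        # Report which document failed and why, instead of crashing the view.
        sys.stderr.write('mapper failed on %r: %r\n' % (doc.get('_id'), exc))


# Made-up sample data: one well-formed document and one malformed on purpose.
sample_docs = [
    {'_id': 'a', 'id': 1,
     'entities': {'user_mentions': [{'screen_name': 'Foo'}],
                  'hashtags': [{'text': 'bar'}]}},
    {'_id': 'b', 'id': 2, 'entities': {'user_mentions': [{}]}},
]

# Run the mapper locally over a handful of documents.
for doc in sample_docs:
    print(doc['_id'], list(entity_count_mapper_safe(doc)))
```

The well-formed document yields its entity keys, while the malformed one yields nothing and produces a message on stderr that names the offending document, which is the kind of signal that helps narrow a problem down to the data.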

Are you able to try out either of these ideas?

If you are able to somehow provide your data to me or give me the exact query you are running, I could try to reproduce this problem, although there is no guarantee I'd be able to reproduce it (which might also be helpful to discover as a step in debugging.)

dagsonpatrick commented 11 years ago

Hello ptwobrussell,

I'm reporting that this issue has been resolved. I woke up this morning with a clear head and made the following change in the file default.ini:

[query_servers]
python = C:\Python27\Scripts\couchpy

After I turned the computer off and on again, CouchDB started up, and I ran the code from Example 5-4 of the book, which then ran normally.

Thanks for your help

ptwobrussell commented 11 years ago

That's interesting. I'm glad that it worked out for you. In the end, the fix amounted to just restarting the services?

Let me know if there's anything else you need along the way.

andreedu commented 10 years ago

This tip "python = C:\Python27\Scripts\couchpy" saved my day. Thanks a lot!

ptwobrussell commented 10 years ago

@andreedu - Glad that this thread helped. Definitely check out the 2nd Edition's source code, though. MongoDB has turned out to be a really great choice that's a lot simpler to work with overall.

You can preview the source code from here as viewable IPython Notebooks - https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition

I'd also highly encourage you to look at Appendix A - http://nbviewer.ipython.org/urls/raw.github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/_Appendix%20A%20-%20Virtual%20Machine%20Experience.ipynb