Closed weichsl closed 11 years ago
To help you troubleshoot this, I have a few questions:
Before we get too much further, I just want to make sure that these two questions have reasonable answers. Thanks!
Thank you very much :)
Here is what I pass as arguments on the command line: "C:\Users\Christian\Desktop\enron.mbox" "C:\Users\Christian\Desktop\enron.json"
I also verified that the enron.mbox file really is an mbox file. I downloaded it from the location you referenced:
http://zaffra.com/static/matthew/enron.mbox.gz
Usually I use Eclipse for experimenting with Python, but here is what I got on the command line:
C:\Users\Christian\workspace\Mail Datamining>python mailboxes__jsonify_mbox.py "C:\Users\Christian\Desktop\enron.mbox" "C:\Users\Christian\Desktop\enron.json"
Traceback (most recent call last):
File "mailboxes__jsonify_mbox.py", line 89, in
Can you confirm that you have the latest version of the script from https://github.com/ptwobrussell/Mining-the-Social-Web/blob/master/python_code/mailboxes__jsonify_mbox.py ? It looks like you are using an earlier version. Out of curiosity, where did you get this version? Did you copy it out of the book line by line, or was this from a previous download of the GitHub archive and perhaps you just didn't update it in a while?
I'm using the latest version of the script. I'm in sync with your github repository.
Hmm. Your stack trace doesn't match the link to the latest file I posted though. See what I mean?
I see. I might have used another version. Now, when using the most recent script, the following error occurs:
mbox = mailbox.UnixMailbox(open(MBOX, 'rb'), email.message_from_file)
^
After correcting the indentation level:
Traceback (most recent call last):
File "C:\Users\Christian\Documents\GitHub\Mining-the-Social-Web\python_code\mailboxes__jsonify_mbox.py", line 73, in
I'm sorry about that indentation error. I think it must have been introduced through a pull request that I accepted a while back, and I haven't run the code myself since then, so it went unnoticed.
Back to your issue: I just figured out what is going on. I was originally developing with Python 2.6 and was trying to use json2 as an import to speed up serialization into JSON, which worked fine. Then I got a pull request to use a generator, which was also a great idea, except that the default json package that comes with Python 2.7 cannot serialize a generator object directly. Hence the need to patch this.
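For anyone hitting the same error before picking up the patched script, here is a minimal sketch of two possible workarounds. It is not necessarily the change that went into the repository. The name gen_json_msgs is taken from the traceback, but its body here, and the IterEncoder class, are hypothetical stand-ins for illustration.

```python
import json


def gen_json_msgs(subjects):
    # Hypothetical stand-in for the script's generator: yields one
    # JSON-friendly dict per message.
    for subject in subjects:
        yield {"Subject": subject}


# Option 1: materialize the generator into a list before dumping.
as_list = json.dumps(list(gen_json_msgs(["hello", "world"])), indent=4)


# Option 2: give the encoder a fallback that turns any iterable into a list.
class IterEncoder(json.JSONEncoder):
    def default(self, o):
        try:
            return list(o)
        except TypeError:
            # Not iterable: defer to the base class, which raises TypeError.
            return json.JSONEncoder.default(self, o)


via_encoder = json.dumps(gen_json_msgs(["hello", "world"]),
                         cls=IterEncoder, indent=4)
```

Both options load every message into memory before writing, so they give up the streaming benefit of the generator. For a very large mbox, writing one message at a time may be a better fit.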
Thank you for this feedback. It was very helpful, and I'm glad we got this sorted out. I hope it didn't cause you too much trouble.
Thanks for resolving this problem so quickly. I really appreciate this!
When executing the provided script, I encounter the following error:
Traceback (most recent call last):
File "C:\Users\Christian\workspace\Mail Datamining\mailboxes__jsonify_mbox.py", line 89, in
json.dump(json_msgs,open(OUT_FILE, 'wb'), indent=4)
File "C:\Python27\lib\json\__init__.py", line 181, in dump
for chunk in iterable:
File "C:\Python27\lib\json\encoder.py", line 436, in _iterencode
o = _default(o)
File "C:\Python27\lib\json\encoder.py", line 178, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <generator object gen_json_msgs at 0x024BFB98> is not JSON serializable
I'm new to Python, but it seems that the code is trying to serialize the function itself instead of the objects it returns.
Thank you very much for any help!
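A small sketch that may help pin down the guess above: calling a generator function does not return its items but a generator object, and that object is what the json module rejects. The body of gen_json_msgs below is a hypothetical stand-in, not the code from the script.

```python
import inspect
import json


def gen_json_msgs():
    # Hypothetical stand-in: the real function yields parsed messages.
    yield {"Subject": "hello"}


# Calling the function runs none of its body yet; it only builds a generator.
result = gen_json_msgs()

is_function = inspect.isgeneratorfunction(gen_json_msgs)
is_generator_object = inspect.isgenerator(result)

# The standard encoder has no rule for generator objects, so this raises.
try:
    json.dumps(result)
    error = None
except TypeError as exc:
    error = str(exc)
```

So what reaches json.dump is neither the function nor the messages, but the not-yet-consumed generator in between.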