Open ThinkMaize opened 8 years ago
Hey!
It doesn't look to me that the issue is that the file is too large. Have you tried these tricks? https://stackoverflow.com/questions/4923509/python-decode-strings
Or maybe could you share the string? (pastebin or similar)
I think it might be the ":" in the filename. It looks like it's allowed in Gmail, but not on Windows.
I don't think it's an issue with the filename at all, as ":" works just fine in the file above the one that caused the error, and ":" (while not technically allowed on Windows, I guess) actually seems to work based on a quick test I just did -- I can't create a file with ":" in explorer, but it works fine elsewhere.
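If the colon ever does turn out to matter (on NTFS a ":" in a path is normally read as the separator for an alternate data stream, so the file can appear to be created while the data lands somewhere unexpected), a rough sketch like the one below could rule it out. The helper name safe_filename and the "_" replacement are assumptions for illustration, not something the script already has:

```python
import re

# Characters Windows does not allow in file names, plus ASCII control characters.
_WINDOWS_RESERVED = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def safe_filename(name, replacement="_"):
    """Return name with Windows-reserved characters swapped for replacement.

    Hypothetical helper: call it on the attachment name before opening the file.
    """
    return _WINDOWS_RESERVED.sub(replacement, name)

print(safe_filename("2015_04_23_10:48.csv"))
```

Names that contain none of the reserved characters pass through unchanged.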
Base64 padding is a sequence of zero or more = characters at the end of a Base64 string. They can actually be omitted because they don't carry any data; they just make it "easier" to break the Base64-encoded string into byte-sized chunks (bad pun :disappointed:). But a2b_base64() seems to require them, or it throws an error.
Putting this right above the call to decode might fix it:
content += '=' * (-len(content) % 4)
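One caveat with the one-liner: if content still contains line breaks, len(content) counts them too and the padding can come out wrong. A slightly more defensive sketch, assuming the payload is plain ASCII Base64 text, is below. The function name decode_base64_lenient is made up for illustration, and it uses base64.b64decode rather than the decodestring call from the traceback:

```python
import base64

def decode_base64_lenient(content):
    """Decode Base64 text whose trailing '=' padding may be missing.

    Hypothetical helper; assumes content is ASCII Base64, possibly with line breaks.
    """
    if not isinstance(content, bytes):
        content = content.encode("ascii")
    # Drop line breaks and other whitespace so only Base64 characters are counted.
    content = b"".join(content.split())
    # Discard whatever padding is present, then add back exactly what is needed.
    content = content.rstrip(b"=")
    content += b"=" * (-len(content) % 4)
    return base64.b64decode(content)

print(decode_base64_lenient("aGVsbG8"))
```

This only repairs missing padding. If the data itself is truncated or corrupted (for example, the character count is one more than a multiple of 4), the decode will still raise an error.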
I am trying to use this script to extract all of the attachments from an MBOX file, and I keep receiving an error. I can't seem to figure out what is triggering the error, so I was just wondering if it might be that my MBOX file is too large. It's about 17 GB. Here's my command prompt output (your script is mbox.py):
C:\Users\Alex\Desktop\Email Attachments>python mbox.py mail.mbox
Extract attachments from mbox files
Copyright (C) 2012 Pablo Castellano
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it under certain conditions.
Attachment found! Extracting down_arrow.png (2799 bytes)
Attachment found! Extracting profilephoto.png (1286 bytes)
Attachment found! Extracting windows.png (1692 bytes)
Attachment found! Extracting keyhole.png (4422 bytes)
Attachment found! Extracting google_logo.png (12199 bytes)
Attachment found! Extracting 2015_04_23_10:48.csv (3660265 bytes)
Attachment found! Extracting 2015_04_23_11:18.csv (3660737 bytes)
Attachment found! Extracting 2015_04_23_11:48.csv (3661510 bytes)
Attachment found! Traceback (most recent call last):
  File "mbox.py", line 164, in <module>
    extract_attachment(payl)
  File "mbox.py", line 77, in extract_attachment
    content = base64.decodestring(content)
  File "C:\Python27\lib\base64.py", line 321, in decodestring
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
C:\Users\Alex\Desktop\Email Attachments>