UnicodeDecodeError - Githubissues

saulshanabrook commented 10 years ago

$ sudo python osxauditor.py -a -m -u -v -H log.html
DEBUG: Mac OS X Obj-C Foundation successfully imported
[INFO] Header
[INFO] Report generated by OS X Auditor v0.4.1 on 03/26/14 02:08:05 EDT running as 0/0
[INFO] Audited system path: /
[INFO] Version of the audited system: Mac OS X 10.9.2 build 13C64
[INFO] Current timezone of the audited system: America/New_York
...

[INFO] BAFEC7D3-F95B-4FA4-B52D-63DD8DDD19C7;414876165.0;com.google.Chrome.canary;Google Chrome Canary.app;https://clients2.googleusercontent.com/crx/blobs/QgAAAC6zw0qH2DJtnXe8Z7rUJP1LCBcQFXnsDBWtzc8TuHoP32t_g71nlolfDcqxMxX0MG6V426YJ_zSC0_gWDncJjfbtx0DIkVTho71ZSHVSqLsAMZSmuXi6_hEG2gwCDom6hkadLlag_KGbg/extension_37.crx;None;None;0;None;https://chrome.google.com/webstore/detail/pushbullet/chlffgpmiacpedhhbkiomidkjlcfhogd;None

[INFO] 478DBA8B-3223-4F43-875C-5FC5E67F7F37;414877643.0;com.google.Chrome.canary;Google Chrome Canary.app;https://mail-attachment.googleusercontent.com/attachment/b/476/u/0/?ui=2&ik=0e093b1064&view=att&th=14452100c25371f7&attid=0.1&disp=safe&zw&saduie=AG9B_P9aJIRk40xr9SD--gUNRcEm&sadet=1393184862873&sads=bdQ2PR7myal1bIzvrgaE6M02wiU;None;None;0;None;https://mail.google.com/mail/b/476/u/0/;None

Traceback (most recent call last):
  File "osxauditor.py", line 1666, in <module>
    Main()
  File "osxauditor.py", line 1621, in Main
    ParseQuarantines()
  File "osxauditor.py", line 437, in ParseQuarantines
    JointLSQuarantineEvent += u";" + unicode(Q)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb8 in position 4: ordinal not in range(128)
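
For reference, the message in the last line is what a strict ASCII decode produces when it meets a non-ASCII byte. On Python 2, calling unicode() on a byte string performs that decode implicitly. The sketch below reproduces the message under the assumption that Q was a byte string with 0xb8 at index 4. The traceback does not show Q's actual value, and the helper name ascii_decode_error is made up for illustration.

```python
def ascii_decode_error(raw):
    """Return the message raised by a strict ASCII decode, or None if it succeeds."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# Hypothetical input: four ASCII bytes followed by 0xb8 (index 4).
message = ascii_decode_error(b"abcd\xb8")
print(message)
```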

sroberts commented 10 years ago

@saulshanabrook Ahhh yes, I've hit that one myself. I see you've got quite the Python/Django background. Any ideas?

marpaia commented 10 years ago

I ran into this during MIDAS development. I "solved" it by adding a function called "to_ascii", which replaces non-ASCII characters with "?". It's messy and I'm not proud of it, but maybe it helps.

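def to_ascii(s):
    """
    Returns an ASCII-only version of a string, or of a dict's values
    (Python 2). Non-ASCII characters are replaced with "?".
    Returns None for any other type.
    """
    if isinstance(s, unicode):
        return s.encode("ascii", "replace")
    elif isinstance(s, str):
        # Decode explicitly first: calling encode() directly on a byte
        # string triggers an implicit strict ASCII decode, which raises
        # UnicodeDecodeError on non-ASCII bytes.
        return s.decode("ascii", "replace").encode("ascii", "replace")
    elif isinstance(s, dict):
        temp_dict = {}
        for k, v in s.iteritems():
            # Non-string values are converted with str() before cleaning.
            temp_dict[k] = to_ascii(v if isinstance(v, basestring) else str(v))
        return temp_dict
    return None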
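
An alternative that keeps the non-ASCII data rather than replacing it is to decode byte strings explicitly before concatenating. This is only a sketch: it assumes the quarantine fields are UTF-8 byte strings, which the thread does not confirm. The names to_text and join_fields are hypothetical and are not part of OS X Auditor or MIDAS.

```python
text_type = type(u"")  # unicode on Python 2, str on Python 3


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings explicitly."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        # "replace" substitutes U+FFFD for bytes that are not valid
        # in the assumed encoding instead of raising.
        return value.decode(encoding, "replace")
    # Numbers, None, and similar values: use their text representation.
    return text_type(value)


def join_fields(fields, sep=u";"):
    """Join mixed-type fields into one text line."""
    return sep.join(to_text(f) for f in fields)


# Hypothetical row mixing byte strings, a float, and None.
row = [b"caf\xc3\xa9", 414877643.0, None, b"abcd\xb8"]
joined = join_fields(row)
```

With this approach, valid UTF-8 such as the first field survives intact, and only undecodable bytes (the stray 0xb8 in the last field) become the replacement character.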

jipegit / OSXAuditor

UnicodeDecodeError #14