mandiant / GeoLogonalyzer

GeoLogonalyzer is a utility to analyze remote access logs for anomalies such as travel feasibility and data center sources.
Apache License 2.0

Unhandled exception when ASN names contain non-ASCII characters #2

Closed by readshaw 6 years ago

readshaw commented 6 years ago

Thanks for developing this helpful tool.

When running with some of my own test data, I encountered an encoding issue with ASN names containing non-ASCII, UTF-8 encoded characters.
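For context, here is a minimal sketch of the likely failure mode. It assumes the tool was running under Python 2, where calling str() on a unicode value implicitly encodes it with the ASCII codec. The ASN name is a made-up example rather than data from my logs, and the explicit .encode("ascii") call stands in for that implicit step so the snippet also runs on Python 3.

```python
# Hypothetical ASN name containing non-ASCII characters (illustrative only).
asn_name = u"Telef\u00f3nica de Espa\u00f1a"


def ascii_encode_fails(text):
    """Return True if encoding text with the ASCII codec raises an error."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


# Encoding explicitly as UTF-8, as the suggested fix does, succeeds.
utf8_bytes = asn_name.encode("utf-8")

print(ascii_encode_fails(asn_name))        # non-ASCII name
print(ascii_encode_fails(u"Example ISP"))  # plain ASCII name
print(len(asn_name), len(utf8_bytes))      # characters vs. UTF-8 bytes
```

The byte count is larger than the character count because each accented character takes two bytes in UTF-8.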

Here's a suggested fix to diff_dict_to_list() to address this issue:

def diff_dict_to_list(logon_diff_dict):
    """Convert logon_diff_dict to list for printing"""

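    # ASN names may contain non-ASCII characters. Treat None as an empty
    # string, then encode explicitly as UTF-8 instead of passing through str().
    first_asn_name = logon_diff_dict.get("first_asn_name", "")
    if first_asn_name is None:
        first_asn_name = ""
    first_asn_name = first_asn_name.encode('utf-8').strip()

    # Same handling for the second logon's ASN name.
    second_asn_name = logon_diff_dict.get("second_asn_name", "")
    if second_asn_name is None:
        second_asn_name = ""
    second_asn_name = second_asn_name.encode('utf-8').strip()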

    return ([str(logon_diff_dict.get("user", "")),
             str(logon_diff_dict.get("anomalies_string", "")),
             str(logon_diff_dict.get("first_time", "")),
             str(logon_diff_dict.get("first_ip", "")),
             str(logon_diff_dict.get("first_ip_dch_company", "")),
             str(logon_diff_dict.get("first_country", "")),
             str(logon_diff_dict.get("first_subdivision", "")),
             str(logon_diff_dict.get("first_location", "")),
             str(logon_diff_dict.get("first_asn_number", "")),
             first_asn_name,
             str(logon_diff_dict.get("first_client", "")),
             str(logon_diff_dict.get("first_hostname", "")),
             str(logon_diff_dict.get("first_streak", "")),
             str(logon_diff_dict.get("second_time", "")),
             str(logon_diff_dict.get("second_ip", "")),
             str(logon_diff_dict.get("second_ip_dch_company", "")),
             str(logon_diff_dict.get("second_country", "")),
             str(logon_diff_dict.get("second_subdivision", "")),
             str(logon_diff_dict.get("second_location", "")),
             str(logon_diff_dict.get("second_asn_number", "")),
             second_asn_name,
             str(logon_diff_dict.get("second_client", "")),
             str(logon_diff_dict.get("second_hostname", "")),
             str(logon_diff_dict.get("location_miles_diff", "")),
             str(logon_diff_dict.get("time_seconds_diff", "")),
             str(logon_diff_dict.get("miles_per_hour", ""))])
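
The same idea can be sketched more compactly for Python 3, where str is already Unicode text. This is not the project's code: to_text and FIELD_ORDER are hypothetical names, and treating byte strings as UTF-8 is an assumption. Unlike the function above, this version handles every field the same way, so None becomes an empty string and all values are stripped.

```python
# Field order copied from the suggested fix above.
FIELD_ORDER = (
    "user", "anomalies_string",
    "first_time", "first_ip", "first_ip_dch_company", "first_country",
    "first_subdivision", "first_location", "first_asn_number",
    "first_asn_name", "first_client", "first_hostname", "first_streak",
    "second_time", "second_ip", "second_ip_dch_company", "second_country",
    "second_subdivision", "second_location", "second_asn_number",
    "second_asn_name", "second_client", "second_hostname",
    "location_miles_diff", "time_seconds_diff", "miles_per_hour",
)


def to_text(value):
    """Return value as stripped Unicode text; None becomes an empty string."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        # Assumption: byte strings are UTF-8; undecodable bytes are replaced.
        return value.decode("utf-8", errors="replace").strip()
    return str(value).strip()


def diff_dict_to_list(logon_diff_dict):
    """Convert logon_diff_dict to a list of text fields for printing."""
    return [to_text(logon_diff_dict.get(key, "")) for key in FIELD_ORDER]


# Usage with made-up values.
example_row = diff_dict_to_list({
    "user": "alice",
    "first_ip": b"192.0.2.10",
    "first_asn_name": "Telef\u00f3nica de Espa\u00f1a ",
    "second_asn_name": None,
})
print(example_row[FIELD_ORDER.index("first_asn_name")])
```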

davidpany commented 6 years ago

Thank you for identifying this issue! I've implemented a fix in Version 1.01. Please let me know if you have any further issues with the new version.

bekirk commented 6 years ago

Thank you, this fixed my issue too. I only realized today that it was a problem, and the code had already been updated since I downloaded it. I was just about to submit a bug report when I realized this was the same problem I was having.