KimiNewt / pyshark

Python wrapper for tshark, allowing python packet parsing using wireshark dissectors
MIT License

duplicate keys in JSON objects are lost #695

Open dkg opened 3 months ago

dkg commented 3 months ago

In tshark.py, _duplicate_object_hook is defined with the docstring "Make lists out of duplicate keys". However, it doesn't make lists out of duplicate keys. When a key repeats, the later value either overwrites the earlier one or is dropped:

def _duplicate_object_hook(ordered_pairs):
    """Make lists out of duplicate keys."""
    json_dict = {}
    for key, val in ordered_pairs:
        existing_val = json_dict.get(key)
        if not existing_val:
            json_dict[key] = val
        else:
            # There are duplicates without any data for some reason, if it's that - drop it
            # Otherwise, override
            if val.get("properties") != {}:
                json_dict[key] = val

    return json_dict

This was introduced in commit 862e4a6f780805b979caff9ccd20a2fdc5dc5b0b, whose commit message does not explain why it is done this way.
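
For comparison, here is a minimal sketch of a hook that does what the docstring describes. It is not taken from pyshark. The name collect_duplicates_hook and the sample input are hypothetical, and it assumes that gathering repeated values into a list, in order of appearance, is the intended behavior.

```python
import json


def collect_duplicates_hook(ordered_pairs):
    """Hypothetical alternative: gather the values of repeated keys into a list."""
    result = {}
    collected = set()  # keys whose value has already been turned into a list
    for key, val in ordered_pairs:
        if key not in result:
            # First occurrence: store the value unchanged.
            result[key] = val
        elif key in collected:
            # Third or later occurrence: extend the existing list.
            result[key].append(val)
        else:
            # Second occurrence: wrap the earlier value and the new one in a list.
            result[key] = [result[key], val]
            collected.add(key)
    return result


# Hypothetical input with a key that appears three times.
doc = '{"a": 1, "a": 2, "b": 3, "a": 4}'
parsed = json.loads(doc, object_pairs_hook=collect_duplicates_hook)
print(parsed)
```

One caveat with this approach: a consumer of the result cannot tell a key that was duplicated from a key whose single value was already a JSON array, so callers would need to know which fields may repeat.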

dkg commented 3 months ago

Note that p.3 of the JSON standard says:

The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs. These are all semantic considerations that may be defined by JSON processors or in specifications defining specific uses of JSON for data interchange.
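
To illustrate how one JSON processor fills in those semantics, the short example below uses only Python's standard json module with a made-up input string. By default json.loads keeps the last value for a repeated name, while object_pairs_hook receives every name/value pair in document order, so the duplicates remain available to a processor that wants them.

```python
import json

# Made-up document in which the name "name" appears twice.
doc = '{"name": "first", "name": "second"}'

# Default behavior: a plain dict is built, so the last value for a name wins.
default_result = json.loads(doc)

# With object_pairs_hook, the decoder hands over every pair in order;
# passing list keeps them all as (name, value) tuples.
all_pairs = json.loads(doc, object_pairs_hook=list)

print(default_result)
print(all_pairs)
```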