nccgroup / blackboxprotobuf

Blackbox Protobuf is a set of tools for working with encoded Protocol Buffers (protobuf) without the matching protobuf definition.
MIT License

Fix checking skiplist when processing fields subject to field_order. #21

Closed JadingTsunami closed 8 months ago

JadingTsunami commented 1 year ago

I found I was getting some erroneous duplicated fields when decoding and re-encoding a protobuf, even without making any changes to the content. I discovered this was because when processing fields with a field_order, the skiplist was not being used. Also, when a field fails to encode, I believe we want to add it to the skiplist. This PR addresses both those concerns.

rwinkelmaier-ncc commented 10 months ago

Hi! Thanks for submitting this. Sorry for the delayed response; I had missed the notification from GitHub.

I'm trying to refresh my memory on the reason for the field_order/skiplist (maybe I could have used more comments here). If I remember correctly, the idea was to preserve message ordering so that we could get the exact same bytes out if we decode and then re-encode. The skiplist was there so that the later _encode_message_field call doesn't add anything that was already put into the output by field_order.

I think we don't want to pass skiplist into the first _encode_message_field because the skiplist shouldn't contain anything that we're trying to encode with _encode_message_field.

Similarly, I think we don't want to add fields to the skiplist if the field hasn't been added to the output yet. That might end up with some silent errors when fields don't get encoded correctly. I don't remember off the top of my head why we catch the exception at that spot. Maybe an edge case with packed fields.

The root cause of your bug might be lurking elsewhere in the code. Would you be able to provide a base64 input example which triggers duplicate fields? Were the duplicate fields in the encoded protobuf or in JSON? If JSON, was it multiple fields with the same name/key or a single name/key with an array of values?

Additionally, did these changes fix the issue you ran into? If so, it might be an issue with how the field_order is generated, or maybe with how indexes in skiplist and/or field_order are handled.

The code for handling this portion is definitely a little clumsy and I have some ideas for improvement.

JadingTsunami commented 10 months ago

Hmm, interesting. Yes, I got the purpose of the two but it seems there are several issues with the current implementation.

  1. Arrays (i.e., repeated elements) are not handled correctly; the first element is encoded and then subsequent elements are encoded at the end of the enclosing block.
  2. Duplicates can appear for the reason described in this PR.
  3. There are other, additional causes of duplicates that are not addressed by this PR.

Yes, the PR here fixed the issue I was seeing and for the reason I gave in the initial write-up.

Perhaps the generated field_order should contain every index seen 1:1 with the original input, instead of just the first index.

rwinkelmaier-ncc commented 10 months ago

> Arrays (i.e., repeated elements) are not handled correctly; the first element is encoded and then subsequent elements are encoded at the end of the enclosing block.

> Perhaps the generated field_order should contain every index seen 1:1 with the original input, instead of just the first index.

The field_order list should contain every index from the original input. It is a list of tuples which contains both the field number and the index of a specific element. If the original parsed message is repeated, then field_order will contain multiple items that have the same field number but a different index, for each element in the original message.

For example:

output message: {'1': {'1': [1, 2, 3], '2': [1, 2, 3, 4]}}

output typedef: {'1': {'type': 'message', 'message_typedef': {'1': {'type': 'int'}, '2': {'type': 'int'}}, 'field_order': [('1', 0), ('1', 1), ('1', 2), ('2', 0), ('2', 1), ('2', 2), ('2', 3)]}}
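To make the mechanism concrete, here is a minimal sketch (plain Python, not the library's actual code) of how a field_order list of (field_number, index) tuples can drive re-encoding, with a skiplist recording which pairs were already written so a catch-all second pass doesn't duplicate them. The encode_one callable is a hypothetical stand-in for the real per-field encoder:

```python
def encode_in_order(message, field_order, encode_one):
    """Sketch: emit fields in field_order, tracking what was written in a
    skiplist so the catch-all second pass doesn't encode them again."""
    output = []
    skiplist = set()
    for field, index in field_order:
        values = message.get(field)
        if values is None:
            continue  # field was removed from the message since decoding
        value = values[index] if isinstance(values, list) else values
        output.append(encode_one(field, value))
        skiplist.add((field, index))
    # Catch-all pass: encode anything field_order didn't cover.
    for field, values in message.items():
        if not isinstance(values, list):
            values = [values]
        for index, value in enumerate(values):
            if (field, index) not in skiplist:
                output.append(encode_one(field, value))
    return b"".join(output)

# Toy "encoder" so the resulting order is visible in the output.
toy = lambda field, value: f"{field}={value};".encode()
msg = {"1": [1, 2, 3], "2": [1, 2, 3, 4]}
order = [("1", 0), ("1", 1), ("1", 2), ("2", 0), ("2", 1), ("2", 2), ("2", 3)]
print(encode_in_order(msg, order, toy))  # → b"1=1;1=2;1=3;2=1;2=2;2=3;2=4;"
```

If field_order only named the first index of each field, the remaining elements would fall through to the catch-all pass and come out at the end, which matches the misordering described above.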

The code which generates the field_order list is at:

https://github.com/nccgroup/blackboxprotobuf/blob/c5fdc580735cead4b17dbf13a4481360b14247e6/lib/blackboxprotobuf/lib/types/length_delim.py#L427C81-L427C81

Do you have an example you can provide? I've been trying some different cases with repeated and packed types in different orders, but haven't been able to reproduce any duplicate fields.

I want to clean up the feature either way, but would like to add your case as a test to make sure the cleanup addresses the issue.

If you don't have a test case handy, could you provide the following:

  1. Are you using blackbox protobuf as a python library (if so, python2 or python3?) or through the Burp extension?
  2. Where are you seeing the duplicated fields, are they appearing in the decode_message output, JSON output, or are you inspecting the protobuf binary itself?
  3. Are the duplicate fields presenting as extra entries in the dictionary with the same field number key or is it extra elements being added to the array?
  4. What is the type of the fields you are seeing duplicated?
JadingTsunami commented 10 months ago

Hmm, I definitely don't see that behavior. But I'm dealing with a heavily-nested protobuf. It's part of a video game data file which I've been using this library to edit. From what I can tell, the game uses the Google protobuf library to decode, so I expect the protobuf is well-formed. All I'm doing is decoding and re-encoding, and in the resulting binary, there are many duplicated and out-of-order entries. The duplication is traceable to the code I patched here. The out-of-order issues are traceable to the array handling. If, by some chance, you have the game Streets of Rage 4, this is a file called the "bigfile" in the game's data. :)

To the specific questions:

  1. Used as a python3 library
  2. I'm looking in the resulting binary. I just use the decode/encode directly as a test case.
  3. I believe the duped fields are added during the encode phase and are not present in the dictionary itself.
  4. Always Messages, though I admit I am not 100% certain that it's exclusive to that.

Hope that helps. I don't know how easily I'll be able to come up with a shortened test case as the protobufs I'm dealing with are quite expansive.

rwinkelmaier-ncc commented 10 months ago

I just pushed a set of changes to how field_order is used/generated which should be a bit more robust and eliminate any risk of duplicate encodings. It's in a branch right now at https://github.com/nccgroup/blackboxprotobuf/tree/fix_field_order. It passes the python3 tests I have, but I need to get the python2/jython tests working again.

I also added a config flag to skip field ordering; it will just encode everything with the same field number together, but possibly in an arbitrary order. The risk is that if your protobuf message has a string or byte array that just happens to be valid protobuf, but isn't meant to be protobuf, it might end up getting scrambled when it gets re-encoded.
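To illustrate that risk concretely (a hand-rolled sketch, not the library's code): any byte string whose contents happen to form a valid tag/length sequence also parses as a protobuf message, so a decoder can't tell the difference:

```python
# A bytes value that is NOT meant to be protobuf but still parses as one:
# 0x0A 0x03 'a' 'b' 'c' reads as field 1, wire type 2, 3-byte payload b"abc".
blob = b"\x0a\x03abc"

tag = blob[0]
field_number = tag >> 3        # 1
wire_type = tag & 7            # 2 (length-delimited)
length = blob[1]               # 3
payload = blob[2:2 + length]   # b"abc"
print(field_number, wire_type, payload)  # → 1 2 b'abc'
```

A tool that decodes such a blob as a message and then re-encodes its "fields" in a different order produces different bytes for what was never a message at all, which is why preserving the original ordering matters.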

I suspect your issue might lie elsewhere, though. I reviewed the skiplist handling a bit closer, and when a selected_index was passed in, the skiplist was ignored. So I think passing the skiplist to the first encoding function wouldn't have changed any behavior. I think the second change, adding to the skiplist in the case of an exception, could suppress errors during encoding and would lose whole fields present in the dictionary.

Hopefully, if this doesn't fix this issue, it might give a better idea of where the root cause is.

JadingTsunami commented 10 months ago

> I just pushed a set of changes to how field_order is used/generated which should be a bit more robust and eliminate any risk of duplicate encodings. It's in a branch right now at https://github.com/nccgroup/blackboxprotobuf/tree/fix_field_order. It passes the python3 tests I have, but I need to get the python2/jython tests working again.

Thanks, I'll check this out soon.

> I suspect your issue might lie elsewhere, though. I reviewed the skiplist handling a bit closer, and when a selected_index was passed in, the skiplist was ignored. So I think passing the skiplist to the first encoding function wouldn't have changed any behavior.

Right, I had a separate change to address this by handling arrays all-at-once, but didn't submit that as a PR because it's not a desirable change in general. I think the new approach you have here is probably better anyway.

> I think the second change of adding to the skiplist in the case of an exception could suppress errors during encoding and would lose whole fields present in the dictionary.

Yeah, the new approach in the branch you have seems much cleaner to me.

Thanks for looking into this.

JadingTsunami commented 10 months ago

Hmm, with the proposed branch I'm unable to export any protobufs anymore without encountering KeyErrors, even without making any modifications. I'll have to debug it a bit more, but this used to work before.

rwinkelmaier-ncc commented 10 months ago

Let me know if you find where that's being raised from. KeyErrors have been a headache to pin down in the past.

I see one case in the new code that could raise a KeyError if a field is removed, but I don't think you should hit that case if the message isn't modified. That should be an easy fix, though: catch KeyError there as well as IndexError. I'll push that change onto the branch.

https://github.com/nccgroup/blackboxprotobuf/blob/fix_field_order/lib/blackboxprotobuf/lib/types/length_delim.py#L143C42-L143C42

I also made some tweaks to how the field_number was generated in _encode_message_field. I don't think that's used in an unchecked way that would trigger a KeyError, but it's possible I missed somewhere.

JadingTsunami commented 10 months ago

Looks like the KeyError bug occurs when a repeated protobuf has optional fields.

Here is a minimal hex example:

AA07061A 040A020A 00AA0736 0A340A32 0A307878 78787878 78787878 78787878 78787878 78787878 78787878 78787878 78787878 78787878 78787878 78787878 7878
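A hand-written walk of the top-level tags (a sketch, not the library's parser) shows the shape described above: repeated field 117 appears twice, the first element a small nested message and the second wrapping a long string several levels deep. The bytes below are reconstructed from that description to be self-consistent, so treat the exact layout as an approximation of the dump rather than a byte-for-byte copy:

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    value = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            return value, pos

def top_level_fields(buf):
    """List (field_number, wire_type, payload_length) for each top-level tag."""
    pos, fields = 0, []
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field, wire_type = tag >> 3, tag & 7
        if wire_type == 2:  # length-delimited: record and skip over the payload
            length, pos = read_varint(buf, pos)
            fields.append((field, wire_type, length))
            pos += length
        else:
            raise NotImplementedError("only wire type 2 is needed here")
    return fields

# Reconstruction of the example: repeated field 117 (tag bytes AA 07), first
# element a small nested message, second element nesting a 48-byte string
# three levels deep.
first = bytes.fromhex("aa07061a040a020a00")
inner = b"\x0a\x30" + b"x" * 48          # field 1, 48-byte string
inner = b"\x0a\x32" + inner              # wrapped in field 1 again
inner = b"\x0a\x34" + inner              # and once more
second = b"\xaa\x07\x36" + inner
print(top_level_fields(first + second))  # → [(117, 2, 6), (117, 2, 54)]
```

The two elements of field 117 have different inner shapes, which is exactly the situation where a single field_order stored on the shared typedef can't describe both elements at once.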

rwinkelmaier-ncc commented 10 months ago

Are you still getting the KeyError with the latest update on the branch?

I'm not getting the KeyError, but I am getting the warning that I placed in there. I'll probably have to drop the warning or reduce it to a debug message though, because it's probably going to be hit often.

This is a good edge case I forgot about, where field re-ordering fails because it is attached to the typedef instead of to the output dictionary itself. It tries to apply the same field ordering to every repeated message instead of each message having its own ordering. But given that field reordering is only for specific edge cases, and this would affect only a small subset of those, I didn't want to pollute the main message field with field ordering for every message.

JadingTsunami commented 10 months ago

> Are you still getting the KeyError with the latest update on the branch?

Nope, wanted to change only one thing at a time. On the latest branch, I don't see this error anymore. I do get the warning you mentioned though.

Thanks, I think this branch resolves the issues.

rwinkelmaier-ncc commented 10 months ago

Awesome! Thanks for sticking with me through the debugging process. I'll try to merge the changes into the main branch soon and put a new version on PyPI, but I want to put it through the python2 tests before then. Everything is dropping python2 support, so my testing suite has fallen apart.

I'll throw a comment on here when that's done.

rwinkelmaier-ncc commented 8 months ago

Sorry for the delay, but I got the python2 tests fixed, along with a few other bugs that popped up while updating some of the dependencies.

Changes are now in the main branch and published on pypi: https://pypi.org/project/bbpb/