dsjava: fix json encoding of nested JavaMap

DenizenB commented 3 months ago

Hello there, many thanks for maintaining this awesome project.

I stumbled upon the following exception in the dsjava unit while trying to deserialize some data:

TypeError: keys must be str, int, float, bool or None, not JavaString

Turns out the python-javaobj package deserializes java.util.HashMap objects into dictionaries where the keys can be of type JavaString. This normally isn't noticeable for flat dicts thanks to this piece of code, but we run into issues with nested dicts.

The issue can be reproduced with the attached sample (listeners.zip) and the following pipeline:

$ r.ef listeners.zip | r.xt | r.dsjava -vv

This PR overrides encode and adds a preprocess method that recursively traverses dicts and converts any JavaString key into a plain str.

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 87.25%. Comparing base (9bd4661) to head (2ba226e). Report is 5 commits behind head on master.

:exclamation: Current head 2ba226e differs from pull request most recent head 7628fa1

Please upload reports for the commit 7628fa1 to get more accurate results.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## master #50 +/- ## ========================================== - Coverage 87.26% 87.25% -0.02% ========================================== Files 342 342 Lines 30568 30568 ========================================== - Hits 26675 26671 -4 - Misses 3893 3897 +4 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

huettenhain commented 3 months ago

Hey! Thanks a lot for this. Could you rebase your PR against f2d6eb74aa713d3fe0b38915bb748e850d69dca4? I added a test for your sample.

DenizenB commented 3 months ago

Ah, cool! I've gone ahead and rebased the PR on top of that commit.

huettenhain commented 3 months ago

Awesome. Now the tests should flicker back to green and then I'll merge this. I'll also push out a bugfix release after this so the fix will be in the latest version.

huettenhain commented 3 months ago

Alright that's on me. I have been doing something wrong in my GitHub actions config for a while now, it looks like it always checks out my HEAD instead of the PR source repo. I tried to fix this in 6bbb68519f91332e946ca8418cfddd5ae5c1cc9c, would you mind rebasing to that just one more time just to help me figure this out? I want to get the darn CICD working. 😭 I will merge this PR regardless, I know your fix works, I would just really like it if the automated testing would ... work.

DenizenB commented 3 months ago

No worries at all, just rebased onto the latest commit. I have no experience with GitHub Actions so can't offer any help, but I'm happy to rebase until it works.

huettenhain commented 3 months ago

I screwed up the syntax. If you wouldn't mind, could you rebase one last time against 06774829579d9e1d9392e6809cb80912664a2ecf? I swear I will just merge it after that regardless of whether that fixes the CICD or not xD

DenizenB commented 3 months ago

Another rebase is on its way, fingers crossed!

huettenhain commented 3 months ago

If you really don't mind, could you rebase once more? I added some debug logging since I don't understand why it won't check out your code.

DenizenB commented 3 months ago

Hmm so it looks like it ~triggers the event pull_request_target~?

I started reading a bit here, and it sounds like running tests on an external PR branch might be dangerous if the workflow contains secrets. So I guess it could expose the MALSHARE_API variable if someone submitted a malicious PR? Not sure what the best approach is though.

Edit: Perhaps the trigger should be changed from pull_request_target to pull_request? https://securitylab.github.com/research/github-actions-preventing-pwn-requests/

Edit 2: Although that also means that no secrets can be used

Due to the dangers inherent to automatic processing of PRs, GitHub’s standard pull_request workflow trigger by default prevents write permissions and secrets access to the target repository.

huettenhain commented 3 months ago

Alright, last time, this one should do it.

huettenhain commented 3 months ago

And yes, my solution to exactly that issue was pull_request_target. It's working now!

DenizenB commented 3 months ago

Sweet! I worry that this created a vulnerability though based on this article: https://securitylab.github.com/research/github-actions-preventing-pwn-requests/

TL;DR: Combining pull_request_target workflow trigger with an explicit checkout of an untrusted PR is a dangerous practice that may lead to repository compromise.

huettenhain commented 3 months ago

You're right, thanks for the link. My test config has been through what feels like countless iterations now and it seems annoyingly difficult to get this right.

huettenhain commented 3 months ago

Alright, deploying binary-refinery 0.6.41, that version will include this fix. Cheers!

binref / refinery

dsjava: fix json encoding of nested JavaMap #50

Codecov Report