great-expectations / great_expectations

Always know what to expect from your data.
https://docs.greatexpectations.io/
Apache License 2.0
9.92k stars 1.54k forks

Malicious vulnerability with GuardDog #7802

Closed lucasmengual92 closed 1 month ago

lucasmengual92 commented 1 year ago

**Describe the bug**
Over the past few days I have been running malicious-code checks on open-source and third-party packages using the GuardDog Python library from DataDog, and 2 issues came up when scanning the Great Expectations package. I really love the GE package, but I also have to comply with cybersecurity and malicious-code check rules, so it would be great if these issues were fixed. Other individuals and organizations will most likely run into this as well and feel less confident in the GE package.

  1. First, in `default_expectation_configuration_builder.py`, the line `return eval("".join([str(t) for t in binary_list]))` is flagged with this message: "This package contains a call to the eval function with a base64 encoded string as argument. This is a common method used to hide a malicious payload in a module as static analysis will not decode the string."

  2. The other is a repository integrity mismatch, which alerts that some files in the package (great_expectations version 0.16.8) differ from the ones on GitHub for the same version of the package. The file in question is great_expectations/_version.py.
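To illustrate the first finding: the flagged line joins a list of tokens into a string and passes it to `eval`. The quoted line itself shows no base64 decoding, so the rule may be matching on the `eval("".join(...))` shape alone, but the use of `eval` is still what draws the scanner's attention. The following is a minimal sketch of an `eval`-free alternative, assuming (this is not confirmed by the issue) that `binary_list` holds only booleans, parentheses, and the operators `and`/`or`/`not` or `&`/`|`. The helper names `safe_eval_bool` and `_eval_node` are hypothetical and are not part of Great Expectations.

```python
import ast


def _eval_node(node: ast.AST) -> bool:
    """Recursively evaluate a whitelisted subset of Python's expression syntax."""
    # Literal True / False
    if isinstance(node, ast.Constant) and isinstance(node.value, bool):
        return node.value
    # `a and b and ...` / `a or b or ...`
    if isinstance(node, ast.BoolOp):
        values = [_eval_node(value) for value in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    # `not a`
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _eval_node(node.operand)
    # `a & b` / `a | b`, treated as logical and / or on booleans
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.BitAnd, ast.BitOr)):
        left, right = _eval_node(node.left), _eval_node(node.right)
        return (left and right) if isinstance(node.op, ast.BitAnd) else (left or right)
    # Anything else (names, calls, attribute access, ...) is rejected.
    raise ValueError(f"Unsupported syntax: {ast.dump(node)}")


def safe_eval_bool(expression: str) -> bool:
    """Evaluate a boolean expression without calling eval()."""
    tree = ast.parse(expression.strip(), mode="eval")
    return _eval_node(tree.body)


# Assumed token layout, for illustration only.
binary_list = [True, " and ", "(", False, " or ", True, ")"]
expression = "".join(str(t) for t in binary_list)
result = safe_eval_bool(expression)
print(expression, "->", result)
```

Because only a fixed set of node types is accepted, an input such as `__import__('os')` raises `ValueError` instead of being executed.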

**To Reproduce**
Steps to reproduce the behavior:

  1. Install guarddog (`pip install guarddog`) in a Python environment of version 3.10 or higher, and run the following code:

    from guarddog import PypiPackageScanner

    scanner = PypiPackageScanner()
    report = scanner.scan_remote("great_expectations", "0.16.8")

    print(report)

  2. The printed output should be a dictionary reporting 2 issues, as mentioned above. Here is what the returned dictionary looks like:

    {
        'issues': 2,
        'errors': {},
        'results': {
            'repository_integrity_mismatch': 'Some files present in the package are different from the ones on GitHub for the same version of the package: \n* great_expectations/_version.py',
            'shady-links': {},
            'npm-exec-base64': {},
            'exec-base64': [
                {
                    'location': 'great_expectations-0.16.8/great_expectations/rule_based_profiler/expectation_configuration_builder/default_expectation_configuration_builder.py:324',
                    'code': ' return eval("".join([str(t) for t in binary_list]))',
                    'message': 'This package contains a call to the eval function with a base64 encoded string as argument.\nThis is a common method used to hide a malicious payload in a module as static analysis will not decode the\nstring.\n'
                }
            ],
            'exfiltrate-sensitive-data': {},
            'npm-serialize-environment': {},
            'silent-process-execution': {},
            'cmd-overwrite': {},
            'download-executable': {},
            'obfuscation': {},
            'npm-silent-process-execution': {},
            'code-execution': {},
            'npm-install-script': {},
            'steganography': {}
        },
        'path': '/tmp/tmp__2mob1r/great_expectations'
    }


  3. See the errors; these 2 issues should probably be fixed.
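The repository integrity mismatch can also be checked by hand. The sketch below compares two local directory trees by SHA-256 digest, assuming you have already unpacked the PyPI archive and checked out the matching Git tag yourself. The functions `file_digests` and `differing_files` are hypothetical helpers, not GuardDog's implementation, and the demo files are stand-ins created in a temporary directory. One benign cause of such a mismatch is a version file that is generated at build time; whether that explains `_version.py` here is an assumption.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differing_files(package_dir: Path, repo_dir: Path) -> list[str]:
    """Return files that exist in both trees but have different contents."""
    package, repo = file_digests(package_dir), file_digests(repo_dir)
    shared = package.keys() & repo.keys()
    return sorted(name for name in shared if package[name] != repo[name])


# Demo with stand-in contents: only _version.py differs between the two trees.
with tempfile.TemporaryDirectory() as tmp:
    package_dir, repo_dir = Path(tmp, "package"), Path(tmp, "repo")
    for root, version_body in (
        (package_dir, 'version = "0.16.8"\n'),
        (repo_dir, "# version resolved at build time\n"),
    ):
        module_dir = root / "great_expectations"
        module_dir.mkdir(parents=True)
        (module_dir / "__init__.py").write_text("")
        (module_dir / "_version.py").write_text(version_body)
    mismatches = differing_files(package_dir, repo_dir)

print(mismatches)
```

Diffing the flagged file between the two trees then shows whether the difference is limited to version metadata or includes unexpected code.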

**Expected behavior**
After running `scanner.scan_remote("great_expectations", "0.16.8")` (or, more precisely, the same call against a newer version in which this is fixed), the returned dictionary should report 0 issues.
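As a rough way to check this expectation programmatically, the sketch below filters a report dictionary down to the rules that produced findings. It assumes only the dictionary layout shown in the output above (`issues`, `errors`, `results`); `flagged_rules` is a hypothetical helper, and the sample report is abbreviated for illustration.

```python
def flagged_rules(report: dict) -> dict:
    """Return only the entries of report['results'] that contain findings."""
    return {
        rule: finding
        for rule, finding in report.get("results", {}).items()
        if finding  # empty dicts/lists/strings mean the rule found nothing
    }


# Abbreviated sample in the same shape as the scanner output quoted above.
report = {
    "issues": 2,
    "errors": {},
    "results": {
        "repository_integrity_mismatch": "great_expectations/_version.py differs",
        "shady-links": {},
        "exec-base64": [{"location": "default_expectation_configuration_builder.py:324"}],
        "obfuscation": {},
    },
}

flagged = flagged_rules(report)
for rule in sorted(flagged):
    print(rule)

# The expected behavior described above corresponds to an empty result.
is_clean = report["issues"] == 0 and not flagged
```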

**Environment (please complete the following information):**
 - Operating System: Linux
 - Great Expectations Version: 0.16.8

**Additional context**
N/A

austiezr commented 1 year ago

Hey @lucasmengual92! Thanks for raising this. We're discussing internally and will be in touch. 🙇

lucasmengual92 commented 1 year ago

Hi @austiezr, alright great! Thanks for catching up on this issue quickly, and hopefully it can be fixed.

lucasmengual92 commented 1 year ago

Hi @austiezr, is there any update on this issue? Regards!

molliemarie commented 1 month ago

Hello @lucasmengual92. With the upcoming launch of Great Expectations Core (GX 1.0), we are closing old issues posted regarding previous versions. Moving forward, we will focus our resources on supporting and improving GX Core (version 1.0 and beyond). If you find that an issue you previously reported still exists in GX Core, we encourage you to resubmit it against the new version. With more resources dedicated to community support, we aim to tackle new issues swiftly. For specific details on what is GX-supported vs community-supported, you can reference our integration and support policy.

To get started on your transition to GX Core, check out the GX Core quickstart (click “Full example code” tab to see a code example).

You can also join our upcoming community meeting on August 28th at 9am PT (noon ET / 4pm UTC) for a comprehensive rundown of everything GX Core, plus Q&A as time permits. Go to https://greatexpectations.io/meetup and click “follow calendar” to follow the GX community calendar.

Thank you for being part of the GX community and thank you for submitting this issue. We're excited about this new chapter and look forward to your feedback on GX Core. 🤗