Closed nrathaus closed 4 months ago
The problem lies here:
def detect_exposure(data: str) -> dict:
    # Dictionary to store detected data exposures
    detected_exposures = {}
    for pattern_name, pattern in sensitive_data_regex_patterns.items():
        matches = findall(pattern, data)
        if matches:
            detected_exposures[pattern_name] = matches
    return detected_exposures
If detect_exposure is sent:
{"codeclimate":"CODECLIMATE_REPO_TOKEN=62864c476ade6ab9d10d0ce0901ae2c211924852a28c5f960ae5165c1fdfec73","facebook":"EAACEdEose0cBAHyDF5HI5o2auPWv3lPP3zNYuWWpjMrSaIhtSvX73lsLOcas5k8GhC5HgOXnbF3rXRTczOpsbNb54CQL8LcQEMhZAWAJzI0AzmL23hZByFAia5avB6Q4Xv4u2QVoAdH0mcJhYTFRpyJKIAyDKUEBzz0GgZDZD","google_b64":"QUl6YhT6QXlEQnbTr2dSdEI1W7yL2mFCX3c4PPP5NlpkWE65NkZV","google_oauth":"188968487735-c7hh7k87juef6vv84697sinju2bet7gn.apps.googleusercontent.com","google_oauth_token":"ya29.a0TgU6SMDItdQQ9J7j3FVgJuByTTevl0FThTEkBs4pA4-9tFREyf2cfcL-_JU6Trg1O0NWwQKie4uGTrs35kmKlxohWgcAl8cg9DTxRx-UXFS-S1VYPLVtQLGYyNTfGp054Ad3ej73-FIHz3RZY43lcKSorbZEY4BI","heroku":"herokudev.staging.endosome.975138 pid=48751 request_id=0e9a8698-a4d2-4925-a1a5-113234af5f60","hockey_app":"HockeySDK: 203d3af93f4a218bfb528de08ae5d30ff65e1cf","outlook":"https://outlook.office.com/webhook/7dd49fc6-1975-443d-806c-08ebe8f81146@a532313f-11ec-43a2-9a7a-d2e27f4f3478/IncomingWebhook/8436f62b50ab41b3b93ba1c0a50a0b88/eff4cd58-1bb8-4899-94de-795f656b4a18","paypal":"access_token$production$x0lb4r69dvmmnufd$3ea7cb281754b7da7dac131ef5783321","slack":"xoxo-175588824543-175748345725-176608801663-826315f84e553d482bb7e73e8322sdf3"}
matches contains:
[["188968487735", "", "", "", "", "", "188968487735", "188968487735", "", ""], ["175588824543", "", "", "", "", "", "175588824543", "175588824543", "", ""], ["175748345725", "", "", "", "", "", "175748345725", "175748345725", "", ""], ["176608801663", "", "", "", "", "", "176608801663", "176608801663", "", ""]]
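The empty strings are consistent with how `re.findall` behaves when a pattern has more than one capturing group: it returns one tuple per match, and any group that did not take part in that match shows up as `''`. A minimal sketch of that behaviour, using a hypothetical two-group pattern and sample string (neither is taken from the project's `sensitive_data_regex_patterns`):

```python
import re

# Hypothetical pattern with two capturing groups joined by alternation.
# It is NOT one of the project's actual patterns.
pattern = r"(\d{12})|(xox[a-z]-\w+)"

sample = "id 188968487735 and token xoxo-abc"

# With several groups, findall returns a tuple per match; groups that did
# not participate in a given match are reported as empty strings.
matches = re.findall(pattern, sample)
print(matches)
```

Each tuple has one slot per capturing group, so a pattern with ten groups would produce ten-element tuples like the ones shown above.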
There are two options:
[tuple(filter(None, item)) for item in matches]
Complete code:
def detect_exposure(data: str) -> dict:
    # Dictionary to store detected data exposures
    detected_exposures = {}
    for pattern_name, pattern in sensitive_data_regex_patterns.items():
        matches = findall(pattern, data)
        if matches:
            if isinstance(matches, list) and isinstance(matches[0], tuple):
                matches = [tuple(filter(None, item)) for item in matches]
            detected_exposures[pattern_name] = matches
    return detected_exposures
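To see what this filter does in isolation, here is a small sketch that applies it to two of the tuples reported above. The variable names are only for illustration:

```python
# Two of the tuples reported in this issue.
matches = [
    ("188968487735", "", "", "", "", "", "188968487735", "188968487735", "", ""),
    ("175588824543", "", "", "", "", "", "175588824543", "175588824543", "", ""),
]

# filter(None, item) drops every falsy element, which here means the
# empty strings; the non-empty values are kept in their original order.
cleaned = [tuple(filter(None, item)) for item in matches]
print(cleaned)
```

The empty strings are gone, but each tuple still holds the same value three times, because several groups captured the same text.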
Please note my PR contains this fix if you want to merge it
I've added a suggestion. Can you commit it to your PR?
I don't see any suggestion in the PR.
That seems weird.
Here's the requested code change for the corresponding PR:
if matches:
    if isinstance(matches, list) and isinstance(matches[0], tuple):
        matches = [tuple(filter(None, item)) for item in matches]
to
if matches:
    if isinstance(matches, list) and isinstance(matches[0], tuple):
        matches = set.union(*[set(match_tuple) for match_tuple in matches])
        matches.discard('')
        matches = list(matches)
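As a rough illustration of the set-based variant, the sketch below wraps it in a standalone helper and runs it on the tuples from this issue. The helper name `flatten_matches` is hypothetical and does not exist in the project:

```python
def flatten_matches(matches):
    """Collapse findall tuples into a flat list of unique, non-empty strings.

    Hypothetical helper for illustration; the PR applies this logic inline.
    """
    if matches and isinstance(matches, list) and isinstance(matches[0], tuple):
        # Merge all tuples into one set, which removes duplicates.
        unique = set.union(*[set(match_tuple) for match_tuple in matches])
        # Drop the placeholder left by groups that did not match.
        unique.discard("")
        return list(unique)
    # Patterns with zero or one group already yield a flat list of strings.
    return matches


reported = [
    ("188968487735", "", "", "", "", "", "188968487735", "188968487735", "", ""),
    ("175588824543", "", "", "", "", "", "175588824543", "175588824543", "", ""),
    ("175748345725", "", "", "", "", "", "175748345725", "175748345725", "", ""),
    ("176608801663", "", "", "", "", "", "176608801663", "176608801663", "", ""),
]

# Sets are unordered, so sort the result to get a stable order for display.
result = sorted(flatten_matches(reported))
print(result)
```

Unlike the `filter(None, ...)` version, this also removes repeated values, at the cost of losing the original match order and the grouping per match.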
I have committed the necessary changes and merged PR #104.
Testing of https://brokencrystals.com/api/secrets returns empty fields in the array of leaked information: