1712n / challenge

Challenge Program
Flag GitHub issues #63

Closed by evgenydmitriev 1 year ago

evgenydmitriev commented 3 years ago

We often come across situations where people unintentionally share sensitive information in the GitHub issue tracker, such as names of clients or details of active investigations. Provide links to 2-3 tools you would use to automatically flag comments with sensitive content in our GitHub issue tracker.

Additional Resources

Please email challenge-submission@blockshop.org with your solution, and don't forget to include a link to this issue and attach your resume. Don't hesitate to ask us questions by commenting in this issue or emailing us at challenge-program@blockshop.org


Successful submissions

🎉 @ingakaspar successfully solved the challenge and was hired by Inca Digital.


The challenge is still open. We are removing comments with correct answers to allow others to participate, so it is safe to assume that the answers listed below are incorrect.

SergyKo commented 2 years ago

You can store the confidential names as key => value pairs, keep only the keys in the repository, transfer the mapping separately, and restore the full data with a script or regular expressions.
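A minimal sketch of the key => value idea above, assuming the mapping is a plain dictionary that is kept outside the repository. The function names `redact` and `restore`, the placeholder keys, and the client names are all hypothetical and only serve the illustration.

```python
import re

# Hypothetical mapping, stored separately from the repository:
# placeholder key => confidential name.
mapping = {"CLIENT_001": "Acme Corp", "CLIENT_002": "Globex"}


def redact(text, mapping):
    """Replace each confidential name with its placeholder key."""
    for key, name in mapping.items():
        # re.escape keeps characters such as "." in a name literal.
        text = re.sub(re.escape(name), key, text, flags=re.IGNORECASE)
    return text


def restore(text, mapping):
    """Put the confidential names back, given the separately stored mapping."""
    for key, name in mapping.items():
        text = text.replace(key, name)
    return text


original = "Meeting with Acme Corp about Globex."
public = redact(original, mapping)      # safe to post in the tracker
private = restore(public, mapping)      # only possible with the mapping
```

Note that this substitutes names that are already known; it does not by itself flag comments that mention a name missing from the mapping.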

elviranasirova commented 2 years ago

git filter-repo tool and the BFG Repo-Cleaner

AliceFender commented 2 years ago

There are several tools for scanning GitHub repositories for committed secrets, namely GitSecrets, Trufflehog, GitHound, and many others. The two below also provide alerts:

GitGuardian https://www.gitguardian.com/
Secret Scanning https://github.com/KainosSoftwareLtd/secret-scanning

anastasiasafargalieva commented 2 years ago

Credential Digger https://github.com/SAP/credential-digger
Gittyleaks
Repo Supervisor

olga-sokolovskaya commented 2 years ago

SpectralOps https://spectralops.io
Whispers https://github.com/Skyscanner/whispers
GitGuardian https://www.gitguardian.com

maximkondrushin commented 2 years ago

Hi, is there a chance that some of the links in the comments, which we are told to assume are incorrect, are actually correct?

evgenydmitriev commented 2 years ago

Nothing mentioned above addresses the stated problem.

annamalakhova777 commented 2 years ago

GitRob Truffle Hog git-secrets

NataliaKramar commented 2 years ago

GitRob TruffleHog

mike-ralenko commented 2 years ago

Hi. I guess these tools should be able to automatically detect what is sensitive content and what is not. But the issues are always different, so the tool should either contain information about all possibly sensitive content in the world or be instructed every time a new issue is opened. The first case seems to be the future of artificial intelligence (not sure if the near future); the second makes the method manual rather than automatic, contrary to the challenge conditions. So there are no tools that solve the task, and accordingly no relevant links can be provided.
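To make the second case above concrete, here is a rough sketch of a flagger that has to be "instructed" with a list of sensitive terms. The watchlist, the sample comments, and the names `build_pattern` and `flag_comments` are hypothetical; how the comment bodies are fetched from the tracker is deliberately left out.

```python
import re


def build_pattern(watchlist):
    """Compile one case-insensitive pattern that matches any watchlist term."""
    # Longer terms first, so a longer term wins over a shorter prefix of it.
    terms = sorted(watchlist, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def flag_comments(comments, watchlist):
    """Return (index, matched terms) for every comment that hits the watchlist."""
    if not watchlist:
        return []
    pattern = build_pattern(watchlist)
    flagged = []
    for index, body in enumerate(comments):
        hits = sorted({match.group(0).lower() for match in pattern.finditer(body)})
        if hits:
            flagged.append((index, hits))
    return flagged


# Hypothetical data for illustration only.
watchlist = ["Acme Corp", "Operation Falcon"]
comments = [
    "The deploy is done.",
    "Acme Corp asked for an update on Operation Falcon.",
]
result = flag_comments(comments, watchlist)
```

The sketch only finds terms it was told about, which is exactly the limitation the comment points out: someone still has to maintain the watchlist.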

Prokopajte commented 2 years ago

Hi there! I would use Microsoft Azure Information Protection and Microsoft PowerShell tools. Thank you for this interesting challenge!

Chikrovi commented 2 years ago

Hello, I would use GitMonitor or GitGraber as they offer real-time monitoring to find sensitive data. Regards

AktRus commented 2 years ago

GitMonitor Git-secrets

XeniyaO commented 2 years ago

SEDATED: https://github.com/OWASP/SEDATED
Confidential information in images: https://github.com/DhilipSanjay/Detection-of-Sensitive-Data-Exposure-in-Images
GitMAD: https://github.com/deepdivesec/GitMAD
DataDefender: https://github.com/armenak/DataDefender
SMB-Data-Discovery: https://github.com/gh0x0st/SMB-Data-Discovery

JaneBrains commented 2 years ago

ccsrch (PANs): https://sourceforge.net/projects/ccsrch/
passhunt: https://pythonrepo.com/repo/Viralmaniar-Passhunt-python-administrative-interfaces

ana-k-2020 commented 2 years ago

gitleaks; maybe also gitmonitor and git-secrets

ZahirNikmal commented 2 years ago

https://github.com/SAP/credential-digger.git https://github.com/awslabs/git-secrets.git

SwordForShinobi commented 2 years ago

I would do it like this (using regex):

import re

# Regex for first and last names (both Latin and Cyrillic)
reg_expr = r'[A-Z]\w+\s+[A-Z]\w+|[А-Я]\w+\s+[А-Я]\w+'
text = 'Report filed by John Smith'  # sample input
# Here we simply remove everything matching our regex from the text
cleaned = re.sub(reg_expr, '', text)

If you wanna flag them:

def first_and_last_names_filter(text):
    # Return every substring that looks like a first and last name
    reg_expr = r'[A-Z]\w+\s+[A-Z]\w+|[А-Я]\w+\s+[А-Я]\w+'
    nameregex = re.compile(reg_expr)
    names = nameregex.findall(text)
    return names

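Additionally, you can append '|#\d+|\d+%' to reg_expr: this also matches (and, with re.sub, deletes from the text) digits that start with # (numbers) and digits that end with % (percentages). Note that a character class such as '[#\d%\d]' would instead match any single #, digit, or % character. It's just an example.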

For complex cases of flagging sensitive data, I would also suggest using regex, since answers with GitHub repos are considered incorrect =)
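A rough sketch of how the patterns from this comment could be combined into a single flagging helper. The labels, the `PATTERNS` table, and the `flag` function are hypothetical names; the name pattern is a slightly tightened variant (`[a-z]+` instead of `\w+`) and still flags any two adjacent capitalized words, so false positives such as place names are to be expected.

```python
import re

# Hypothetical label => pattern table; extend it with whatever needs flagging.
PATTERNS = {
    # Two adjacent capitalized words, Latin or Cyrillic.
    "possible_name": re.compile(
        r"\b[A-Z][a-z]+\s+[A-Z][a-z]+\b|\b[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+\b"
    ),
    # Digits that start with "#", e.g. "#42".
    "number": re.compile(r"#\d+"),
    # Digits that end with "%", e.g. "15%" or "2.5%".
    "percentage": re.compile(r"\d+(?:\.\d+)?%"),
}


def flag(text):
    """Return a dict of label => list of matches, omitting labels with no match."""
    found = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


report = flag("John Smith reported #42 with a 15% loss")
```

Returning the matches per label, instead of deleting them, keeps the original comment intact so a person can review what was flagged.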

sndwhl commented 2 years ago

https://radar.nightfall.ai https://git-scm.com/docs/git https://github.com/hashicorp/terraform

evgenydmitriev commented 2 years ago

@ingakaspar successfully solved the challenge and was hired by Inca Digital! Congrats 🎉

The challenge is still open. No need to repeat what has been posted above. None of the submissions address the problem stated in the issue description.

lunevaalex commented 2 years ago

Kaggle https://www.kaggle.com/datasets NLTK https://www.nltk.org/

bugcrowd https://www.bugcrowd.com/ gitrob https://github.com/michenriksen/gitrob

Edgrudskiy commented 2 years ago

GitMonitor https://github.com/Talkaboutcybersecurity/GitMonitor
TruffleHog https://github.com/trufflesecurity/trufflehog

AESosnovsky commented 1 year ago

Hi! I submitted the assignment and hope it will interest you.

evgenydmitriev commented 1 year ago

The challenge is still open. No need to repeat what has been posted above. None of the submissions address the problem stated in the issue description.