evgenydmitriev closed 1 year ago
You can create an array of the confidential names in the form key => value, leave only the keys in the repository, transfer the key => value bundle separately, and restore the full data with a script or regular expressions.
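A minimal sketch of that idea, in case it helps. The mapping contents, the `{{KEY}}` placeholder format, and the function names `redact` and `restore` are all made up for illustration; the mapping itself would be shared outside the repository.

```python
import re

# Hypothetical mapping: placeholder key -> confidential name.
# This is the part that is transferred separately, never committed.
SECRET_MAP = {
    "CLIENT_001": "Acme Holdings",
    "CLIENT_002": "Jane Doe",
}

def redact(text, mapping):
    """Replace each confidential value with its {{KEY}} placeholder."""
    # Longest values first, so a short name never clobbers part of a longer one.
    for key, value in sorted(mapping.items(), key=lambda kv: -len(kv[1])):
        text = text.replace(value, "{{" + key + "}}")
    return text

def restore(text, mapping):
    """Replace each {{KEY}} placeholder with its confidential value."""
    def lookup(match):
        # Unknown keys are left untouched rather than raising.
        return mapping.get(match.group(1), match.group(0))
    return re.sub(r"\{\{(\w+)\}\}", lookup, text)
```

Anyone holding the mapping can call `restore` on the text from the repository; everyone else only sees the keys.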
git filter-repo tool and the BFG Repo-Cleaner
There is some software for scanning committed secrets in GitHub repositories, namely GitSecrets, Trufflehog, GitHound and many others. The two below have alerts:
Git Guardian https://www.gitguardian.com/ Secret Scanning https://github.com/KainosSoftwareLtd/secret-scanning
Credential Digger https://github.com/SAP/credential-digger Gittyleaks Repo Supervisor
SpectralOps https://spectralops.io Whispers https://github.com/Skyscanner/whispers GitGuardian https://www.gitguardian.com
Hi, is there a chance that some of the links in the comments that are assumed to be incorrect are actually correct?
Everything mentioned above is not addressing the stated problem.
GitRob Truffle Hog git-secrets
GitRob TruffleHog
Hi. I guess these tools should be able to automatically detect what is sensitive content and what is not. But the issues are always different, so the tool should either contain information about all possibly sensitive content in the world or be instructed every time a new issue is taken. The first case seems to be the future of artificial intelligence (not sure if a near one); the second makes the method manual rather than automatic, as the challenge conditions require. So there are no tools to solve the task, and accordingly no relevant links can be provided.
Hi there! I would use Microsoft Azure Information Protection and Microsoft PowerShell tools. Thank you for this interesting challenge!
Hello, I would use GitMonitor or GitGraber as they offer real-time monitoring to find sensitive data. Regards
GitMonitor Git-secrets
SEDATED: https://github.com/OWASP/SEDATED confidential information in images: https://github.com/DhilipSanjay/Detection-of-Sensitive-Data-Exposure-in-Images GitMAD: https://github.com/deepdivesec/GitMAD DataDefender: https://github.com/armenak/DataDefender SMB-Data-Discovery: https://github.com/gh0x0st/SMB-Data-Discovery
I would do it like this (using regex):
import re

# regex for first and last names (both Latin and Cyrillic)
reg_expr = r'[A-Z]\w+\s+[A-Z]\w+|[А-Я]\w+\s+[А-Я]\w+'
# here we simply remove samples matching our regex from text
text = re.sub(reg_expr, '', text)
If you wanna flag them:
def first_and_last_names_filter(text):
    reg_expr = r'[A-Z]\w+\s+[A-Z]\w+|[А-Я]\w+\s+[А-Я]\w+'
    nameregex = re.compile(reg_expr)
    names = nameregex.findall(text)
    return names
Additionally, you can append '|#\d+|\d+%' to reg_expr: this also removes digits that start with # (numbers) and digits that end with % (percents). It's just an example.
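Putting the pieces together, here is a rough sketch of flagging a comment instead of deleting from it. The `PATTERNS` table and `flag_comment` are names invented for this example, and the `#` number and percent patterns are written as separate alternatives, which is my reading of the intent. The name regex is deliberately naive: any two capitalized words in a row (for example "Hello World") will be flagged too.

```python
import re

# Label -> compiled pattern. The name pattern is the one from the comment above.
PATTERNS = {
    "name": re.compile(r"[A-Z]\w+\s+[A-Z]\w+|[А-Я]\w+\s+[А-Я]\w+"),
    "number": re.compile(r"#\d+"),    # digits that start with '#'
    "percent": re.compile(r"\d+%"),   # digits that end with '%'
}

def flag_comment(text):
    """Return (label, matched_text) pairs for every pattern hit in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits

if __name__ == "__main__":
    for label, value in flag_comment("John Smith owns 15% of case #42"):
        print(label, value)
```

A comment with a non-empty result could then be held for manual review rather than edited automatically.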
For complex cases of flagging sensitive data I'd use: I would also suggest using regex, since answers with GitHub repos are considered incorrect =)
@ingakaspar successfully solved the challenge and was hired by Inca Digital! Congrats 🎉
The challenge is still open. No need to repeat what has been posted above. None of the submissions address the problem stated in the issue description.
Kaggle https://www.kaggle.com/datasets NLTK https://www.nltk.org/
bugcrowd https://www.bugcrowd.com/ gitrob https://github.com/michenriksen/gitrob
Git Monitor https://github.com/Talkaboutcybersecurity/GitMonitor TruffleHog https://github.com/trufflesecurity/trufflehog
Hi! submitted the assignment, hope it'll get your interest
We often come across situations when people unintentionally share sensitive information in GitHub Issue Tracker, such as names of clients or details of active investigations. Provide links to 2-3 tools you would use to automatically flag comments with sensitive content in our GitHub issue tracker.
Additional Resources
Please email challenge-submission@blockshop.org with your solution, and don't forget to include a link to this issue and attach your resume. Don't hesitate to ask us questions by commenting in this issue or emailing us at challenge-program@blockshop.org
Successful submissions
🎉 @ingakaspar successfully solved the challenge and was hired by Inca Digital.
The challenge is still open. We are removing comments with correct answers to allow others to participate, so it is safe to assume that the answers listed below are incorrect.