Closed omerHofBGU closed 3 months ago
+1
Hi, thanks for your interest. The code dataset we used is massive, but https://raw.githubusercontent.com/meta-llama/PurpleLlama/main/CybersecurityBenchmarks/datasets/third-party.txt lists all the repos we scraped.
Thank you for your interest!
Hi,
Thank you for sharing this excellent project!
I am interested in contributing to its expansion by creating additional rules for insecure code and extending the 'instruct' and 'autocomplete' files.
Could you please share the open-source code dataset from which you extracted the insecure code using the predefined rules? I have not been able to locate it. Having it would let me verify my rules and export the matching insecure code snippets to the respective files. I understand that any open-source dataset could be used, but I believe it would be more appropriate to use the same data you used.
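To make the verification step concrete, here is a minimal sketch of checking one candidate rule against a code snippet. Everything in it is an assumption for illustration: the `Rule` shape, the `scan_text` helper, the `example-strcpy` rule, and the sample snippet are hypothetical and are not taken from the project's actual rule set or tooling.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: an identifier plus a compiled regex."""
    rule_id: str
    pattern: re.Pattern


# Illustrative rule only: flags calls to C's strcpy, a classic source of
# buffer overflows (CWE-120). Not one of the project's real rules.
STRCPY_RULE = Rule("example-strcpy", re.compile(r"\bstrcpy\s*\("))


def scan_text(rule: Rule, text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for every line the rule matches."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if rule.pattern.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Made-up sample standing in for a file from a scraped repository.
SAMPLE = """\
char buf[16];
strcpy(buf, user_input);
strncpy(buf, user_input, sizeof(buf) - 1);
"""

for lineno, line in scan_text(STRCPY_RULE, SAMPLE):
    print(f"{STRCPY_RULE.rule_id}:{lineno}: {line}")
```

Running the same helper over files from the original repositories, rather than over an arbitrary dataset, is what would show whether a new rule flags the same kind of code as the existing ones.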
Additionally, could you explain how you created the regex rules from MITRE? Specifically:
Thank you!