Closed prayas7102 closed 1 month ago
Hey there, I put together a quick script to see if I understood the workings. I'd like to know whether it could satisfy the 3 steps mentioned in the issue before I look into opening a PR, since I am not particularly familiar with NPM packaging and haven't been up to date with JS/TS recently. Cheers :)
```ts
import natural from "natural";

// prettier-ignore
// Import specific datasets with defined interfaces
import {
  dataset as bruteForceData,
  DatasetSample,
} from "./dataset_brute_force_attack";
import { dataset as inputValidationData } from "./dataset_input_validation";
import { dataset as insecureAuthData } from "./dataset_insecure_authentication";
import { dataset as securityHeadersData } from "./dataset_security_headers";

// Environment checks for debugging and development modes
const IS_DEBUG =
  // @ts-ignore: 2304
  (typeof Deno !== "undefined" && Deno.env.get("DEBUG_MODE") === "true") ||
  (typeof process !== "undefined" && process.env.DEBUG_MODE === "true");

// prettier-ignore
const IS_DEVELOPMENT =
  // @ts-ignore: 2304
  (typeof Deno !== "undefined" && Deno.env.get("NODE_ENV") === "development") ||
  (typeof process !== "undefined" && process.env.NODE_ENV === "development");

/** $ DEBUG_MODE=true deno run --allow-env scratch.ts */
/** $ NODE_ENV=development deno run --allow-env scratch.ts */
function debugDetectionStats(
  kind: Vulnerability,
  tokenizedSnippet: readonly string[] | null,
  labels: readonly number[],
  prediction: string,
) {
  console.debug({
    labels,
    prediction,
    result: parseInt(prediction, 10),
    tokenizedSnippet,
    vulnerability: vulnerabilityToString(kind),
  });
}

class ThreadSafeLogger {
  private buffer: string[] = [];
  private readonly BUFFER_FLUSH_LIMIT = 10; // NOTE: Adjust buffer size limit...

  report(message: string): void {
    this.buffer.push(message);
    if (this.buffer.length >= this.BUFFER_FLUSH_LIMIT) {
      this.flush();
    }
  }

  flush(): void {
    if (this.buffer.length > 0) {
      console.log(this.buffer.join("\n"));
      this.buffer = []; // clear the buffer
    }
  }
}

/**
 * Enumerates types of vulnerability.
 * Note: Defined as a read-only frozen object to prevent modifications.
 */
const Vulnerability = Object.freeze({
  BruteForceAttack: 0,
  InputValidation: 1,
  InsecureAuthentication: 2,
  SecurityHeaders: 3,
} as const);

type Vulnerability = (typeof Vulnerability)[keyof typeof Vulnerability];

// prettier-ignore
function vulnerabilityToString(kind: Vulnerability): string {
  switch (kind) {
    case Vulnerability.BruteForceAttack: return "Brute Force Attack";
    case Vulnerability.InputValidation: return "Input Validation";
    case Vulnerability.InsecureAuthentication: return "Insecure Authentication";
    case Vulnerability.SecurityHeaders: return "Security Headers";
    default: throw new Error("Exhausted all 'switch' cases.");
  }
}

// prettier-ignore
function getVulnerabilityData(kind: Vulnerability): DatasetSample[] {
  switch (kind) {
    case Vulnerability.BruteForceAttack: return bruteForceData;
    case Vulnerability.InputValidation: return inputValidationData;
    case Vulnerability.InsecureAuthentication: return insecureAuthData;
    case Vulnerability.SecurityHeaders: return securityHeadersData;
    default: throw new Error("Exhausted all 'switch' cases.");
  }
}

// Removing redundant data function
function removeRedundantData(
  dataset: readonly DatasetSample[],
): DatasetSample[] {
  const uniqueEntries: DatasetSample[] = [];
  const seenEntries: Set<string> = new Set();
  let count0: number = 0;
  let count1: number = 0;

  for (const entry of dataset) {
    const entryString = JSON.stringify(entry);
    if (!seenEntries.has(entryString)) {
      if (IS_DEBUG && IS_DEVELOPMENT) {
        if (entry.label === 0) count0++;
        else count1++;
      }
      uniqueEntries.push({ ...entry }); // copy ensures immutability
      seenEntries.add(entryString);
    }
  }

  if (IS_DEBUG && IS_DEVELOPMENT) {
    console.debug({ count0, count1, entriesLen: uniqueEntries.length });
  }

  return uniqueEntries;
}

// prettier-ignore
function detect(kind: Vulnerability, codeSnippet: string, logger: ThreadSafeLogger): boolean {
  let isDetected: boolean = false;
  const kindStr: string = vulnerabilityToString(kind);

  // Create a tokenizer.
  const tokenizer = new natural.WordTokenizer();
  const tokenizedSnippet: readonly string[] | null = tokenizer.tokenize(codeSnippet);

  // Make prediction using the trained classifier.
  if (tokenizedSnippet !== null) {
    // Prepare the data for training.
    const data: readonly DatasetSample[] = getVulnerabilityData(kind);
    const cleanedDataset: readonly DatasetSample[] = removeRedundantData(data);
    const codeSamples: readonly string[] = cleanedDataset.map((sample) => sample.code);
    const labels: readonly number[] = cleanedDataset.map((sample) => sample.label);

    // Vectorize the code samples using the tokenizer.
    const tokenizerSamples: (readonly string[])[] = codeSamples
      .map((code) => tokenizer.tokenize(code))
      .filter((tokens): tokens is string[] => tokens !== null);

    // Train a Naive Bayes classifier
    const classifier = new natural.BayesClassifier();
    for (let i = 0; i < tokenizerSamples.length; i++) {
      classifier.addDocument([...tokenizerSamples[i]], labels[i].toString()); // copy ensures immutability
    }
    classifier.train();

    const prediction: string = classifier.classify([...tokenizedSnippet]);
    const result: number = parseInt(prediction, 10);
    if (result === 1) {
      isDetected = true;
      logger.report("==> Code vulnerable to " + kindStr + " in this file!!! \n");
    }

    if (IS_DEBUG) {
      debugDetectionStats(kind, tokenizedSnippet, labels, prediction);
    }
  }

  if (!isDetected) {
    logger.report("==> Code NOT vulnerable to " + kindStr);
  }

  return !isDetected;
}

function main(): number {
  const vulnerabilities: readonly Vulnerability[] = Object.values(
    Vulnerability,
  ).filter((value): value is Vulnerability => typeof value === "number");

  const codeSnippetSample =
    "const loginLimiter = rateLimit({\n" +
    "  store: new MongoStore({\n" +
    "    uri: 'mongodb://localhost:27017/ratelimits',\n" +
    "    expireTimeMs: 60 * 1000, // 1 minute\n" +
    "  }),\n" +
    "  max: 5,\n" +
    "  message: 'Too many login attempts from this IP, please try again later.'\n" +
    "  });";

  let exitStatus: number = 0;
  for (const vulnerability of vulnerabilities) {
    const logger = new ThreadSafeLogger();
    if (!detect(vulnerability, codeSnippetSample, logger)) {
      exitStatus = 1;
    }
    logger.flush();
  }

  return exitStatus;
}

main();
```
Thanks for the proposed changes; I'll review them shortly.
@lloydlobo I've reviewed your changes and have a few suggestions:
1. Remove the code related to the thread buffer, the main function, and the environment mode for the time being.
2. Remove this comment, if possible: `// prettier-ignore`
I've assigned this issue to you. Please proceed with your changes and ensure the previous and current terminal outputs are in sync. I'll handle the NPM packaging, don't worry about that.
Hey @prayas7102, I will probably look into this in about 24 hours. I'm a bit busy with a prior obligation.
Sure thing, also a relief to know that you can handle the NPM packaging.
Hey, there is a draft PR #16 in the works.
I encountered a potential issue mentioned in the PR regarding multiple logs:
When the `Log.detectIfVulnerability` function is declared in a scope outside of class `Log`, the logger logs twice.
- Need to look into how the overrides on console.log and console.error influence the above.
- For the time being, the detectIfVulnerability function is declared as a static method of class Log.
Besides minor formatting tweaks, the PR seems ready.
Hey there @prayas7102, PR #16 is ready for review :)
You can consolidate the four files (`DetectBruteForceAttack.ts`, `DetectInputValidation.ts`, `InsecureAuthentication.ts`, and `AnalyzeSecurityHeaders.ts`) into a single script since they share common libraries and calling functions. The only variation between these files is the dataset, which can be loaded based on the specific vulnerability check being performed.

By creating a single script, you can automate the detection of brute force attacks, input validation, insecure authentication, and security header analysis. This script can selectively load the appropriate dataset according to the type of vulnerability being checked, making the process more efficient and reducing code duplication.
Key steps:
This approach simplifies the process and ensures scalability when adding new vulnerability checks in the future.
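To illustrate the scalability point, here is a minimal sketch of one way the dataset selection could be keyed by check name, so that adding a new check means adding one entry to a map. Everything in it is an assumption for illustration: the `CheckName` type, the `DATASETS` map, `loadDataset`, and `countVulnerableSamples` are hypothetical names, and the inline samples are made-up placeholders rather than entries from the package's real dataset files.

```typescript
// Hypothetical sample shape; the real `DatasetSample` comes from the project's dataset files.
interface DatasetSample {
  code: string;
  label: number; // assumed convention: 1 = vulnerable, 0 = not vulnerable
}

// Hypothetical names for the four checks listed in this issue.
type CheckName =
  | "bruteForce"
  | "inputValidation"
  | "insecureAuth"
  | "securityHeaders";

// Placeholder data standing in for the real datasets (made up for this sketch).
const DATASETS: Record<CheckName, readonly DatasetSample[]> = {
  bruteForce: [
    { code: "app.post('/login', loginHandler);", label: 1 },
    { code: "app.post('/login', loginLimiter, loginHandler);", label: 0 },
  ],
  inputValidation: [
    { code: "db.query('SELECT * FROM users WHERE id = ' + req.params.id);", label: 1 },
  ],
  insecureAuth: [
    { code: "if (password === user.password) { login(user); }", label: 1 },
  ],
  securityHeaders: [{ code: "app.use(helmet());", label: 0 }],
};

// The only per-check difference is which dataset gets loaded.
function loadDataset(check: CheckName): readonly DatasetSample[] {
  return DATASETS[check];
}

// Shared logic that works the same way for every check.
function countVulnerableSamples(check: CheckName): number {
  return loadDataset(check).filter((sample) => sample.label === 1).length;
}

// Run the same routine over every registered check.
const checks = Object.keys(DATASETS) as CheckName[];
for (const check of checks) {
  console.log(
    `${check}: ${loadDataset(check).length} samples, ` +
      `${countVulnerableSamples(check)} labeled vulnerable`,
  );
}
```

Because `DATASETS` is typed as `Record<CheckName, ...>`, the compiler flags a missing dataset whenever a new name is added to `CheckName`, which is one way to keep the single script and its datasets in step.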
Make sure the end user/developer (who downloads the NPM package) is able to smoothly run the NPM package after these changes.