Closed overfl0wd closed 5 years ago
Hi @overfl0wd,
As for implementing support for a new extension type, it seems you ran through all of the necessary steps, which involve:
- config/main.json, within the allowedExtensions section
- src/parser/tokenizer (details are here: https://github.com/auth0/repo-supervisor/wiki/Custom-file-formats)
- config/filters.json
Did you rebuild the tool after making those changes? That is done with the npm run build command, which should generate new files for both dist/webtask.js and dist/cli.js.
Let me know if that helps.
In step 3 of that page, "Create new module src/parser/tokenizer/foobar/index.js that exports a single function that works with two parameters:", what should be inside the function?
I'm simply looking to run the same default high-entropy string checks, not write unique filters and re-invent the wheel here for a new file type.
The reason for creating a new tokenizer (https://en.wikipedia.org/wiki/Lexical_analysis) is that the file format has to be made readable for the repo supervisor tool. For json and js files, the source code is parsed into functions, variables, strings, and so on, so that entropy is calculated only on strings and not on function or variable names.
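To illustrate the idea, here is a minimal sketch of "pull out the string literals first, then measure entropy only on those". It is not the repo supervisor tokenizer interface: the function names `extractStringLiterals` and `shannonEntropy`, the regex, and the return shape are all assumptions made up for this example, and a real tokenizer would use a proper lexer for the language rather than a regex.

```javascript
// Hypothetical sketch, NOT the repo-supervisor API.

// Collect the contents of single- and double-quoted string literals.
// Simplification: comments, template literals, and multi-line strings
// are not handled; a real lexer for the target language would do that.
function extractStringLiterals(code) {
  const pattern = /"((?:[^"\\\n]|\\.)*)"|'((?:[^'\\\n]|\\.)*)'/g;
  const literals = [];
  let match;
  while ((match = pattern.exec(code)) !== null) {
    literals.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return literals;
}

// Shannon entropy in bits per character.
function shannonEntropy(text) {
  const chars = Array.from(text);
  if (chars.length === 0) return 0;
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Only the quoted value is scored; the identifier name is never looked at.
const source = 'const getUserAccountByIdentifier = "x9Qv7Lm2Tz4Rk8Wp1Ys6Bn3D";';
for (const literal of extractStringLiterals(source)) {
  console.log(literal, shannonEntropy(literal).toFixed(2));
}
```

Without the extraction step, a long identifier such as getUserAccountByIdentifier would be scored as well, which is the kind of noise the tokenizer is there to avoid.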
For example, detecting secrets inside Python variables would require a new tokenizer for Python, but that doesn't necessarily mean writing it from scratch; there are ready-to-use libraries for that.
If you could provide more details on what you want to achieve, I can guide you through it and suggest the best approach with repo supervisor.
Ah, thanks for breaking that down. Makes sense.
Our use case is scanning a few large repos, mostly containing .vb and .cs files. Ideally, we'd include the scattered .dat, .xml, .xaml, and .config files as well.
If you would like repo-supervisor to scan all file types without explicitly parsing them into an understandable format (treating each one more like a text file to search in), that was mentioned once in a PR: https://github.com/auth0/repo-supervisor/pull/14
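As a rough picture of what the "treat it as a text file" approach could look like, here is a minimal sketch. It is not taken from that PR or from the repo supervisor code base: the function name `tokenizePlainText`, the delimiter set, and the `minLength` default of 20 are assumptions chosen for illustration only.

```javascript
// Hypothetical sketch, NOT the repo-supervisor API or the code from the PR.

// Split any text on whitespace and common punctuation, then keep only
// the pieces long enough to be worth an entropy check.
function tokenizePlainText(text, minLength = 20) {
  return text
    .split(/[\s"'`=:;,<>(){}\[\]]+/)
    .filter((token) => token.length >= minLength);
}

// Works the same way on .config, .xml, .vb, or .cs content, because
// nothing here depends on the file's syntax.
const sample = '<add key="ApiKey" value="x9Qv7Lm2Tz4Rk8Wp1Ys6Bn3D" />';
console.log(tokenizePlainText(sample));
```

The trade-off is that long identifiers and ordinary long words become candidates too, so expect more false positives than with a tokenizer that understands the language.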
I hope it helps.
Hi, first of all thanks for developing this tool. I added new extensions to "allowedExtensions" in /config/main.json per this blog post, and they aren't being scanned. I even removed the default .js/.json entries and re-ran the tool, and it was still only returning results from those file types.