github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License

Failed to create database from node module #11102

Open kal-purush opened 1 year ago

kal-purush commented 1 year ago

I want to create a database from an installed node module. But if I run this command:

codeql database create --language=javascript --source-root ../node_modules/101 test-javascript-database

it always fails with this error:

[2022-11-03 12:59:36] [build-stderr] No JavaScript or TypeScript code found.

There are clearly JavaScript files in that directory, and the log shows files being extracted. For example:

Extracting ../codeql/javascript/tools/data/externs/nodejs/repl.js
[2022-11-03 12:59:33] [build-stdout] Done extracting ../codeql/javascript/tools/data/externs/nodejs/repl.js (15 ms)  

I can create the database when the path is a normal GitHub repo, though. Is there any way to create a database from node modules, or does CodeQL not support this?

smowton commented 1 year ago

What happens if you cd ../node_modules/101 && codeql database create --language=javascript /some/dir/test-javascript-database ?

That file repl.js is a standard library file; I suspect it's not actually extracting any code under that 101 directory. The code to extract is found by searching down from the current working directory; the --source-root option is used to nominate which files out of those extracted to consider user source (as compared to third-party libraries etc), not what to extract.

kal-purush commented 1 year ago

@smowton, the result is the same.

I also think it is not extracting any code under the folder, and the only explanation I can think of is that the folder is under node_modules?

seng1e commented 1 year ago

This is because CodeQL excludes the node_modules and bower_components directories by default when creating databases. See the documentation for details.

You should move the target directory out of node_modules to another location and try again.
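The move suggested above can be sketched as a small shell helper. This is only an illustration: stage_module is a hypothetical name, the paths are the ones used earlier in this thread, and it copies rather than moves so the installed module stays in place. Any node_modules directory nested inside the copied module would presumably still be excluded.

```shell
#!/bin/sh
# Sketch: copy an installed module out of node_modules so that the
# default node_modules exclusion no longer applies to its path.
# stage_module is a hypothetical helper name, not part of CodeQL.

# stage_module SRC_DIR
# Copies SRC_DIR into a fresh temporary directory and prints the
# path of the copy on stdout.
stage_module() {
  src_dir="$1"
  name="$(basename "$src_dir")"
  work_dir="$(mktemp -d)" || return 1
  cp -R "$src_dir" "$work_dir/$name" || return 1
  printf '%s/%s\n' "$work_dir" "$name"
}

# Usage (requires the codeql CLI on PATH; paths are examples only):
#   staged="$(stage_module ../node_modules/101)"
#   cd "$staged" && codeql database create --language=javascript /some/dir/test-javascript-database
```

The database path in the usage comment is absolute so that it does not end up inside the staged source directory.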

aibaars commented 1 year ago

Yes, that's right: node_modules and bower_components are excluded by default. See: https://github.com/github/codeql/blob/eb365c1d24f967d3ab901ccd5646551ae995e50d/javascript/extractor/src/com/semmle/js/extractor/AutoBuild.java#L397

I think you can use the LGTM_INDEX_INCLUDE environment variable to override this behaviour. This variable is "documented" at https://github.com/github/codeql/blob/eb365c1d24f967d3ab901ccd5646551ae995e50d/javascript/extractor/src/com/semmle/js/extractor/AutoBuild.java#L77 .

seng1e commented 1 year ago

@aibaars I tried setting the environment variable LGTM_INDEX_INCLUDE=node_modules before running codeql database create, but it did not work: nothing was extracted. I don't know whether my usage is wrong, or whether this variable only specifies directories to include and does not override the default exclusion.

aibaars commented 1 year ago

@seng1e Looking a bit more carefully at the code, I think the right variable to set would be LGTM_INDEX_FILTERS. See also: https://github.com/github/codeql/blob/eb365c1d24f967d3ab901ccd5646551ae995e50d/javascript/extractor/src/com/semmle/js/extractor/AutoBuild.java#L397-L411

I guess setting it to "LGTM_INDEX_FILTERS=include:**/node_modules" should do the trick.
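A minimal sketch of how that suggestion might be tried from a shell. The filter value is the one guessed above and has not been confirmed to work in this thread; create_db_with_filters and DRY_RUN are hypothetical names introduced only for this example. It changes into the module directory first, as suggested earlier, and sets the variable only for the codeql process.

```shell
#!/bin/sh
# Sketch: run `codeql database create` with LGTM_INDEX_FILTERS set.
# The filter value comes from the guess in this thread and is unverified.
# create_db_with_filters and DRY_RUN are hypothetical, for illustration.

# create_db_with_filters SRC_DIR DB_DIR
# With DRY_RUN=1, or when codeql is not on PATH, only prints the command.
create_db_with_filters() {
  src_dir="$1"
  db_dir="$2"   # use an absolute path, since the command runs inside SRC_DIR
  filters='include:**/node_modules'

  if [ "${DRY_RUN:-0}" = "1" ] || ! command -v codeql >/dev/null 2>&1; then
    printf "cd %s && LGTM_INDEX_FILTERS='%s' codeql database create --language=javascript %s\n" \
      "$src_dir" "$filters" "$db_dir"
    return 0
  fi

  # Subshell keeps the caller's working directory unchanged; the
  # variable is set only in the environment of the codeql process.
  ( cd "$src_dir" && \
    LGTM_INDEX_FILTERS="$filters" codeql database create --language=javascript "$db_dir" )
}

# Usage (paths are examples only):
#   create_db_with_filters ../node_modules/101 /some/dir/test-javascript-database
```

Quoting the filter value matters here: unquoted, the shell could expand `**` as a glob before codeql ever sees it.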