Open HarshadDGhorpade-eaton opened 1 year ago
Have you tried using command: as the property name instead of build:?
Still the same error with command.
Could you try removing the name: property?
Doesn't work. I tried keeping only command/build, still the same.
Sorry, I think the documentation for the codescanning config file is the following https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/customizing-code-scanning#using-a-custom-configuration-file . I don't see any mention there for a command or build property.
The documentation you were referring to before can be used to provide default values for command line arguments.
I think neither is really suitable for your use case. The easiest is probably to put the commands in a single shell script (for example build.sh) and run codeql database create --language cpp --command ./build.sh ....
Note that CodeQL may automatically recognize build.sh as a build script, so things may even work if you leave out --command ./build.sh.
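A minimal sketch of the wrapper-script approach described above. The four `echo` lines are hypothetical placeholders for the real build commands, and the `--source-root . db` arguments are taken from the command quoted later in this thread. The invocation is printed rather than executed, so the sketch runs without CodeQL installed.

```shell
# Write a wrapper script. The echo lines are hypothetical placeholders
# for the real 4-5 build commands.
cat > build.sh <<'EOF'
#!/bin/sh
set -e   # stop at the first failing step so the failure is not hidden
echo "step 1: configure"
echo "step 2: compile"
echo "step 3: link"
echo "step 4: package"
EOF
chmod +x build.sh

# The invocation suggested above, with "db" as the database directory.
# Printed instead of executed so this runs without the CodeQL CLI.
echo 'codeql database create --language cpp --command ./build.sh --source-root . db'
```

Keeping the steps in one script means the pipeline only has to maintain a single `--command` argument.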
Thanks a lot for this, it worked this way. Sadly it's not mentioned anywhere in the docs, but your quick replies solved the issue.
While trying this out, we're facing another issue:
ERROR: ld.so: object '/mnt/work/codeql/tools/linux64/lib64trace.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
......
ERROR: ld.so: object '/mnt/work/codeql/tools/linux64/lib64trace.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
I don't understand what the issue is here. It is causing failures in the codeql database create command step in our Azure pipeline.
Is this error harmless as written here? It is not exactly the same error, though.
I tried proceeding further with codeql database analyze, but it says "generated db needs to be finalized before running queries; please run codeql database finalize".
Do I need to add codeql database finalize?
Those errors could be harmless if they only happen on processes the CodeQL analyser does not care about. In the linked example of a similar error, CodeQL was running inside a docker container and the error was reported on things running on the host machine.
In your case the errors suggest that CodeQL may be wrongly setting the LD_PRELOAD variable to the 64-bit library for 32-bit processes. CodeQL needs to be able to "see" all compiler processes to figure out how to analyse the source code. So if the 32-bit processes that are ignored happen to be the build scripts and compiler processes then CodeQL won't "see" a thing, and you end up with an empty database.
I'm a little surprised by the LD_PRELOAD value. I think it normally looks like /mnt/work/codeql/tools/linux64/${LIB}trace.so, and ld.so expands ${LIB} to the suitable value depending on whether the process is 32 or 64 bits. Could you validate that the LIB placeholder is not accidentally interpreted/replaced by your azure pipeline scripts?
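One way to check this from inside the build script might look like the sketch below. The function name `check_preload` is made up for illustration. It only tests whether the literal `${LIB}` token is still present in a value; it does not say whether a concrete path is wrong.

```shell
# Report whether an LD_PRELOAD value still contains the literal ${LIB}
# token that ld.so is expected to expand per process.
# check_preload is a hypothetical helper name, not part of CodeQL.
check_preload() {
  case "$1" in
    *'${LIB}'*) echo "placeholder intact" ;;
    "")         echo "LD_PRELOAD not set" ;;
    *)          echo "placeholder missing" ;;
  esac
}

# Inside build.sh this would be called on the live value:
check_preload "${LD_PRELOAD:-}"
```

A result of "placeholder missing" only means the variable holds a concrete path such as lib64trace.so, which is the situation worth reporting back here.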
I tried proceeding further with codeql database analyze but its saying "generated db needs to be finalized before running queries; please run codeql database finalize". Do I need to add codeql database finalize?
No, normally codeql database create will run codeql database finalize automatically, except when database creation failed in an earlier step. You could try running codeql database finalize, but even if it doesn't fail completely, you still end up with a partial database.
Could you validate that the LIB placeholder is not accidentally interpreted/replaced by your azure pipeline scripts?
I am not sure what and how to check that, can you please elaborate on this ?
Could you try running printenv (or another command that prints the environment such as set or export) in your build script and look for the value of LD_PRELOAD?
Does your azure pipeline run a simple codeql database create command, or does it try to do more fancy things by setting special environment variables or using features like indirect build tracing? If you're using a simple codeql database create then things should just work.
Could you also check which operating system and version is running on the azure devops workers? Do they run in docker or is some kind of virtualization or WSL in use? Perhaps running in a container may somehow confuse the code that detects whether a binary is 32 or 64 bit.
Earlier I was printing env vars after the db creation command in a separate step, so I was getting nothing. When printing the same in build.sh, I do find the LD_PRELOAD environment variable:
LD_PRELOAD=/mnt/work/codeql/tools/linux64/lib64trace.so
Database creation command:
codeql database create --language cpp --github-url=https://github.com/ --command ./build.sh --source-root . db
OS details : Distributor ID: Ubuntu Description: Ubuntu 20.04.6 LTS Release: 20.04 x86_64
We're running on Azure cloud agents, not in a container.
database-create-20230622.152801.058.log Uploading the database create log in case it helps.
separate step so getting nothing, when printing same in build.sh, I do find
LD_PRELOAD=/mnt/work/codeql/tools/linux64/lib64trace.so
Thanks! Could you attach a list of all environment variables containing any of the words PRELOAD, SEMMLE, ODASA, and CODEQL?
Could you also attach the build-tracer.log file to this issue?
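A small sketch for collecting that list from inside the build script. The function name `tracer_env` is hypothetical. It matches the four words anywhere in a `NAME=value` line, so a variable is also listed when only its value mentions one of them.

```shell
# Print environment entries that mention any of the requested words,
# sorted so that two runs can be compared easily.
# tracer_env is a hypothetical helper name.
tracer_env() {
  printenv | grep -E 'PRELOAD|SEMMLE|ODASA|CODEQL' | sort
}

# Call this from build.sh, since the variables only exist while
# codeql database create is running the build command.
tracer_env
```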
I was trying to attach build-tracer.log but it's more than 2 GB. The zipped version is ~180 MB, and I can only attach up to 25 MB here.
PRELOAD -->
SEMMLE_PRELOADlibtrace=/mnt/work/codeql/tools/linux64/${LIB}${PLATFORM}_trace.so
SEMMLE_PRELOAD_libtrace32=/mnt/work/codeql/tools/linux64/lib32trace.so
SEMMLE_PRELOAD_libtrace64=/mnt/work/codeql/tools/linux64/lib64trace.so
LD_PRELOAD=/mnt/work/codeql/tools/linux64/lib64trace.so
SEMMLE -->
SEMMLE_PRELOADlibtrace=/mnt/work/codeql/tools/linux64/${LIB}${PLATFORM}_trace.so
SEMMLE_PRELOAD_libtrace32=/mnt/work/codeql/tools/linux64/lib32trace.so
SEMMLE_PRELOAD_libtrace64=/mnt/work/codeql/tools/linux64/lib64trace.so
SEMMLE_EXEC=
ODASA --> nothing
CODEQL -->
CODEQL_EXTRACTOR_CPP_TRAP_DIR=/mnt/work/1/s/edge-linux-yocto/yocto_db/trap/cpp
CODEQL_TRACER_DIAGNOSTICS_DIR=/mnt/work/1/s/edge-linux-yocto/yocto_db/diagnostic/tracer
CODEQL_EXTRACTOR_CPP_LOG_DIR=/mnt/work/1/s/edge-linux-yocto/yocto_db/log
CODEQL_EXTRACTOR_CPP_SOURCE_ARCHIVE_DIR=/mnt/work/1/s/edge-linux-yocto/yocto_db/src
CODEQL_PLATFORM_DLL_EXTENSION=.so
CODEQL_EXTRACTOR_CPP_DIAGNOSTIC_DIR=/mnt/work/1/s/edge-linux-yocto/yocto_db/diagnostic/extractors/cpp
CODEQL_EXTRACTOR_CPP_WIP_DATABASE=/mnt/work/1/s/edge-linux-yocto/yocto_db
CODEQL_JAVA_HOME=/mnt/work/codeql/tools/linux64/java
CODEQL_EXTRACTOR_CPP_SCRATCH_DIR=/mnt/work/1/s/edge-linux-yocto/yocto_db/working
CODEQL_DIST=/mnt/work/codeql
CODEQL_PLATFORM=linux64
CODEQL_SCRATCH_DIR=/mnt/work/1/s/edge-linux-yocto/yocto_db/working
CODEQL_TRACER_LANGUAGES=cpp
CODEQL_TRACER_LOG=/mnt/work/1/s/edge-linux-yocto/yocto_db/log/build-tracer.log
CODEQL_EXTRACTOR_CPP_ROOT=/mnt/work/codeql/cpp
CODEQL_PARENT_ID=0000000000001561_0000000000000003
CODEQL_EXEC_ARGS_OFFSET=
Was trying to attach build-tracer.log but its more than 2GB.. zipped version goes ~180MB.. can attached only until 25 MB
Ah indeed, the tracer log can be very large. Could you search for "interesting" fragments of the tracer log? I think the first 1000 lines are interesting, as are any blocks of text ending with a Catastrophic error message. If there are many errors then just select a sample of a few of them.
You can also create an enterprise support ticket and use the upload large files functionality.
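A rough sketch of how such a sample could be cut from a very large log. The function name, the output file name, the 30 lines of context and the limit of five matches are all arbitrary assumptions. The `-m` and `-B` options are available in GNU grep, which is what Ubuntu ships.

```shell
# Collect the first 1000 lines plus the context leading up to a few
# "Catastrophic error" messages from a (possibly huge) tracer log.
# sample_tracer_log is a hypothetical helper name.
sample_tracer_log() {
  log="$1"; out="$2"
  head -n 1000 "$log" > "$out"
  # -B 30: lines of context before each match (a guess at the block size)
  # -m 5:  stop after five matches to keep the sample small
  grep -m 5 -B 30 'Catastrophic error' "$log" >> "$out" || true
}
```

Usage, assuming the log sits under the database directory as in the CODEQL_TRACER_LOG value above: `sample_tracer_log db/log/build-tracer.log tracer-sample.log`.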
build-tracer-lines_0_1200-and-catastrophic-error.log
Adding some chunks from the original build-tracer.log. I do see the same pattern repeated for "Catastrophic error", complaining about not being able to open a file.
A team member mentioned: The tracer is trying to sniff out which type (32-bit or 64-bit) a binary is and insert the correct library for that, and only falls back to the generic LIB expansion in case it doesn't manage to do that. Maybe we're hitting a weird special case here that is confusing the detection logic? There are log messages for that, but they are not enabled at the default log level. The string is detected as: , and the logging is enabled by setting the environment variable SEMMLE_DEBUG_TRACER to 6.
The log will be even larger. One way to reduce the log size would be to build only a smaller part of the code that still exhibits the same problem.
To make sense of the log, we'd need to correlate the detected filetype from the log for a binary with the actual filetype of the binary that's emitting those error messages, and I don't see the name of that anywhere in the issue. Do you know which process is printing the LD_PRELOAD related error messages?
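To find the actual filetype of a suspect binary, one option is to read the ELF class byte directly, as in this sketch. The helper name `elf_class` is hypothetical. It relies only on the ELF header layout: the magic bytes 7f 45 4c 46, followed at offset 4 by the class (1 for ELFCLASS32, 2 for ELFCLASS64).

```shell
# Print 32 or 64 depending on the ELF class byte of a file.
# elf_class is a hypothetical helper name.
elf_class() {
  # First four bytes must be the ELF magic: 0x7f 'E' 'L' 'F'.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  [ "$magic" = "7f454c46" ] || { echo "not ELF"; return 1; }
  # Byte at offset 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
  class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
  case "$class" in
    1) echo 32 ;;
    2) echo 64 ;;
    *) echo unknown ;;
  esac
}
```

Running this on the binary that prints the ld.so error gives the actual class to compare against what the tracer logs with SEMMLE_DEBUG_TRACER set to 6.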
Adding some chunks from original build-tracer.log... I do see the same pattern repeated for "Catastrophic error" complaining about not able to open file.
Yes indeed. The good news is that CodeQL seems to be able to intercept compiler calls. The error messages are a bit unexpected, but the sampled ones all look like part of the "configure" phase of the build. Could you look for a few samples of Catastrophic error messages that mention source files from the repository you'd like to analyse?
@HarshadDGhorpade-eaton, looking at the database create log file, I realised that the error is happening very near the end (task 6330 of 6338).
[2023-06-22 16:30:50] [build-stdout] NOTE: Running noexec task 6330 of 6338 (/mnt/work/3/s/edge-linux-yocto/meta-pxred/meta-bsp-stm32mp1/recipes-kernel/linux/linux-stm32mp-ipl.bb:do_build)
[2023-06-22 16:31:51] [build-stderr] ERROR: px-red-image-1.0-4r6 do_rootfs: [log_check] px-red-image: found 2 error messages in the logfile:
[2023-06-22 16:31:51] [build-stderr] [log_check] ERROR: ld.so: object '/mnt/work/codeql/tools/linux64/lib64trace.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
[2023-06-22 16:31:51] [build-stdout] NOTE: recipe px-red-image-1.0-4r6: task do_rootfs: Failed
It is very likely that all steps of the build that are of interest to CodeQL (compiling and linking) had already succeeded. You could try to add || true to your build command to make it always succeed. It's not pretty, but it would make the build "succeed" after 6330 tasks. With a bit of luck the remaining 8 tasks are not interesting. It's quite likely that they are related to packaging the generated build artefacts or so. My only worry is that some of those remaining 8 tasks are related to linking binaries and library artefacts. CodeQL normally works fine if we miss out on those, however, for large and complex builds the linker information may be needed for disambiguation of (function) names. Without this you may get occasional confusing results when CodeQL mixes up functions with the same names defined in completely unrelated components.
The build seems to fail because log_check detects those ERROR messages in the log, so if there is a way to tell log_check that LD_PRELOAD related errors are "fine" then you should have a short term workaround that is a bit more reliable than adding || true ;-)
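A slightly more careful variant of the || true idea might look like this sketch. The function name and the build-status.txt file are hypothetical. It has the same effect on CodeQL as || true, but keeps the real exit status so a later pipeline step can still inspect it.

```shell
# Run the real build, record its exit status, and report success anyway.
# Same effect as "cmd || true", but the status is not lost.
# run_build_tolerant and build-status.txt are hypothetical names.
run_build_tolerant() {
  status=0
  "$@" || status=$?
  echo "build exited with status $status" >&2
  echo "$status" > build-status.txt
  return 0
}
```

In build.sh this would wrap the final build command, for example `run_build_tolerant ./real-build-step`, where the step name is a placeholder.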
Yes, you're right. Noticing that it's failing in the last stages, I tried proceeding further with codeql database analyze, but it says "generated db needs to be finalized before running queries; please run codeql database finalize".
Do I need to add codeql database finalize?
https://github.com/github/codeql/issues/13524#issuecomment-1602282964
Do I need to add codeql database finalize?
That should work too in this case. I'd normally avoid carrying on after codeql database create fails, but in this case it fails so close to the end that it is probably fine. Also note that you cannot run codeql database finalize after a successful run of codeql database create.
Under the hood the codeql database create command runs codeql database init, codeql database trace-command and codeql database finalize.
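Spelled out as separate steps, that sequence might look like the sketch below. The function name `create_db_stepwise` and the `CODEQL` override variable are assumptions for illustration. The build command is whatever is passed after the database name, and a failing build is tolerated here only because of the situation discussed in this thread.

```shell
# Path to the CodeQL CLI; override to point at a specific binary.
CODEQL="${CODEQL:-codeql}"

# Run init, trace-command and finalize one after another.
# create_db_stepwise is a hypothetical helper name.
create_db_stepwise() {
  db="$1"; shift
  "$CODEQL" database init --language=cpp --source-root=. "$db" || return 1
  # Keep going if the traced build fails near the end.
  "$CODEQL" database trace-command -- "$db" "$@" ||
    echo "traced build failed; finalizing anyway" >&2
  "$CODEQL" database finalize -- "$db"
}
```

Usage: `create_db_stepwise db ./build.sh`. The resulting database may still be partial if the build failed before the compile steps finished.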
Okay, the latest build has gone past this and is now saying:
Running queries.
A fatal error occurred: Query pack security-extended cannot be found. Check the spelling of the pack.
command :
codeql database analyze --format=sarif-latest --output=./temp/results-cpp.sarif db security-extended
Can't I pass the suite name security-extended here? We do have a GitHub action for another project which is using this:
- name: Initialize CodeQL
  uses: github/codeql-action/init@v2
  with:
    languages: ${{ matrix.language }}
    queries: security-extended
codeql database analyze
is this the correct way ?
I can't pass suite name security-extended here? We do have a GitHub action for another project which is using this:
The name of the query suite is actually cpp-security-extended. The GitHub action internally prefixes the security-extended name with the identifier of the language.
The cpp-code-scanning.qls file corresponds to the code-scanning query suite in the GitHub action. It is fine to use, but if you want security-extended then you need to run codeql/cpp-queries:codeql-suites/cpp-security-extended.qls (or cpp-security-extended for short).
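The prefixing rule can be sketched as a small helper that builds the full suite reference from a language and a suite name. The name `suite_spec` is hypothetical, and applying the same pattern to languages other than cpp is an assumption based on the cpp example above. The analyze command is printed, not executed.

```shell
# Build the full query suite reference for a language and suite name,
# following the codeql/cpp-queries:codeql-suites/cpp-... pattern above.
# suite_spec is a hypothetical helper name.
suite_spec() {
  lang="$1"; suite="$2"
  printf 'codeql/%s-queries:codeql-suites/%s-%s.qls\n' "$lang" "$lang" "$suite"
}

# The analyze command from this thread with the corrected suite name.
echo codeql database analyze --format=sarif-latest \
  --output=./temp/results-cpp.sarif db "$(suite_spec cpp security-extended)"
```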
Okay, we're now able to generate the database, analyze it and upload the results to the GitHub repo. Thanks for the apt responses from your side, appreciate it.
I have shared the logs zip (containing the tracer log and db creation logs) in a GitHub repo set up by your colleague.
We will have to find a way to get rid of this "LD_PRELOAD" error. For now it's okay to continue despite the error, knowing it's not affecting the data CodeQL needs, but this will allow real errors to go through as well.
Our build process comprises 4-5 commands, so I am trying to put them in a config file and use it in the command, but I am getting the error "Invalid property specified in the configuration file. Ignoring it and proceeding".
As per using-a-codeql-configuration-file, does the config file get used internally by codeql database create without specifying the --codescanning-config option? Some internet sources talk about a YAML-based config file as well, used with the --codescanning-config option. Can you please clarify the correct way to use a config file? I am trying it this way:
codeql database create --language=cpp --github-url=https://github.com/ --codescanning-config=../codeql-config.yml --source-root . db
where the codeql-config.yml file contents are like below:
Getting the below error:
Specifying multiple commands works, but that is not maintainable because the commands are lengthy:
codeql database create --command "cmd1" --command "cmd2" --command "cmd3" --command "cmd4" --language=cpp --github-url=https://github.com/ --source-root . db