github / codeql

(WSL Ubuntu) CodeQL database creation with Chromium failed #13562

Open 18Fl opened 1 year ago

18Fl commented 1 year ago

Introduction

Hey, this is a separate bug from https://github.com/github/codeql/issues/13552. While trying to solve that bug, I wanted to see whether it was a platform-specific problem, so I ran another test using WSL Ubuntu on Windows and found this bug.

Here is my Chromium version:

18f@DESKTOP-ETKSDTV:~/chromium/src$ git log
commit 5394a50e522f99ac693b9f2195adab250ef40fe2 (HEAD, tag: 116.0.5791.0)
Author: Chrome Release Bot (LUCI) <chrome-official-brancher@chops-service-accounts.iam.gserviceaccount.com>
Date:   Thu May 25 03:04:37 2023 +0000

    Publish DEPS for 116.0.5791.0

The args.gn is simple:

is_debug = false

The Ubuntu version:

18f@DESKTOP-ETKSDTV:~/chromium/src$ uname -a
Linux DESKTOP-ETKSDTV 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

We compile Chrome with this command:

autoninja -C out/codeql chrome

Now the bug happens. I split it into two cases: the first one works normally and the second one shows the bug. I think listing both cases will help locate the bug.

I tried two different CodeQL versions:

2.13.4 and 2.13.1

Case 00

Case 00 is normal. After the Chrome build finishes, we delete the file

out/codeql/obj/content/browser/browser/input_handler.o

then we create the CodeQL database with this:

codeql database create ../../code_db/boom/  --overwrite --language=cpp --command='autoninja -C out/codeql chrome'

I can confirm the CodeQL database was created successfully. Here is a snippet (the original log is too long):

[2023-06-25 22:25:56] [build-stdout] [2112/2112] LINK ./chrome
Finalizing database at /home/18f/code_db/boom.
Successfully created database at /home/18f/code_db/boom.

We use this simple query as a test:

/**
 * @name just for fun
 */

import cpp
import semmle.code.cpp.dataflow.new.DataFlow
import DataFlow::PathGraph

from Function func
where func.getQualifiedName().matches("%StringToGestureSourceType%")
select func

You can see everything works well.

image

This is the normal case, but I still provide the CodeQL database (see boom.zip) so you can diff it against the abnormal case. boom.zip

Case 01

Now we reproduce the abnormal case. Just compile the full Chrome as before, then delete all the .o files in out/codeql/obj/content/browser/browser.

Note that we don't need to run "gn gen out/codeql" again, because this folder only contains *.o files (even if you run "gn gen ...", it still fails; I have tried).

cd out/codeql/obj/content/browser/browser/
ls
rm *

Then we again use this to create the CodeQL database:

 codeql database create ../../code_db/repro/  --overwrite --language=cpp --command='autoninja -C out/codeql chrome'

This time you will see this log:


Finalizing database at /home/18f/code_db/repro.
10654892_0.trap.br for no link target, 1: java.io.IOException: Brotli stream decoding failed
org.brotli.dec.BrotliInputStream.read(BrotliInputStream.java:151)
com.semmle.inmemory.trap.TrapInputStream.read(TrapInputStream.java:60)
com.semmle.inmemory.trap.TrapScanner.fill(TrapScanner.java:449)
com.semmle.inmemory.trap.TrapScanner.ensureNext(TrapScanner.java:426)
com.semmle.inmemory.trap.TrapScanner.nextToken(TrapScanner.java:59)
com.semmle.inmemory.trap.TRAPReader.scanTuplesAndLabels(TRAPReader.java:488)
com.semmle.inmemory.trap.TRAPReader.importTuples(TRAPReader.java:410)
com.semmle.inmemory.trap.ImportTasksProcessor.process(ImportTasksProcessor.java:190)
com.semmle.inmemory.trap.ImportTasksProcessor.lambda$importTrap$1(ImportTasksProcessor.java:146)
com.semmle.util.concurrent.FutureUtils.lambda$mapAsync_$8(FutureUtils.java:161)
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.base/java.lang.Thread.run(Unknown Source)
Successfully created database at /home/18f/code_db/repro.

It means we hit an error, but the database was still reported as created successfully.
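
To get a quick list of which TRAP files failed to import, the error lines can be filtered out of a saved copy of the output. This is only a minimal sketch: the function name list_failed_traps is hypothetical, and it assumes each failure line starts with the TRAP file name, as in the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CodeQL CLI): print the name of every
# TRAP file whose import failed with a Brotli decoding error.
# Assumption: $1 is a saved copy of the `codeql database create` output,
# and each failure line begins with the TRAP file name.
list_failed_traps() {
  awk '/Brotli stream decoding failed/ { print $1 }' "$1" | sort -u
}
```

Called as "list_failed_traps create.log" (create.log being an assumed file name for the saved output), it prints one TRAP file name per line, which makes it easier to compare runs or to match the names against build-tracer.log.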

But when we try to run the same query, it fails:

/**
 * @name just for fun
 */

import cpp
import semmle.code.cpp.dataflow.new.DataFlow
import DataFlow::PathGraph

from Function func
where func.getQualifiedName().matches("%StringToGestureSourceType%")
select func

In my view, it seems the handler file was not linked into the database. If you view the AST, you get nothing, but in the normal case (boom.zip) you can view the correct AST.

image

I attach the database. You can see the log in the database.

My investigation

I investigated it a little. Maybe I am wrong, but I still offer it in case it helps you.

First, I suspected that I might not have enough disk space. My understanding is that CodeQL creates the database and then zips it. If it finds there is not enough space before zipping the database, it does not fail; it just drops some information and zips what it can.

So I used this command to investigate:

watch du -sh repro

It shows that the maximum size is 11G, but my available space is 200G, so I don't think this is the reason.

Second, I found a similar case in https://github.com/github/codeql/issues/7582, but that problem is fixed. I think this is a similar case: when CodeQL extracts some specific C++ code, it fails. Think about this: if I just recompile input_handler.o, it succeeds, but if I recompile input_handler.o and the other files, it fails. So I think the problem is in one of the other files. This is too hard for me, but I think if you look at the .log file, you can find out what happened.

repro.zip

18Fl commented 1 year ago

@jketema hey, I reported it here. Thank you!

jketema commented 1 year ago

Thanks for the report. Looking at the build-tracer.log file, which is included with the database, it seems that some of the extractor processes either crash or get killed. This leaves some incomplete Brotli files that then cannot be parsed later during database finalisation. It's unfortunately not clear from the logs why this is happening.

18Fl commented 1 year ago

Thanks for the report. Looking at the build-tracer.log file, which is included with the database, it seems that some of the extractor processes either crash or get killed. This leaves some incomplete Brotli files that then cannot be parsed later during database finalisation. It's unfortunately not clear from the logs why this is happening.

Will you fix this? And what can I do? Thanks.

jketema commented 1 year ago

Will you fix this? And what can I do? Thanks.

This is a bit difficult, as the logs don't provide much evidence. One thing you could maybe help me with is to check whether the Windows Event Viewer shows any events that might explain why the extractor processes either crashed or were killed.
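
On the WSL side, one more place that could be checked is the Linux kernel log, since the kernel's out-of-memory killer records the processes it kills there. Whether that is what happened here is only an assumption; nothing in this thread confirms it. A minimal sketch, where the function name find_oom_kills is hypothetical and the input is kernel log text such as the output of dmesg:

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and keep only the lines
# that look like out-of-memory killer activity.
# Assumption: the kernel uses its usual wording, for example
# "Out of memory: Killed process <pid> (<name>) ...".
find_oom_kills() {
  grep -iE 'out of memory|oom-kill|killed process'
}
```

It could be run as "dmesg | find_oom_kills" right after a failing build (dmesg may need sudo). No output only means that no such lines are in the kernel log buffer at that moment, so it does not rule out other reasons for the extractor processes disappearing.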