Open akr-amd opened 10 months ago
A web search for AttributeError: '_io.TextIOWrapper' object has no attribute 'reconfigure' suggests this problem can be solved by using Python version 3.7 or higher. It looks like your self-hosted runner has Python 3.6, which is pretty old. Could you try upgrading the Python version?
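For context, TextIOWrapper.reconfigure() was added in Python 3.7, so the error points at whichever interpreter actually runs the extractor rather than the one you expect. A minimal sketch for checking this from inside the job follows; the helper names are made up for illustration and are not part of CodeQL.

```python
import sys


def can_reconfigure(stream=None):
    """Return True if the stream has reconfigure() (TextIOWrapper gained it in Python 3.7)."""
    stream = sys.stdout if stream is None else stream
    return hasattr(stream, "reconfigure")


def describe_interpreter():
    """Report which interpreter is running this script and its version."""
    return "{} (Python {}.{}.{})".format(sys.executable, *sys.version_info[:3])


if __name__ == "__main__":
    print(describe_interpreter())
    print("stdout supports reconfigure():", can_reconfigure())
```

Running this with the same python3 command the job resolves shows whether an older interpreter is being picked up first on the PATH.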
Sorry, I forgot to mention that this job runs within a container using a custom image that had Python 3.11. I forgot to include the container part in my workflow file snippet (updated it now)
Maybe running the setup-python-dependencies as part of the init is causing the issue? I see it's run by default
I'm not sure, but even in a container, the workflow will try to use the python version in the toolcache. Could you try explicitly setting up python 3.11? Something like this, before you run the init step.
- uses: actions/setup-python@v5
  with:
    python-version: '3.11'
Now that error is gone but I am presented with a different one
/__w/_tool/CodeQL/2.15.3/x64/codeql/codeql database finalize --finalize-dataset --threads=4 --ram=29890 /__w/_temp/codeql_databases/python
CodeQL detected code written in Python but could not process any of it. Review our troubleshooting guide at https://gh.io/troubleshooting-code-scanning/no-source-code-seen-during-build .
Error: Encountered a fatal error while running "/__w/_tool/CodeQL/2.15.3/x64/codeql/codeql database finalize --finalize-dataset --threads=4 --ram=29890 /__w/_temp/codeql_databases/python". Exit code was 32 and last log line was: CodeQL detected code written in Python but could not process any of it. Review our troubleshooting guide at https://gh.io/troubleshooting-code-scanning/no-source-code-seen-during-build . See the logs for more details.
Hmmm...have you tried turning off setup-python-dependencies?
Just tried that by setting setup-python-dependencies: false. Same failure unfortunately 😞
I'm not sure, but even in a container, the workflow will try to use the python version in the toolcache.
@aeisenberg Also, may I ask why this is the case? The TypeScript CodeQL scanner seems happy to use the Node version installed in the container, but the Python scanner doesn't seem to want to use the container's Python.
For TypeScript, no compilation or code execution is required during extraction. For Python, we need to execute the Python extractor, which is itself written in Python.
@RasmusWL do you have any suggestions on what to do?
@akr-amd The error message you are seeing now is caused by CodeQL not scanning any Python source files in the repository folder.
Could you try running without the ./.github/codeql/codeql-config.yml configuration file? Perhaps the paths: are resolved on the host system, so they could be misaligned when running things inside a docker container. Could you print the value of the LGTM_INDEX_FILTERS environment variable in the workflow and also run a find . -name '*.py' command.
Another reason could be a mismatch between what CodeQL considers the "source root" and the path of the repository in the container. In that case CodeQL did scan files but would not count them because they are found in an "external" folder.
Could you also re-run the workflow with debug logging enabled? That should result in a CodeQL debug artifact containing much more detailed logs and also any source files that CodeQL has picked up.
Finally, there is no need to run Python or Typescript analysis in a docker container, so you could also try removing the container: property from the workflow. And if you really want all workflows in the self-hosted runner to run in docker containers you could try using https://github.com/actions/actions-runner-controller or a similar approach.
The suggestions from @aibaars seem solid; let's see if they solve the problem :+1:
At the outset, thanks for being super responsive and helpful! 🙂
Could you try running without the ./.github/codeql/codeql-config.yml configuration file? Perhaps the paths: are resolved on the host system, so they could be misaligned when running things inside a docker container
Based on feedback, here's how I changed the workflow
./.github/codeql/codeql-config.yml
The new workflow file is below
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"
on:
  pull_request:
    branches: [ "main" ]
  push:
    branches: [ "main" ]
jobs:
  analyze:
    name: Analyze
    runs-on: [ self-hosted, Linux ]
    permissions:
      actions: read
      contents: read
      security-events: write
    strategy:
      fail-fast: false
      matrix:
        language: [ 'javascript', 'python' ]
        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
        # Use only 'java' to analyze code written in Java, Kotlin or both
        # Use only 'javascript' to analyze code written in JavaScript, TypeScript or both
        # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support
    steps:
    - name: Checkout repository
      uses: actions/checkout@v3
    - uses: actions/setup-python@v4
      with:
        python-version: '3.11'
    - uses: actions/setup-node@v3
      with:
        node-version: '16.15.1'
    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v2
      with:
        languages: ${{ matrix.language }}
        # config-file: ./.github/codeql/codeql-config.yml
        setup-python-dependencies: false
        # If you wish to specify custom queries, you can do so here or in a config file.
        # By default, queries listed here will override any specified in a config file.
        # Prefix the list here with "+" to use these queries and those in the config file.
        # Details on CodeQL's query packs refer to : https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
        # queries: security-extended,security-and-quality
    # Autobuild attempts to build any compiled languages (C/C++, C#, Go, or Java).
    # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
      uses: github/codeql-action/autobuild@v2
    # ℹ️ Command-line programs to run using the OS shell.
    # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
    # If the Autobuild fails above, remove it and uncomment the following three lines.
    # modify them (or add more) to build your code if your project, please refer to the EXAMPLE below for guidance.
    # - run: |
    #     echo "Run, Build Application using script"
    #     ./location_of_script_within_repo/buildscript.sh
    - run: |
        echo "$(which python)"
        echo "$(python -V)"
        echo $LGTM_INDEX_FILTERS
        find . -name '*.py'
    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v2
      with:
        category: "/language:${{matrix.language}}"
Without providing the config file, it is able to find all the files. But that's the next problem.
The "Analyze (javascript)" job is also looking at CodeQL's js files
[2023-12-18 10:44:27] [build-stdout] Extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/lib/jquery-3.2.js
[2023-12-18 10:44:27] [build-stdout] Extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/lib/bdd.js
[2023-12-18 10:44:27] [build-stdout] Extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/lib/vows.js
[2023-12-18 10:44:27] [build-stdout] Extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/lib/should.js
[2023-12-18 10:44:27] [build-stdout] Done extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/lib/bdd.js (11 ms)
[2023-12-18 10:44:27] [build-stdout] Extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/web/w3c_dom4.js
[2023-12-18 10:44:27] [build-stdout] Done extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/web/w3c_dom4.js (7 ms)
[2023-12-18 10:44:27] [build-stdout] Extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/web/ie_event.js
[2023-12-18 10:44:27] [build-stdout] Done extracting /scratch/actions-runner-1/_work/_tool/CodeQL/2.15.4/x64/codeql/javascript/tools/data/externs/lib/should.js (22 ms)
and the "Analyze (python)" job is also scanning the .py files from the base interpreter
[2023-12-18 10:24:25] [build-stdout] [INFO] [4] Extracted folder /scratch/ghe-runners/1/_work/_tool/Python/3.11.2/x64/lib/python3.11/urllib in 0ms
[2023-12-18 10:24:25] [build-stdout] [INFO] [1] Extracted file /scratch/ghe-runners/1/_work/_tool/Python/3.11.2/x64/lib/python3.11/datetime.py in 1546ms
[2023-12-18 10:24:25] [build-stdout] [INFO] [1] Extracted file /scratch/ghe-runners/1/_work/_tool/Python/3.11.2/x64/lib/python3.11/copy.py in 191ms
[2023-12-18 10:24:25] [build-stdout] [INFO] [1] Extracted file /scratch/ghe-runners/1/_work/_tool/Python/3.11.2/x64/lib/python3.11/json/__init__.py in 92ms
[2023-12-18 10:24:25] [build-stdout] [INFO] [1] Extracted file /scratch/ghe-runners/1/_work/_tool/Python/3.11.2/x64/lib/python3.11/unittest/signals.py in 40ms
[2023-12-18 10:24:25] [build-stdout] [INFO] [2] Extracted file /scratch/ghe-runners/1/_work/_tool/Python/3.11.2/x64/lib/python3.11/zipfile.py in 1537ms
Could you print the value of the LGTM_INDEX_FILTERS environment variable in the workflow
Not sure if I am doing something wrong, but the value of LGTM_INDEX_FILTERS is ""
also run a find . -name '*.py' command
Here's the output of find . -name '*.py' -- it does list all the python files from the repo
./cli/cli/__init__.py
./cli/cli/_internal/__init__.py
./cli/cli/_internal/_open_target_wiz.py
./cli/cli/_internal/_server_management.py
./cli/cli/_internal/event_handlers.py
./cli/cli/_internal/task_tracker.py
./cli/cli/_internal/tasks.py
./cli/cli/cable.py
./cli/cli/commands.py
./cli/cli/common.py
.
.
.
.
.
.
.
./engine/.vscode/pydevd/pydevd_plugins/__init__.py
./engine/.vscode/pydevd/pydevd_plugins/extensions/__init__.py
./engine/.vscode/pydevd/pydevd_plugins/extensions/pydevd_plugin_chipscopy.py
./engine/engine/__init__.py
./engine/engine/exceptions.py
./engine/engine/interceptor.py
./engine/engine/main.py
./engine/engine/protobuf/__init__.py
./engine/engine/protobuf/generate.py
./engine/engine/runtime/__init__.py
.
.
.
Finally, there is no need to run Python or Typescript analysis in a docker container, so you could also try removing the container: property from the workflow. And if you really want all workflows in the self-hosted runner to run in docker containers you could try using https://github.com/actions/actions-runner-controller or a similar approach.
May I ask why the analysis shouldn't be run in a container? I was coming at it from the angle of 'pristine env of a container won't cause any env unintended cross-contamination issues' The ARC seems like a neat thing, but I can't implement that myself -- our DevOps teams would have to implement it, which I don't think will happen right away.
Without providing the config file, it is able to find all the files. But that's the next problem. The "Analyze (javascript)" job is also looking at CodeQL's js files and the "Analyze (python)" job also scanning the .py files from the base interpreter
That is actually not a problem, but expected behaviour. Those JavaScript files from CodeQL contain stubs for JS functions that are available by default in (browser) environments. The Python analysis also scans the standard libraries to figure out dataflow through standard functions. CodeQL can stop doing that, once the team has implemented QL models for the standard libraries, but for now it is still needed.
Not sure if I am doing something wrong, but the value of LGTM_INDEX_FILTERS is ""
Ah yes sorry, I should have made clear that I was interested in the value of LGTM_INDEX_FILTERS before removing the configuration file. Internally, CodeQL interprets the paths:/paths-ignore: settings and puts them into that environment variable. I was hoping to see if there is a mismatch between the paths mentioned in LGTM_INDEX_FILTERS and the paths reported by find.
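The comparison described above could be scripted roughly as follows. This is only a sketch: it assumes LGTM_INDEX_FILTERS holds one include:<pattern> or exclude:<pattern> entry per line, which this thread does not confirm, and the function names are invented for illustration. It checks only the literal (non-glob) leading part of each include pattern against the files found, which is enough to spot a host path leaking into a container run.

```python
import os


def parse_filters(raw):
    """Parse an assumed 'include:<pattern>' / 'exclude:<pattern>' list, one entry per line."""
    filters = []
    for line in raw.splitlines():
        kind, sep, pattern = line.strip().partition(":")
        if sep:  # skip blank or malformed lines
            filters.append((kind, pattern))
    return filters


def literal_prefix(pattern):
    """Return the leading path components of a pattern that contain no glob characters."""
    parts = []
    for part in pattern.split("/"):
        if any(ch in part for ch in "*?["):
            break
        parts.append(part)
    return "/".join(parts).rstrip("/")


def unmatched_includes(filters, files):
    """Return include patterns whose literal prefix matches none of the given files."""
    normalized = [f[2:] if f.startswith("./") else f for f in files]
    missing = []
    for kind, pattern in filters:
        if kind != "include":
            continue
        prefix = literal_prefix(pattern)
        if not prefix:
            continue  # pattern starts with a glob; nothing literal to compare
        if not any(f == prefix or f.startswith(prefix + "/") for f in normalized):
            missing.append(pattern)
    return missing


if __name__ == "__main__":
    # Equivalent of `find . -name '*.py'`, run from the repository root.
    found = [
        os.path.join(root, name)
        for root, _, names in os.walk(".")
        for name in names
        if name.endswith(".py")
    ]
    raw = os.environ.get("LGTM_INDEX_FILTERS", "")
    print(unmatched_includes(parse_filters(raw), found))
```

An empty result means every include pattern has at least one file under its literal prefix; any pattern printed is a candidate for a host/container path mismatch.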
May I ask why the analysis shouldn't be run in a container? I was coming at it from the angle of 'pristine env of a container won't cause any env unintended cross-contamination issues'
I did not mean to say that you shouldn't run in a container. Having a "pristine" env can be quite beneficial. It's just that JS and Python analysis don't change the environment and should work fine in a not so "pristine" environment. The reason why I asked to run outside a container was to reduce complexity. When running in a container there is always the risk of mixing up file paths from the host or the container. Things should have worked fine with a configuration file and a container, but they didn't ;-) To debug it helps to disable those features to see which one causes the problem, or whether it is the combination of both.
That is actually not a problem, but expected behaviour. Those JavaScript files from CodeQL contain stubs for JS functions that are available by default in (browser) environments. The Python analysis also scans the standard libraries to figure out dataflow through standard functions. CodeQL can stop doing that, once the team has implemented QL models for the standard libraries, but for now it is still needed.
Oh ok got it. So, seems it's safe to ignore these.
Ah yes sorry, I should have made clear that I was interested in the value of LGTM_INDEX_FILTERS before removing the configuration file. Internally, CodeQL interprets the paths:/paths-ignore: settings and puts them into that environment variable. I was hoping to see if there is a mismatch between the paths mentioned in LGTM_INDEX_FILTERS and the paths reported by find.
Hmm, I used the config file (shown below) and re-ran the workflow, but LGTM_INDEX_FILTERS is still empty. Any other things you would like me to try?
paths:
- 'engine/**/*.py'
- 'cli/**/*.py'
- 'ui/**/*.ts'
I did not mean to say that you shouldn't run in a container. Having a "pristine" env can be quite beneficial. It's just that JS and Python analysis don't change the environment and should work fine in a not so "pristine" environment. The reason why I asked to run outside a container was to reduce complexity. When running in a container there is always the risk of mixing up file paths from the host or the container. Things should have worked fine with a configuration file and a container, but they didn't ;-) To debug it helps to disable those features to see which one causes the problem, or whether it is the combination of both.
Makes sense - Eliminate possible sources of problem
I could reproduce, and narrowed down the problem to cases where paths contains globs 😬 As a workaround, can you please use this config file?
paths:
- 'engine/'
- 'cli/'
- 'ui/'
(If analyzing .py files inside ui is a problem, you can use the paths-ignore property to exclude those with something like ui/**/*.py)
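To make the intended effect of the workaround concrete, here is a rough Python sketch of which files it should select. It only approximates the config with prefix and suffix checks and is not CodeQL's own path matcher; the example paths under ui/ and docs/ are hypothetical.

```python
# Directories listed under `paths:` in the workaround config.
INCLUDE_DIRS = ("engine/", "cli/", "ui/")


def is_analyzed(path):
    """Approximate the workaround: keep files under the listed directories,
    but drop .py files under ui/ (the suggested paths-ignore: ui/**/*.py)."""
    path = path[2:] if path.startswith("./") else path
    if not path.startswith(INCLUDE_DIRS):
        return False
    return not (path.startswith("ui/") and path.endswith(".py"))


if __name__ == "__main__":
    for candidate in ("./engine/engine/main.py", "ui/src/app.ts", "ui/tools/gen.py", "docs/conf.py"):
        print(candidate, is_analyzed(candidate))
```

Listing whole directories gives up the per-extension filtering of the original globs, which is why the paths-ignore entry is needed if .py files under ui should stay out of the analysis.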
I could reproduce, and narrowed down the problem to cases where paths contains globs 😬 As a workaround, can you please use this config file?
paths:
- 'engine/'
- 'cli/'
- 'ui/'
Sorry, I thought I replied back. The suggested config file does work @RasmusWL. Thanks!
👋 Team,
I'm also facing a similar issue while running CodeQL version 2.15.3. Note that I'm only testing a simple Python CodeQL query. For other languages (Java/JavaScript), the same command works fine.
Command
/home/bakul/codeql/codeql test run UrlRedirect.ql --show-extractor-output
Output
Executing 1 tests in 1 directories.
Extracting test database in /home/bakul/codeql/python/ql/src/experimental/security/url-redirect.
[2024-01-16 00:29:46] [build-err] Process ForkProcess-1:
[2024-01-16 00:29:46] [build-err] Traceback (most recent call last):
[2024-01-16 00:29:46] [build-err] File "/usr/lib64/python3.6/multiprocessing/process.py", line 258, in _bootstrap
[2024-01-16 00:29:46] [build-err] self.run()
[2024-01-16 00:29:46] [build-err] File "/usr/lib64/python3.6/multiprocessing/process.py", line 93, in run
[2024-01-16 00:29:46] [build-err] self._target(self._args, self._kwargs)
[2024-01-16 00:29:46] [build-err] File "/home/bakul/codeql/python/tools/python3src.zip/semmle/logging.py", line 116, in _message_loop
[2024-01-16 00:29:46] [build-err] sys.stdout.reconfigure(encoding='utf-8')
[2024-01-16 00:29:46] [build-err] AttributeError: '_io.TextIOWrapper' object has no attribute 'reconfigure'
[2024-01-16 00:29:46] [build-err] Traceback (most recent call last):
[2024-01-16 00:29:46] [build-err] File "/home/bakul/codeql/python/tools/python_tracer.py", line 53, in
[2024-01-16 00:29:46] [build-err] semmle.populator.main(original_path)
[2024-01-16 00:29:46] [build-err] File "/home/bakul/codeql/python/tools/python3src.zip/semmle/populator.py", line 43, in main
[2024-01-16 00:29:46] [build-err] AttributeError: '_io.TextIOWrapper' object has no attribute 'reconfigure'
[2024-01-16 00:29:46] [ERROR] Spawned process exited abnormally (code 1; tried to run: [python3, /home/bakul/codeql/python/tools/python_tracer.py, --lang=3, --filter=exclude: /.testproj/**, --path, /home/bakul/codeql/python/ql/src/experimental/security/url-redirect, --verbosity, 3, --colorize])
Could not extract a dataset in /home/bakul/codeql/python/ql/src/experimental/security/url-redirect: Extraction command python3 failed with status 1.
Extraction command python3 failed with status 1.
[1/1] FAILED(EXTRACTION) /home/bakul/codeql/python/ql/src/experimental/security/url-redirect/UrlRedirect.ql
Compiling queries in /home/bakul/codeql/python/ql/src/experimental/security/url-redirect.
Completed in 4.5s (extract 1.2s comp 0ms eval 0ms).
0 tests passed; 1 tests failed:
FAILED: /home/bakul/codeql/python/ql/src/experimental/security/url-redirect/UrlRedirect.ql
Hi @BullHacks3, as already mentioned in this issue:
A web search for AttributeError: '_io.TextIOWrapper' object has no attribute 'reconfigure' suggests this problem can be solved by using Python version 3.7 or higher. It looks like your self-hosted runner has python 3.6 which is pretty old. Could you try upgrading the python version?
Hey @RasmusWL, I'm already using Python 3.11. Thanks
@BullHacks3 please open a new issue then.
Thanks @RasmusWL, I created a new issue for this: https://github.com/github/codeql/issues/15337
Hi there, I am trying to set up CodeQL analysis on a repo in our GitHub Enterprise Server. This is a monorepo with TypeScript and Python code. The directory structure is like so
While analyzing the Python code, the CodeQL action fails with below error. Could you please help me figure out what I might be doing wrong?
codeql.yml
codeql-config.yml