Closed mlessio closed 1 year ago
@simkoc , @mal-tee : can you help Martino on this? maybe the memory can be easily extended for the java process?
You can adjust the JVM memory settings in the "php2cpg" bash script at /php-cpg/php2cpg:
root@ce400482bd53:/php-cpg# cat php2cpg
#!/bin/bash
SCRIPT_ABS_PATH=$(readlink -f "$0")
SCRIPT_ABS_DIR=$(dirname $SCRIPT_ABS_PATH)
JAVA_OPTS='-Xmx20g -Xss30m -XX:+UnlockDiagnosticVMOptions -XX:+ExitOnOutOfMemoryError -XX:AbortVMOnException=java.lang.StackOverflowError' $SCRIPT_ABS_DIR/target/universal/stage/bin/multilayer-php-cpg-generator -- $@
What about a recommended setting? It is already assigning 20G of RAM and a 30M stack size, which is quite a big stack. Is there anything else I can do for optimization?
I was wondering whether it was necessary to raise memory for Joern as well, but from what I can see from the error, it originates only from php2cpg.
@mal-tee: any other idea? Did you experience similar situations on real applications?
@mlessio: can you share some meta info on the PHP app? Total LoC, number of PHP files, ...
@compaluca: sure, here follows the output of the CLOC tool, reporting the LoC number and some other useful data.
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
PHP                            309           7345          13294          82814
JavaScript                     400           9578           7768          48182
CSS                             41            579            403          12401
LESS                            70           1160           1521           5472
JSON                             9              0              0           4581
HTML                            22            350            121           2513
Markdown                        15            673              0           2473
SVG                              1              0              0            288
Maven                            1              1              0            199
Python                           1             40              4            140
YAML                             2             17             20             77
XML                              5              0              0             68
Bourne Shell                     1              0              1             14
-------------------------------------------------------------------------------
SUM:                           877          19743          23132         159222
-------------------------------------------------------------------------------
As you can see, it is a mid-sized PHP application. The CPG generation was executed on a laptop equipped with 16GB of RAM.
@mal-tee: We have also tried a real application, WordPress (https://github.com/WordPress/WordPress), and obtained the same OOM error, so you can consider it a candidate application to reproduce the issue.
Yes, we encounter that on a regular basis. Static analysis is resource hungry. We encounter OOM for large apps even with more RAM (highest I tried for a single app was 128 GB).
One possible workaround is to disable the "Dominator" and "PostDominator" passes in /php-cpg/main.conf by removing the respective lines, provided the dominator edges aren't used in the test cases. These passes are especially resource intensive.
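A minimal sketch of that workaround, assuming each pass is listed on its own line in main.conf and that dropping a line leaves the file syntactically valid (the file's actual format is not shown in this thread, so check the result by hand). The helper name strip_dominator_lines is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print a config file with every line that mentions
# a dominator pass removed. Matching the substring "Dominator" also
# catches "PostDominator".
strip_dominator_lines() {
  grep -v 'Dominator' "$1"
}

conf=/php-cpg/main.conf   # path taken from the comment above
if [ -r "$conf" ]; then
  cp "$conf" "$conf.bak"                        # keep a backup to restore the passes later
  strip_dominator_lines "$conf.bak" > "$conf"   # rewrite the config without the two passes
fi
```

Restoring main.conf.bak re-enables both passes.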
Hi @mal-tee, we have tried disabling both the Dominator and PostDominator passes, but unfortunately we get the same OOM error. Any other idea?
Unfortunately not, no. It seems it's not possible to convert these apps with the current version.
I don't quite remember whether we have discovery queries that span multiple files, but if we don't, we could generate a CPG at file level rather than for the whole project. This will be prone to issues, of course.
@mal-tee @compaluca update:
It seems that lowering the RAM assigned to the php2cpg JVM (e.g. -Xmx8g) completely fixes this issue, even with the Dominator and PostDominator passes re-enabled.
I'll keep you updated!
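One possible explanation, not confirmed in this thread, is that -Xmx20g exceeds the 16GB of physical RAM on the laptop, so the heap limit was never something the machine could actually provide. A minimal sketch for deriving a heap flag from physical memory follows. The helper name suggest_xmx and the 50% default are assumptions, not a recommendation from the php-cpg authors:

```shell
#!/bin/bash
# Hypothetical helper: suggest a -Xmx flag as a percentage of physical RAM.
#   $1: total memory in kB (the unit used by /proc/meminfo on Linux)
#   $2: percentage of RAM to give to the heap (default 50, an assumption)
suggest_xmx() {
  local total_kb=$1
  local percent=${2:-50}
  echo "-Xmx$(( total_kb * percent / 100 / 1024 ))m"
}

# Read physical memory on Linux; /proc/meminfo is not available on macOS.
if [ -r /proc/meminfo ]; then
  total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  echo "Suggested heap flag: $(suggest_xmx "$total_kb")"
fi
```

The printed value can then replace -Xmx20g in the JAVA_OPTS line of /php-cpg/php2cpg. For exactly 16 GiB of RAM the 50% default gives -Xmx8192m, which matches the -Xmx8g setting reported to work above.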
While trying to execute the discovery phase on a real-world PHP application, CPG generation fails with an error message that seems related to memory consumption on the JVM.
Here follows the discovery execution output: