Open nirmal070125 opened 1 month ago
Can we get an update on this? This issue is a blocker for healthcare use cases when using Bal 9.
@sameeragunarathne we are discussing the resolution for this. Will provide an update as soon as possible.
The main root cause of this issue is that when compiling multiple modules, we keep the `blangpackage` related to the compiled package within memory. As the number of modules increases, this causes an out-of-memory (OOM) error. We retain the `blangpackage` of the compiled module because it is needed for the test command. As a solution for now, we are going to remove some unnecessary closures that are generated for module-level annotations to reduce the size of `blangpackage`.
To add to the findings of @chiranSachintha, we clean the `bLangPackages` of the dependencies after the code generation. We cannot extend this to clean the modules of the user's package since in some commands like `bal doc` and `bal test`, we access the `bLangPackage` after the code generation phase. We could do extensive refactoring to improve this to be able to clean the `bLangPackage` instances for the user's package also, but IMO, we could only put off the OOM with it. The issue will surface again when the package keeps growing.
If the fix from @chiranSachintha resolves it for now, we can go ahead and do a patch release. At the same time, we would need to work on a proper optimization.
When compiling with a clean central cache, we are experiencing an OOM issue with the Ballerina compiler. This issue doesn’t occur during subsequent compilations. The root cause is that, in the initial compilation, all direct and indirect dependencies of the current package are compiled, creating BIR and Jar files for each dependency package in the cache.
Currently, the compiler driver compiles all these dependencies within the same process, which likely leads to unnecessary memory usage.
From the second compilation onwards, the compiler driver reads the BIR of package dependencies instead of recompiling them.
To address this, what if we make the first compilation behave like the subsequent ones, where it reads the BIR of package dependencies instead of compiling them from scratch? We can achieve this by creating a new OS process to compile each package dependency. Once the process finishes, the compiler driver can read the BIR of that package. This approach should technically resolve the issue.
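A minimal sketch of the process-per-dependency idea above, written in Python for brevity rather than in the compiler's own Java. Everything here is hypothetical: `artifact_path`, `compile_in_child`, `build`, and the `.bir` file layout are illustrative names, and the child script is only a stand-in for a real compilation. The point it shows is that the parent never holds the compiler state of a dependency; it only reads the artifact the child left in the cache.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical stand-in for "compile one dependency": the child process writes
# a fake BIR artifact and exits, so all memory it used is returned to the OS.
CHILD_SCRIPT = """
import sys
from pathlib import Path
package, out_file = sys.argv[1], sys.argv[2]
Path(out_file).write_text("bir-of:" + package)
"""


def artifact_path(package: str, cache_dir: Path) -> Path:
    """Map a package name to its (hypothetical) BIR file in the cache."""
    safe_name = package.replace("/", "_").replace(":", "_")
    return cache_dir / (safe_name + ".bir")


def compile_in_child(package: str, cache_dir: Path) -> Path:
    """Compile one dependency in its own OS process and return its artifact."""
    out_file = artifact_path(package, cache_dir)
    subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, package, str(out_file)],
        check=True,  # raise if the child "compilation" fails
    )
    return out_file


def build(dependencies: list[str], cache_dir: Path) -> dict[str, str]:
    """Compile missing dependencies one at a time, then read back the BIR."""
    loaded = {}
    for package in dependencies:
        artifact = artifact_path(package, cache_dir)
        if not artifact.exists():
            # Clean cache: compile out of process instead of in the driver.
            artifact = compile_in_child(package, cache_dir)
        # Same path as a subsequent compilation: just read the cached BIR.
        loaded[package] = artifact.read_text()
    return loaded
```

Calling `build` twice with the same cache directory spawns child processes only on the first call; the second call takes the read-only path, which mirrors the first-versus-subsequent compilation behaviour described above.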
We improved the memory consumption with https://github.com/ballerina-platform/ballerina-lang/pull/43009. Now the `ballerinax/health.hl7v2commons` package compiles and generates the executable successfully. The first compilation takes about 6 minutes (all healthcare dependencies are compiled from the sources). The subsequent compilations take ~7 seconds.
However, some of the dependencies consume close to 1 GB of memory even when they are compiled in a separate process.
Below is a summary of the memory consumption after the fix. If the packages keep growing, then we might experience OOM again soon.
Memory consumed by the main process ~= 250 MB
Healthcare dependency | Memory consumption approx. (x) | Total memory consumed by bal build (x + 250 MB) |
---|---|---|
ballerinax/health.hl7v2:2.2.1 | 73 MB | 323 MB |
ballerinax/health.hl7v231:3.0.1 | 550 MB | 800 MB |
ballerinax/health.hl7v23:3.0.2 | 500 MB | 750 MB |
ballerinax/health.hl7v24:3.0.1 | 550 MB | 800 MB |
ballerinax/health.hl7v251:3.0.1 | 750 MB | 1000 MB |
ballerinax/health.hl7v25:3.0.1 | 800 MB | 1050 MB |
ballerinax/health.hl7v26:3.0.1 | 850 MB | 1100 MB |
ballerinax/health.hl7v27:3.0.1 | 990 MB | 1240 MB |
ballerinax/health.hl7v28:3.0.1 | 980 MB | 1230 MB |
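The totals in the table can be reproduced with a few lines of Python. This is only a sketch over the figures reported above; it assumes the dependencies are compiled one after another, so the peak footprint would be set by the largest single dependency plus the main process rather than by the sum of all rows.

```python
# Approximate figures copied from the table above (in MB).
MAIN_PROCESS_MB = 250
DEPENDENCY_MB = {
    "ballerinax/health.hl7v2:2.2.1": 73,
    "ballerinax/health.hl7v231:3.0.1": 550,
    "ballerinax/health.hl7v23:3.0.2": 500,
    "ballerinax/health.hl7v24:3.0.1": 550,
    "ballerinax/health.hl7v251:3.0.1": 750,
    "ballerinax/health.hl7v25:3.0.1": 800,
    "ballerinax/health.hl7v26:3.0.1": 850,
    "ballerinax/health.hl7v27:3.0.1": 990,
    "ballerinax/health.hl7v28:3.0.1": 980,
}

# Total while a given dependency is being compiled: x + main process.
totals = {pkg: mb + MAIN_PROCESS_MB for pkg, mb in DEPENDENCY_MB.items()}

# Assumption: one child process at a time, so the peak is the largest total.
peak_package = max(totals, key=totals.get)
peak_mb = totals[peak_package]

for pkg, total in totals.items():
    print(f"{pkg}: {total} MB")
print(f"peak: {peak_package} at {peak_mb} MB")
```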
My recommendation is to expose this feature with a compiler option, as discussed. Let's mark it as experimental initially.
--optimize-dependency-compilation
[EXPERIMENTAL] Enables memory-efficient compilation of package dependencies using separate processes. This can help prevent out-of-memory issues during initial compilation with a clean central cache.
@nirmal070125 @sameeragunarathne This improvement will be released with 2201.9.2, which is estimated to be released this week. As discussed in https://github.com/ballerina-platform/ballerina-lang/issues/42860#issuecomment-2190795584, this optimization in the compilation is not enabled by default. This can be enabled by passing the `--optimize-dependency-compilation` flag to the build or adding the build option in the `Ballerina.toml` file as shown below.
[build-options]
optimizeDependencyCompilation = true
@sameeragunarathne and I tested this by publishing the BallerinaX modules to the local cache and then trying to build the `commons` module using the locally published modules. However, I am encountering an error: `failed to compile ballerinax/health.hl7v23:3.0.4`. When I try to access the BallerinaX modules through the central repository, it works fine. @azinneera
Description
https://github.com/ballerina-platform/module-ballerinax-health.hl7v2/tree/main/commons - update the Ballerina version and remove those explicit dependencies from the `Ballerina.toml`.
`bal build` gives the following error;
Eclipse memory analyser shows 2 problems;
1)
2)
947,983 instances of org.wso2.ballerinalang.compiler.diagnostic.BLangDiagnosticLocation, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0x7c07390d0 occupy 113,757,984 (10.59%) bytes.
Steps to Reproduce
No response
Affected Version(s)
2201.9.0
OS, DB, other environment details and versions
No response
Related area
-> Compilation
Related issue(s) (optional)
No response
Suggested label(s) (optional)
No response
Suggested assignee(s) (optional)
No response