protocolbuffers / protobuf

Protocol Buffers - Google's data interchange format
http://protobuf.dev

Kotlin compiler crashes out of memory for large protobuf #8732

Closed zakhenry closed 2 years ago

zakhenry commented 3 years ago

I work with a decent-sized protobuf definition (68 files, 650 messages, 1875 fields) and currently use Kotlin gRPC with the Java protobuf API. I was excited to switch to the shiny new DSL-based Kotlin generated code, but alas, the compiler crashes out of memory. I've included a minimal reproduction in a separate repo below.


What version of protobuf and what language are you using?

Version: 3.17.3
Language: Kotlin

What operating system (Linux, Windows, ...) and version?

macOS 11.4 (Intel)

What runtime / compiler are you using (e.g., python version or gcc version)?

Kotlin 1.5.10

What did you do?

Reproduction also outlined in https://github.com/zakhenry/kotlin-protobuf-oom-reproduction

I created a protobuf file with 300 messages, each with 10 string fields, and tried to compile a simple Kotlin main function that uses one of them.

What did you expect to see?

It should compile.

What did you see instead?

The compilation step crashed out of memory:

~/repos/kotlin-protobuf-oom-reproduction
➜ ./gradlew build
> Task :compileKotlin FAILED
e: java.lang.OutOfMemoryError: Java heap space

FAILURE: Build failed with an exception.

* What went wrong:
  Execution failed for task ':compileKotlin'.
> Internal compiler error. See log for more details

* Try:
  Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 9m 54s
4 actionable tasks: 4 executed

Anything else we should know about your project / environment?

Notably, when I use only the Java-compiled protos, the compilation completes in 13 seconds on my machine. See branch https://github.com/zakhenry/kotlin-protobuf-oom-reproduction/tree/java-api-only to try it yourself.

perezd commented 3 years ago

This seems like potentially an issue with Kotlin itself, given where it's crashing, and the evidence that the Java API compiles as you expect. The Kotlin bindings are a wrapper around the Java API, so this feels likely related to Kotlin's compiler itself, and less about Protocol Buffers.

zakhenry commented 3 years ago

Yeah, I did wonder that, though it could be that the protobuf generated code uses a particularly inefficient structure. I'll post a ticket in the Kotlin issue tracker too to see what they say.

perezd commented 3 years ago

I am not super aware of how the system deals with cross-language bindings (Java/Kotlin) so perhaps this is triggering an unfortunate case in the compiler itself?

zakhenry commented 3 years ago

Yep, that sounds plausible. Unfortunately I'm pretty new to JVM languages, so I don't really know where to start digging into it myself.

I've raised https://youtrack.jetbrains.com/issue/KT-47270 to correspond with this issue.

Hopefully one of these two projects can identify where the fix needs to happen, because it would be a huge shame to miss out on the amazing new DSL that would clean up our protobuf bindings massively.

lowasser commented 3 years ago

I also find myself wondering if this could be addressed by a more incremental build: splitting the protos up between more files or more build units.

perezd commented 3 years ago

I wonder if there are JVM flags that could be set to make this perform better?

fowles commented 3 years ago

https://dev.to/martinhaeusler/is-your-kotlin-compiler-slow-here-s-a-potential-fix-4if4 looks like it has something that is worth trying

kolmant commented 3 years ago

Hi, we're facing an issue with the same combination, Kotlin + Protobuf, though not with large .proto files but with a lot of them. On my computer, setting -Xmx for kotlinc worked, but it didn't when I tried to do the same in a Docker Gradle container.

perezd commented 3 years ago

According to Kotlin, this has been repro'd on their end: https://youtrack.jetbrains.com/issue/KT-47270

Any additional solutions are blocked on whatever they end up doing.

deannagarcia commented 2 years ago

We are looking into this issue more, but for now the best workaround is to configure a higher memory limit.
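
As a rough sketch of what "a higher memory limit" could look like in build.gradle: the snippet below assumes a Kotlin Gradle plugin version recent enough to expose `kotlinDaemonJvmArgs` (the 1.5.10 from the report may predate it), and the 4 GB figure is an arbitrary illustration, not a value tested against this reproduction.

```groovy
// build.gradle (Groovy DSL)
// Assumption: the Kotlin Gradle plugin in use supports `kotlinDaemonJvmArgs`.
// The 4g heap size is illustrative only; tune it for your machine and project.
kotlin {
    kotlinDaemonJvmArgs = ['-Xmx4g']
}
```

If that property is not available, the Kotlin daemon normally inherits the Gradle daemon's JVM arguments when none are set explicitly, so raising `org.gradle.jvmargs` in gradle.properties may be another option worth trying.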

dzharkov commented 2 years ago

The root of the issue is that the Kotlin compiler's representation of Java source files is quite inefficient, so reading the huge generated 18 MB HelloWorld.java leads to a high memory footprint.

[image]

Potentially, more than 68% of the heap is allocated for pieces of that file. To be honest, I don't think it's possible to replace the representation soon, so as a workaround I'd suggest compiling HelloWorld.java before running kotlinc and then supplying it as a class file, if that's possible. Unfortunately, I'm not sure such a workaround is possible via the gradle/protobuf infrastructure.
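
One way to approximate "compile the Java first, then hand kotlinc class files" in Gradle might be to move the generated Java protos into their own module, so that the Kotlin module only sees compiled output. This is an untested sketch: the module names `protos-java` and `app` are hypothetical, and it assumes the protobuf plugin can be configured to generate the Java sources in one module and the Kotlin DSL sources in the other, which is not shown here.

```groovy
// settings.gradle
// Hypothetical module names, not taken from the reproduction repo.
include 'protos-java', 'app'

// app/build.gradle
// The Kotlin module depends on the already-compiled Java protos,
// so kotlinc reads class files instead of parsing HelloWorld.java.
dependencies {
    implementation project(':protos-java')
}
```

Whether this actually avoids the memory spike would need to be checked against the reproduction repo.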

dzharkov commented 2 years ago

BTW, there is another workaround, namely putting this to build.gradle:

compileKotlin {
    kotlinOptions.freeCompilerArgs += ['-Xuse-javac']
}

But this compiler flag (enabling a different internal representation for Java) is still very experimental and unfortunately we can't give any guarantees it would work correctly in every environment.

deannagarcia commented 2 years ago

Closing this since it's being tracked with jetbrains.