apache / incubator-heron

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
https://heron.apache.org/
Apache License 2.0
3.64k stars 597 forks source link

Uploader implementations may leak storage when resubmitting the same analytic multiple times #3769

Closed nicknezis closed 2 years ago

nicknezis commented 2 years ago

Describe the bug The Uploader logic in Heron adds a Random Long to the filename that causes memory leak in Bookkeeper. The Uploader filename generation logic is here.

And the Dlog config that disables time and size based retention is here.

Eventually the Bookkeeper log runs out of space and no more topologies can be submitted.

The proposed fix is to remove the Random.long that is being added to the filename.

surahman commented 2 years ago

Would using RandomStringUtils#random with a character count of 11 fix the issue? Or is this a string length related issue?

If my math is correct it should generate 26 uppercase + 26 lowercase + 10 digits permute 11 count = 62 p 11 = 390,164,706,723,052,800 potential strings. That is more options than Random#long provides.

nicknezis commented 2 years ago

So my suggestion is to remove the Random.long all together. I'm not sure why this was added, but we don't have any ability to "roll-back" to the previous uploads. So having an endless series of entries in Bookkeeper doesn't make sense. If I resubmit the same topology with name, role, version coordinates, I would expect it to replace the previous entry in Bookkeeper (or whatever upload mechanism is used). For example when developing and testing an analytic job.

If the desire is to run different binaries, one should distinguish the name, role and/or version to create a unique coordinate in the Upload/Download mechanism.

We will have to test the Upload functionality to ensure that removing the Random.long doesn't break anything.

windhamwong commented 2 years ago

Is that possible to, either defined by user, or set it with timestamp instead?