oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

History is hogging web app memory #3541

Open vladak opened 3 years ago

vladak commented 3 years ago

While thinking about #3243, I realized that the web app suffers from a similar problem: whenever history is requested, the History object representing the complete history of the given file/directory is loaded into memory: https://github.com/oracle/opengrok/blob/610d9085dcb52fc16b3f50d16e64cc03a62b7ea1/opengrok-web/src/main/webapp/history.jsp#L81-L84 What's more, it is then stored in the request: https://github.com/oracle/opengrok/blob/610d9085dcb52fc16b3f50d16e64cc03a62b7ea1/opengrok-web/src/main/webapp/history.jsp#L99 and used for paging.

Even for smaller repositories such as the OpenGrok Git repository, this creates significant memory pressure. Here's a graph of heap stats from a web app where I displayed the history of a couple of directories (including the top-level directory) of the indexed OpenGrok Git repository: (graph: webapp-history)

The Eden space grew by a bunch of gigabytes!

vladak commented 3 years ago

The way the History object is stored in FileHistoryCache makes it hard to do anything about this. Firstly, it is compressed; secondly, it is an XML-serialized Java object (#3539). So it has to be read whole into memory and then dissected.
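To make the constraint concrete, here is a minimal sketch of a cache entry that is both compressed and XML-serialized. It assumes gzip compression and the JDK's `java.beans.XMLEncoder`/`XMLDecoder`; the class `WholeObjectCache` and the use of a plain list of revision strings in place of the real History object are hypothetical simplifications, not OpenGrok's actual code.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for a history cache entry: one gzip-compressed,
// XML-serialized object per file (assumed layout, for illustration only).
final class WholeObjectCache {

    /** Serializes the whole list as a single XML object and compresses it. */
    static byte[] store(ArrayList<String> entries) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(new GZIPOutputStream(bytes))) {
            encoder.writeObject(entries);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /**
     * Reads the entry back. readObject() returns only after the complete
     * list has been rebuilt, so the caller cannot see the first entry
     * without paying for all of them.
     */
    @SuppressWarnings("unchecked")
    static List<String> load(byte[] cached) {
        try (XMLDecoder decoder = new XMLDecoder(
                new GZIPInputStream(new ByteArrayInputStream(cached)))) {
            return (List<String>) decoder.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the list is one top-level serialized object, a page of 25 entries costs as much memory as the full history in this sketch.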

vladak commented 1 year ago

Related to #3539 and #4023

vladak commented 1 year ago

Once a scheme to iterate over history (for both the repository method and the history cache) without reading all of it into memory is in place, the HistoryReader used during indexing should be converted to this scheme as well.
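One possible shape for such a scheme, as a rough sketch only: history is exposed as an `Iterator`, and a page is built by skipping to the start offset and keeping a bounded number of entries. The class `LazyHistory`, its methods, and the synthetic revision strings are all hypothetical; a real source would read from the repository or from a cache format that supports incremental reads.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of iterator-based history access; not OpenGrok's API.
final class LazyHistory {

    /** Synthetic lazy source: yields "r<total>" down to "r1", one at a time. */
    static Iterator<String> revisions(int total) {
        return new Iterator<String>() {
            private int remaining = total;

            @Override
            public boolean hasNext() {
                return remaining > 0;
            }

            @Override
            public String next() {
                if (remaining <= 0) {
                    throw new NoSuchElementException();
                }
                return "r" + remaining--;
            }
        };
    }

    /**
     * Builds one page: skips the first `start` entries, then keeps at most
     * `max` entries. Only the returned page is held in memory.
     */
    static <T> List<T> page(Iterator<T> source, int start, int max) {
        for (int i = 0; i < start && source.hasNext(); i++) {
            source.next(); // discard entries before the requested page
        }
        List<T> result = new ArrayList<>();
        while (result.size() < max && source.hasNext()) {
            result.add(source.next());
        }
        return result;
    }
}
```

For example, `LazyHistory.page(LazyHistory.revisions(1000), 25, 3)` yields `r975`, `r974`, `r973` while holding only three entries. Skipping still walks the earlier entries, so this bounds memory but not the time needed to reach a late page.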