Heliosearch / heliosearch

The next generation of open source search
http://heliosearch.org
90 stars · 19 forks

json.facet memory problem #42

Open esameto opened 9 years ago

esameto commented 9 years ago

I use json.facet to build nested facet statistics. While load testing with such queries I found that the memory consumed by Solr grows at a huge rate, up to about 24 gigabytes. Here is a sample query:

http://10.68.20.139:5080/solr/reports_core2/select
  ?q={!cache=false}_2110_EXACT_PARTS:[1 TO *]
  &rows=0
  &json.facet={MAN_NAMEANDSRATING:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{SRATING:{terms:{field:SRATING,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
  &json.facet={MAN_NAMEANDRS:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{RS:{terms:{field:RS,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
  &json.facet={MAN_NAMEANDRGRADE:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{RGRADE:{terms:{field:RGRADE,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
  &json.facet={MAN_NAMEANDLC_STATE:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{LC_STATE:{terms:{field:LC_STATE,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
  &json.facet={MAN_NAMEANDP_RANGE:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{P_RANGE:{terms:{field:P_RANGE,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
  &json.facet={MAN_NAMEANDcm_STATUS:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{cm_STATUS:{terms:{field:cm_STATUS,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
  &json.facet={MAN_NAMEANDYEL_RANGE:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{YEL_RANGE:{terms:{field:YEL_RANGE,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
  &facet=true

elyograg commented 9 years ago

Just to be clear ... what are you looking at to determine that this is a problem? I just want to be sure that you are taking Java's garbage collection memory model into account. It is completely normal for a Java program to consume all allocated heap memory, at which point it will do a garbage collection to free up memory from objects that are no longer in use.

Extensive faceting on an index with a large number of documents will allocate very large amounts of heap memory, especially on Solr 4.x if DocValues are not used and the facet.method is left at default. You can reduce memory requirements by using facet.method=enum or turning on docValues for all fields you will use for faceting, then doing a complete reindex.
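To illustrate the docValues suggestion, here is a hedged sketch of what the schema change might look like. The field names are taken from the query in this issue, the field type name is an assumption about this schema, and a complete reindex is required after the change:

```xml
<!-- schema.xml: enable docValues on fields used for faceting, so faceting
     can read column-oriented on-disk structures instead of un-inverting
     the field into a large heap-resident FieldCache entry. -->
<field name="MAN_NAME" type="string" indexed="true" stored="true" docValues="true"/>
<field name="SRATING"  type="string" indexed="true" stored="true" docValues="true"/>
<field name="RS"       type="string" indexed="true" stored="true" docValues="true"/>
```

Alternatively, for traditional facet.field faceting, appending facet.method=enum to the request avoids the FieldCache at the cost of running per-term filter queries; whether the JSON facet API honors an equivalent method option depends on the version in use.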

It is always possible that there is a true memory leak, but at the moment, we are not aware of any.

esameto commented 9 years ago

I noticed something that may be helpful. I run Heliosearch on Linux under the Tomcat web server. The top command shows Tomcat consuming about 24 GB, but when I opened the Solr home page it reported JVM memory of only 4 GB, while physical memory usage was about 30 GB. This means the memory consumed by Solr is taken from direct system memory rather than from the JVM heap, so I think this memory would not be available to the garbage collector. I also read about the Off-Heap Cache feature of Heliosearch, which moves its caching off the JVM heap and manages it explicitly; off-heap memory is invisible to the garbage collector.
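For what it's worth, a JVM process's resident size is always more than its heap: direct (off-heap) memory, permgen/metaspace, and thread stacks all add to what top reports. As a sketch (these are standard HotSpot flags, but the sizes here are examples only, not a recommendation for this setup), direct memory can be capped separately from the heap:

```shell
# RSS in top ~= heap (-Xmx) + direct memory + permgen/metaspace
#              + thread stacks + native overhead,
# so "top shows 24 GB while the Solr page shows a 4 GB heap"
# is consistent with heavy off-heap allocation.
JAVA_OPTS="$JAVA_OPTS -Xmx4g -XX:MaxDirectMemorySize=8g"
```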

argakon commented 9 years ago

First, check the field cache size in solrconfig.xml. Read about the Xms, Xmx, XX:MaxPermSize, and Xss Java options, for example here: http://www.mkyong.com/java/find-out-your-java-heap-memory-size/

You need to calculate values for these options for your server. For Tomcat (on CentOS) you can set them via JAVA_OPTS in /etc/sysconfig/tomcat. I also use the G1 garbage collector (Java 8u40).

My settings are: -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=200 -XX:+UseLargePages -XX:+AggressiveOpts -Xms1024M -Xmx8192M -XX:MaxPermSize=256M -Xss512K
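As a sketch, on CentOS those flags would be wired into Tomcat roughly like this (path as given above; exact mechanics vary by distribution, and note that -XX:MaxPermSize is ignored on Java 8, where Metaspace replaced PermGen):

```shell
# /etc/sysconfig/tomcat (example only; adjust sizes for your server)
JAVA_OPTS="-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 -XX:+UseLargePages -XX:+AggressiveOpts \
-Xms1024M -Xmx8192M -XX:MaxPermSize=256M -Xss512K"
```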

My configuration is: 2 cores, 6921061 + 3835948 docs in total.
Memory on server: 64 GB.
Solr RES in top: stable at 20.2 GB.