Open Vorticity-Flux opened 7 years ago
I was able to reproduce the speedup of the faceted query for yz_solr:partition_list/1. Switching to facet.method=enum yields consistently faster results.
Additional optimisations include not asking for actual query results (which may generate substantial amounts of unused data internally for big documents) and not returning query headers with the result. Every little helps, as they say.
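To make the combination of parameters concrete, here is a minimal sketch of the request URL such a query could use. It is not the code from yz_solr.erl: the function name `partition_list_url` and the default base URL are hypothetical and only serve the illustration. The Solr parameters themselves (`facet.method`, `rows`, `omitHeader`) are standard request parameters.

```python
from urllib.parse import urlencode


def partition_list_url(core, base_url="http://localhost:8093/internal_solr"):
    """Build a facet-only Solr query URL for the _yz_pn field.

    The base_url default is an assumption for illustration; adjust it to
    wherever Solr listens in your setup.
    """
    params = [
        ("q", "*:*"),                # match every document
        ("facet", "true"),           # enable faceting
        ("facet.field", "_yz_pn"),   # facet on the partition number field
        ("facet.method", "enum"),    # the faster method discussed in this issue
        ("rows", "0"),               # do not return any actual query results
        ("omitHeader", "true"),      # do not return the response header
        ("wt", "json"),              # JSON response
    ]
    return f"{base_url}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(partition_list_url("my_index"))
```

With `rows=0` Solr still computes the facet counts, so the partition list is unaffected; only the unused document data is skipped.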
The yz_solr:partition_list/1 function performs a Solr lookup using a facet search: https://github.com/basho/yokozuna/blob/02682312031a1935a5b0dcd51d6cb88e6718d0bd/src/yz_solr.erl#L280
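For readers unfamiliar with facet-based lookups, the following sketch shows how a partition list could be read out of a Solr facet response. It assumes Solr's default JSON layout for facet fields (a flat list alternating value and count). The helper name `partitions_from_facet_response` is hypothetical, and skipping zero-count entries is a choice made for this example, not necessarily what yz_solr.erl does.

```python
import json


def partitions_from_facet_response(body, field="_yz_pn"):
    """Extract facet values with a non-zero count from a Solr JSON response.

    Assumes the default flat layout: [value1, count1, value2, count2, ...].
    """
    data = json.loads(body)
    flat = data["facet_counts"]["facet_fields"][field]
    values = flat[0::2]   # even positions hold the facet values
    counts = flat[1::2]   # odd positions hold the matching counts
    # Keep only partitions that actually have indexed documents
    # (an assumption of this sketch).
    return [value for value, count in zip(values, counts) if count > 0]


if __name__ == "__main__":
    sample = '{"facet_counts": {"facet_fields": {"_yz_pn": ["1", 12, "2", 0, "3", 7]}}}'
    print(partitions_from_facet_response(sample))
```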
For some reason, in our setup this Solr query was observed to take 10 seconds to complete. It is likely that the query is waiting for a commit to complete and/or a new searcher to finish opening. (Details about the observed poor Solr performance and its causes are given in issue https://github.com/basho/yokozuna/issues/719.)
After consulting the Solr IRC channel, it was established that facet.method=enum resolves this aspect of our performance problems. With this parameter, yz_solr:partition_list/1 always completes in under 10 ms (a 1000-fold speedup!).
For now we have modified solr_config.xml and set the default facet.method to enum. However, as far as I understand, this is not a reliable solution (as solr_config.xml is overwritten in some circumstances(?)).
I think it is worthwhile to do one of the following:
a) Add a way to pass facet.method=enum to the Solr partition list facet query. It seems to perform much faster than the default facet method.
b) Turn Solr docValues on for the _yz_pn field. This should make faceting on this field really fast in all cases. This could be added to the default schema.