Open smerchek opened 10 years ago
On Thu, Oct 16, 2014 at 3:02 PM, Scott Smerchek notifications@github.com wrote:
It appears this change was made uniquely for heliosearch as part of the following commit. As we recently upgraded from Solr to Heliosearch, this was a surprise for us. We rely on the full list of parameters to exist in our logs so we get the full context of whats going on. In addition, we use tools like logstash to parse the logs and get a picture of what parameters are being used over time.
c7ca813#diff-06fb971739c227bdf836946417f4da9cR243 https://github.com/Heliosearch/heliosearch/commit/c7ca813b91a3d28c8ee38052f26598ffde368071#diff-06fb971739c227bdf836946417f4da9cR243
Was there a compelling reasons to do this?
The reason was when I was doing the terms query implementation. For most people it's useless echoing back megabytes of input and takes a lot of processing/IO.
If so, would it be possible to make this configurable?
Certainly. How about a logLimit request parameter (or can you think of a better name)? You could set that as a default parameter on the request handler so it could be varied per-source or per-request even. That seems more flexible than a global config via solrconfig.xml...
-Yonik
I can see how 1000s of terms being printed in the logs would be overkill. We would even probably ignore those, too.
Yes, I like the idea of a logLimit
parameter. On the other hand, it's a little odd to affect logging through a request parameter. But I can definitely see where you would want more flexibility than just the solrconfig.
As for the truncating part, it should be careful to maintain the log format. For example, a non truncated log looks like
&rows=1} hits=184 status=0 QTime=25
but the truncated log looks like
&ro[...921 more chars]status=0 QTime=1
It should finish the }
and space between status=
, looking like:
&ro[...921 more chars]} status=0 QTime=1
OK, I've committed support for a request parameter called "logLimit" that limits the size of any given request parameter. This way, at least the start of all request parameters are still displayed. This also fixes the parse-ability issues you highlighted.
Example:
curl "http://localhost:8983/solr/select?q=*:*&a=hello&b=this+is+a+big+string&logLimit=10"
And the log output is:
639632 [qtp65374112-12] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/select params={logLimit=10&b=this+is+a+[...10+more+chars]&q=*:*&a=hello} hits=32
I also optimized the logging in a number of ways, including avoiding multiple string creations (just to pre-pend a "{" for example), and avoiding doing any work whatsoever if it won't be logged anyway.
A logLimit of -1 means "don't limit".
Did you accidentally remove 'status' and 'QTime' from the logs with this commit? Because they are no longer contained in the log output, only 'hits' is present.
Hmmm, you're right... looks like they were accidentally removed.
It appears this change was made uniquely for heliosearch as part of the following commit. As we recently upgraded from Solr to Heliosearch, this was a surprise for us. We rely on the full list of parameters to exist in our logs so we get the full context of whats going on. In addition, we use tools like logstash to parse the logs and get a picture of what parameters are being used over time.
https://github.com/Heliosearch/heliosearch/commit/c7ca813b91a3d28c8ee38052f26598ffde368071#diff-06fb971739c227bdf836946417f4da9cR243
Was there a compelling reasons to do this? If so, would it be possible to make this configurable?