marklogic-community / marklogic-samplestack

A sample implementation of the MarkLogic Reference Architecture
Apache License 2.0

date buckets and shadows #404

Open laurelnaiad opened 9 years ago

laurelnaiad commented 9 years ago

see below for update - https://github.com/marklogic/marklogic-samplestack/issues/404#issuecomment-73375873

@grechaw and I spoke the other day about date buckets. We've both noticed that the monthly buckets that are hard-coded in right now seem to be resulting in decent performance as well as a decent user experience.

Our conclusion at that time was to propose that we "skip" the implementation of alternative bucket sizes (yearly, daily)....

Is this something deferrable for 8.0-1?

On a somewhat related note, I want to raise another question. We have talked in the past about consolidating our "shadow queries" such that one call to the middle tier would perform both the "main" search and the "side" searches that calculate the alternate facet values that we use. We use the term "shadow query" to mean "another version of the query that drops the related criteria in order to get a larger count of what is present if those criteria are omitted." It is used to display the "outer" date range, and in aspects of the tags facet presentation as well.

Presently, the browser makes three separate search requests every time the criteria are modified, one main search and two shadows (one for the date range and one for the tags).
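To make the three requests concrete, here is a minimal sketch of how the two shadows could be derived from the main criteria by dropping one constraint each. The criteria shape (`text`, `tags`, `date_range`) and the function name `shadow_queries` are assumptions for illustration, not Samplestack's actual structures.

```python
from copy import deepcopy


def shadow_queries(criteria):
    """Return the main query plus one shadow per dropped constraint.

    `criteria` is a hypothetical dict with optional keys
    "text", "tags" and "date_range".
    """
    # Date shadow: same query without the date range, so the counts
    # cover the full ("outer") date range.
    date_shadow = deepcopy(criteria)
    date_shadow.pop("date_range", None)

    # Tags shadow: same query without the tag selection.
    tags_shadow = deepcopy(criteria)
    tags_shadow.pop("tags", None)

    return {
        "main": deepcopy(criteria),
        "date_shadow": date_shadow,
        "tags_shadow": tags_shadow,
    }


criteria = {
    "text": "xquery",
    "tags": ["marklogic"],
    "date_range": ("2014-01", "2014-06"),
}
queries = shadow_queries(criteria)
```

Each of the three entries in `queries` corresponds to one of the search requests the browser currently sends.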

I have floated this proposal before -- and the reason it comes up in this context is that @grechaw and I talked about addressing the alternate bucket sizes in the same effort as the following....

It would be a nice performance improvement for the browser to call a single /v1/search endpoint that was capable of taking additional instructions through the POST body in order to execute the shadow queries simultaneously with the main query, and to combine the results in the middle tier, answering with one response rather than three.
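A rough sketch of what that could look like on the middle tier, in Python for brevity rather than the Java it would actually be ported to. The POST body shape (`query` plus a `shadows` list naming which criteria to drop), the function `combined_search`, and the injected `run_search` callable are all hypothetical; nothing here reflects an existing /v1/search contract.

```python
from concurrent.futures import ThreadPoolExecutor


def combined_search(body, run_search):
    """Run the main query and its shadows concurrently; return one response.

    `body` is a hypothetical POST body:
        {"query": {...}, "shadows": [{"name": ..., "drop": [...]}, ...]}
    `run_search` is whatever callable executes a single search.
    """
    main = body["query"]
    queries = {"main": main}
    for shadow in body.get("shadows", []):
        # A shadow is the main query minus the listed criteria.
        queries[shadow["name"]] = {
            key: value for key, value in main.items() if key not in shadow["drop"]
        }

    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        futures = {name: pool.submit(run_search, q) for name, q in queries.items()}
        return {name: future.result() for name, future in futures.items()}


def fake_search(query):
    # Stand-in for the real search call: reports which constraints were applied.
    return {"constraints": sorted(query)}


body = {
    "query": {"text": "xquery", "tags": ["marklogic"], "date_range": ["2014-01", "2014-06"]},
    "shadows": [
        {"name": "dates", "drop": ["date_range"]},
        {"name": "tags", "drop": ["tags"]},
    ],
}
response = combined_search(body, fake_search)
```

The browser would then read `response["main"]`, `response["dates"]` and `response["tags"]` from a single round trip.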

What we have now is functional, so there is a case to be made for leaving it as is, for now. But I want to just get this on the record, at least, if not consider the possibility of doing it for 8.0-1. The code the browser uses to figure out how to do the shadow queries is available and would probably be a relatively straightforward port to Java, but it is work, and we may or may not have time.

//cc @grechaw @popzip

yawitz commented 9 years ago

I think the date histogram suffers (from a UX perspective) when charting results that fall within a very narrow or very wide date range. (That's the problem the adaptive bucket sizes are meant to address.) That said, I wouldn't want to drop that design feature, but would be OK with deferring it to the next release to meet a schedule constraint.
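For illustration only, adaptive bucket sizing could be as simple as picking the granularity from the span of the selected date range. The thresholds below (about two months and five years) and the function name `choose_bucket` are assumptions, not values from the design.

```python
from datetime import date


def choose_bucket(start: date, end: date) -> str:
    """Pick a histogram bucket size from the width of the date range."""
    span_days = (end - start).days
    if span_days <= 62:        # narrow range: monthly buckets would give 1-2 bars
        return "day"
    if span_days <= 5 * 365:   # mid range: the current hard-coded behavior
        return "month"
    return "year"              # wide range: monthly buckets would be too dense


narrow = choose_bucket(date(2015, 1, 1), date(2015, 1, 31))
wide = choose_bucket(date(2000, 1, 1), date(2015, 1, 1))
```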

Regarding the shadow queries, I should mention that tag shadows are only needed for the "related tags" feature; the tag filter sidebar does not make use of shadow queries (since a selection of a sidebar tag "ANDs" with the current selection, rather than replacing it). As a general rule, shadow queries are only needed where a new selection replaces an existing selection (as it does for chart point or range selection, or for related tags where selecting one replaces the current tag filter selections).
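Put as a small sketch, the rule comes down to how each control combines a new selection with the current one. The control names and the `SELECTION_MODES` table are hypothetical labels for the features described above.

```python
# How each UI control combines a new selection with the current one:
# "replace" swaps the existing selection out, "and" narrows it further.
SELECTION_MODES = {
    "date_chart": "replace",    # chart point or range selection
    "related_tags": "replace",  # picking a related tag replaces the tag filter
    "tag_sidebar": "and",       # sidebar tags AND with the current selection
}


def needs_shadow(control: str) -> bool:
    """A shadow query is only needed where a selection replaces the old one."""
    return SELECTION_MODES[control] == "replace"


controls_with_shadows = sorted(c for c in SELECTION_MODES if needs_shadow(c))
```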

popzip commented 9 years ago

Would like to discuss... is this issue to track both questions - whether to defer dynamic buckets and how to handle shadow queries (related but distinct)?

What is the current proposal for how to do dynamic buckets and the challenges with that approach? Does this blog post apply at all (can define buckets at run time) http://developer.marklogic.com/blog/ranged-buckets-udf? Are we using QBE?

grechaw commented 9 years ago

We're not using QBE. That blog post is cool, but it's not the approach we were taking. @dmcassel found a rather more ninja (and, for now, XQuery-only) way to implement dynamic buckets: a custom Search API extension.

popzip commented 9 years ago

Okay, thanks for clarifying; I don't want to confuse things. I still want to talk a bit more about bucketing. I think it's a matter of when, not whether, we address it. Will add to the agenda for tomorrow's triage.

popzip commented 9 years ago

Dynamic date bucketing in the browser is postponed until after 8.0-1. I'll mark this as a post-8.0-1 task.

laurelnaiad commented 9 years ago

Converting to RFE to change /v1/search API to support shadow searches in same request as "main" search.

yawitz commented 9 years ago

Can we move the dynamic bucketing issue to a separate issue for 8.0-2? That one is still a task to be addressed, not an RFE (as per Kasey's comment above, and the design captured in the wireframes).

laurelnaiad commented 9 years ago

Let's discuss before we create extra issues. Tackling the feature without the API change raises performance/complexity issues, so I think it's best we try to do them together. I made it an RFE so that we could discuss landing time and resources.