marklogic-community / marklogic-samplestack

A sample implementation of the MarkLogic Reference Architecture
Apache License 2.0
82 stars 56 forks source link

Fix snippets #371

Closed popzip closed 9 years ago

popzip commented 9 years ago

Snippets are too long (too many characters/numbers of snippets in one result) and they repeat.

grechaw commented 9 years ago

Verified. This looks like a bug in the search-response-transform. Might be able to identify and move on this quickly -- or this might be a good time to ditch the XQiuery transform and finish out work on the JS transform.

grechaw commented 9 years ago

I've got the controls for max-matches, but still the output is wrong browser-side. I'll try to characterize what I've fixed and what I'm still seeing in the UI.

yawitz commented 9 years ago

Charles, once you feel you have control over this, please ping me so we can tune the snippet count and length.

grechaw commented 9 years ago

Perfect. Stu is merging my most recent work with the skinning stuff. The snippets look pretty good from the middle tier right now -- I think that there's a bug to fix in the front end (l'll use this issue to diagnose and pass) and then we can work with config as needed.

laurelnaiad commented 9 years ago

I think there's a middle-tier problem.

This search endpoint request body:

{"query":{"qtext":["test","sort:relevance"],"and-query":{"queries":[{"range-constraint-query":{"constraint-name":"lastActivity","value":"2012-09-01T00:00:00.000","range-operator":"GE"}},{"range-constraint-query":{"constraint-name":"lastActivity","value":"2012-10-01T00:00:00.000","range-operator":"LT"}}]}},"start":1}

Is yielding these snippets (note I've trimmed out the most of the response to focus on the snippets, these are the snippets from the first search result):

        "snippets":[
          {
            "match-text":[
              "...clarify a possible misunderstanding:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    var value=true;... ...function without `new`:\n\tfunction ",
              {
                "highlight":"test"
              },
              "() {\n\t    var value =... ...value;\n\t}\n\t\n\tvar r = ",
              {
                "highlight":"test"
              },
              "();\n\t\nIf you really... The reason why you got `",
              {
                "highlight":"test"
              },
              " {}` as result was... ...operators.\nFor example:\n\tfunction ",
              {
                "highlight":"Test"
              },
              "(){\n\t    var value =... ...}\n\t\n\tvar t = new ",
              {
                "highlight":"Test"
              },
              "();\n\tconsole.log(t)... ...t);          // logs the ",
              {
                "highlight":"Test"
              },
              " instance\n\tconsole.log(t =..."
            ],
            "source":"answer",
            "id":"soa12248946"
          },
          {
            "match-text":[
              "...clarify a possible misunderstanding:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    var value=true;... ...function without `new`:\n\tfunction ",
              {
                "highlight":"test"
              },
              "() {\n\t    var value =... ...value;\n\t}\n\t\n\tvar r = ",
              {
                "highlight":"test"
              },
              "();\n\t\nIf you really... The reason why you got `",
              {
                "highlight":"test"
              },
              " {}` as result was... ...operators.\nFor example:\n\tfunction ",
              {
                "highlight":"Test"
              },
              "(){\n\t    var value =... ...}\n\t\n\tvar t = new ",
              {
                "highlight":"Test"
              },
              "();\n\tconsole.log(t)... ...t);          // logs the ",
              {
                "highlight":"Test"
              },
              " instance\n\tconsole.log(t =..."
            ],
            "source":"answer",
            "id":"soa12248946"
          },
          {
            "match-text":[
              "...clarify a possible misunderstanding:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    var value=true;... ...function without `new`:\n\tfunction ",
              {
                "highlight":"test"
              },
              "() {\n\t    var value =... ...value;\n\t}\n\t\n\tvar r = ",
              {
                "highlight":"test"
              },
              "();\n\t\nIf you really... The reason why you got `",
              {
                "highlight":"test"
              },
              " {}` as result was... ...operators.\nFor example:\n\tfunction ",
              {
                "highlight":"Test"
              },
              "(){\n\t    var value =... ...}\n\t\n\tvar t = new ",
              {
                "highlight":"Test"
              },
              "();\n\tconsole.log(t)... ...t);          // logs the ",
              {
                "highlight":"Test"
              },
              " instance\n\tconsole.log(t =..."
            ],
            "source":"answer",
            "id":"soa12248946"
          },
          {
            "match-text":[
              "...clarify a possible misunderstanding:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    var value=true;... ...function without `new`:\n\tfunction ",
              {
                "highlight":"test"
              },
              "() {\n\t    var value =... ...value;\n\t}\n\t\n\tvar r = ",
              {
                "highlight":"test"
              },
              "();\n\t\nIf you really... The reason why you got `",
              {
                "highlight":"test"
              },
              " {}` as result was... ...operators.\nFor example:\n\tfunction ",
              {
                "highlight":"Test"
              },
              "(){\n\t    var value =... ...}\n\t\n\tvar t = new ",
              {
                "highlight":"Test"
              },
              "();\n\tconsole.log(t)... ...t);          // logs the ",
              {
                "highlight":"Test"
              },
              " instance\n\tconsole.log(t =..."
            ],
            "source":"answer",
            "id":"soa12248946"
          },
          {
            "match-text":[
              "...clarify a possible misunderstanding:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    var value=true;... ...function without `new`:\n\tfunction ",
              {
                "highlight":"test"
              },
              "() {\n\t    var value =... ...value;\n\t}\n\t\n\tvar r = ",
              {
                "highlight":"test"
              },
              "();\n\t\nIf you really... The reason why you got `",
              {
                "highlight":"test"
              },
              " {}` as result was... ...operators.\nFor example:\n\tfunction ",
              {
                "highlight":"Test"
              },
              "(){\n\t    var value =... ...}\n\t\n\tvar t = new ",
              {
                "highlight":"Test"
              },
              "();\n\tconsole.log(t)... ...t);          // logs the ",
              {
                "highlight":"Test"
              },
              " instance\n\tconsole.log(t =..."
            ],
            "source":"answer",
            "id":"soa12248946"
          },
          {
            "match-text":[
              "...clarify a possible misunderstanding:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    var value=true;... ...function without `new`:\n\tfunction ",
              {
                "highlight":"test"
              },
              "() {\n\t    var value =... ...value;\n\t}\n\t\n\tvar r = ",
              {
                "highlight":"test"
              },
              "();\n\t\nIf you really... The reason why you got `",
              {
                "highlight":"test"
              },
              " {}` as result was... ...operators.\nFor example:\n\tfunction ",
              {
                "highlight":"Test"
              },
              "(){\n\t    var value =... ...}\n\t\n\tvar t = new ",
              {
                "highlight":"Test"
              },
              "();\n\tconsole.log(t)... ...t);          // logs the ",
              {
                "highlight":"Test"
              },
              " instance\n\tconsole.log(t =..."
            ],
            "source":"answer",
            "id":"soa12248946"
          },
          {
            "match-text":[
              "...clarify a possible misunderstanding:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    var value=true;... ...function without `new`:\n\tfunction ",
              {
                "highlight":"test"
              },
              "() {\n\t    var value =... ...value;\n\t}\n\t\n\tvar r = ",
              {
                "highlight":"test"
              },
              "();\n\t\nIf you really... The reason why you got `",
              {
                "highlight":"test"
              },
              " {}` as result was... ...operators.\nFor example:\n\tfunction ",
              {
                "highlight":"Test"
              },
              "(){\n\t    var value =... ...}\n\t\n\tvar t = new ",
              {
                "highlight":"Test"
              },
              "();\n\tconsole.log(t)... ...t);          // logs the ",
              {
                "highlight":"Test"
              },
              " instance\n\tconsole.log(t =..."
            ],
            "source":"answer",
            "id":"soa12248946"
          },
          {
            "match-text":[
              "...to do something like: \n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    this.value =... ...and then \n\tvar r=new ",
              {
                "highlight":"test"
              },
              "();\n\tif (r.value..."
            ],
            "source":"answer",
            "id":"soa12248817"
          },
          {
            "match-text":[
              "...to do something like: \n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\t    this.value =... ...and then \n\tvar r=new ",
              {
                "highlight":"test"
              },
              "();\n\tif (r.value..."
            ],
            "source":"answer",
            "id":"soa12248817"
          },
          {
            "match-text":[
              "...make it like this:\nvar ",
              {
                "highlight":"test"
              },
              " = (function(){\n var..."
            ],
            "source":"answer",
            "id":"soa12248817"
          },
          {
            "match-text":[
              "...with one private variable:\n\tfunction ",
              {
                "highlight":"test"
              },
              "(){\n\tvar value=true;..."
            ],
            "source":"question",
            "id":"soq12248762"
          }
        ],
grechaw commented 9 years ago

Yep, I've got reproduction too.

grechaw commented 9 years ago

OK, this goes all the way up into the Search API. There's some extra work going on in snippeting JSON that shouldn't be. I think that once the extra results are pruned from that payload, things will just look better. Once I'm working off HEAD again I'll be able to verify that. In the meantime, this needs to be set to external unfortunately.

grechaw commented 9 years ago

Search API bug number coming shortly.

grechaw commented 9 years ago

https://bugtrack.marklogic.com/31334

grechaw commented 9 years ago

OK now I have samplestack running on HEAD and can get this external bug fixed. It's high priority to get that one in, even though we can't fix it for samplestack until 8.0

laurelnaiad commented 9 years ago

Make sure you isolate anything that won't work with EA3 into a separate branch or branches. We can't let develop drift into that territory because we need it to be able to serve as the basis for hotfixes until customers have access to a newer version (i.e. 8.0-1) of the server.

grechaw commented 9 years ago

Yes, I've got that wall in place, thanks for the reminder.

grechaw commented 9 years ago

I have a patch in review for server bug.

grechaw commented 9 years ago

Server bug was fixed. The immediate issue with snippets was therefore done. I still need to sit with Mitch and see whether or not the snippets are ready. I'll keep this issue active pending some more review of snippets.

popzip commented 9 years ago

@yawitz I believe you and i briefly discussed - feel free to sync with @grechaw directly this week to fine tune snippet construction.

yawitz commented 9 years ago

Already mentioned to @grechaw. Charles, are you ready to tune the snippet count and length, and can we do this while you're here?

grechaw commented 9 years ago

@yawitz and I met to day to fine-tune some snippet settings. The structure (or lack thereof) in our SO documents make some of the snippets non-ideal, but we got to a commit that reflects Mitch's preferences. When these changes go into PR and develop, I'll ship this bug.

popzip commented 9 years ago

@grechaw and @yawitz thanks for muscling through. do you think there are any RFEs - I'm assuming for Samplestack - coming out of the non-ideal snippeting?

grechaw commented 9 years ago

Well, I created a bug (31700) to track what's up with max-snippet-chars... it didn't seem to affect the samplestack snippets at all. Otherwise, our snippets look non-ideal because we cannot distinguish between code and prose in our data model -- we didn't use markup to model the Q+A docs, so we can't take advantage of markup for higher-quality snippeting.

So there could be RFEs to either change samplestack, or to do some smarter customized snippets, but mainly what we've done is run up against an inherent disconnect between snippeting OOTB and the data model we chose.

gghai commented 9 years ago

Ran this with @yawitz . Used different search string to see the behavior .

The Link max count seem to be returning 4 now in each of the QnA entry . Looks good.

Search String 1 : "sort" : Result snippet seems closer tot what is expected .

Search String 2 : "marklogic" Look at : 1 st snippet Link http://docs.marklogic.com .Are they expected as coming ?

Search String 3 : "marklogic search string + tag filter "xquery" -> Look at the 2nd entry . There are 3 links in the snippets [A] [Q] [Q] text inside the first link [A] is longer compared to the other two.

gghai commented 9 years ago

@laurelnaiad can you confirm on the Search String 3 : "marklogic search string + tag filter "xquery" -> Look at the 2nd entry . There are 3 links in the snippets [A] [Q] [Q] text inside the first link [A] is longer compared to the other two. Is that expected or not ?

laurelnaiad commented 9 years ago

I can't really understand your comments. Will you please specify a full URL for a search that shows the problem, and a result item number where the problem appears in those resuts so i can understand what the problem is?

gghai commented 9 years ago

You should be able to see via using the search string and tag filter . Let me know if you still having troubles viewing it .

laurelnaiad commented 9 years ago

When i search for marklogic search string + tag filter "xquery" I get 0 documents. The full URL of the search will help get us on the same page.

laurelnaiad commented 9 years ago

Yes, full URL please! :)

gghai commented 9 years ago

http://rh6-intel64-80-qa-dev-2:3000/?q=marklogic&tags=xquery

laurelnaiad commented 9 years ago

Is this the one?

Q: MarkLogic Search API - How to do sort-order on same element as searchable-expression
[A] ...`: <options xmlns="http://marklogic.com/appservices/search"> <term> <term-option>... ...element-query xmlns:cts="http://marklogic.com/cts"> <cts:element>... ...query> <sort-order type="xs:string" direction="ascending"> <element... ..."value"/> <annotation>options for search institutions by name</annotation> <... ... [Q] I am attempting to do a search on a specific element so I... ...the value element. Here are my search options which hopefully makes things clear... ...clear: <options xmlns="http://marklogic.com/appservices/search"> <term> <term-option>... ...expression> <sort-order type="xs:string" direction="ascending"> <element... ..."value"/> <annotation>options for search institutions by name</annotation> <... ...adds another value node (Taken from search:report id="SEARCH-FLWOR") ...order by... FLWOR") ...order by xs:string(($result//value)[1])... ...of: ...order by xs:string(($result)[1]) ascending... ...which I do not wish to search across. I also I can... ... [Q] MarkLogicSearch API - How to do sort...
gghai commented 9 years ago

Q: Writing reusable codes in xquery and starting xquery / marklogic

[A] ...'t have a module clause). MarkLogic does provide an interesting construct,... ...supported by the latest version of MarkLogic) provides similar provisions like dynamic... ...building typical and complete webapps with MarkLogic, there are quite a number... ..., on which http://developer.marklogic.com is based. There is... ...in building REST api's in MarkLogic. In that case MarkLogic 6 has built-in features,... ...all available on http://developer.marklogic.com HTH! ... [Q] I am a starter in Marklogic and Xquery. - - -... ...interface etc) possible in xquery with marklogic. 2. Where to start from... ...'building my Hello World application in MarkLogic / XQuery'? - - -... ... [Q] ...in xquery and starting xquery / marklo

yawitz commented 9 years ago

I think you need to be logged-in to see the same Q as @gghai sees.

laurelnaiad commented 9 years ago

Ok, so it doesn't look like the browser is doing anythinbg to the snippets related to what you're describing. Assigning to @grechaw .

grechaw commented 9 years ago

I verified that the snippets are working as designed -- we just don't have a control right now to limit the number of text matches in one particular snippet match... the first match above that is somewhat long looking would have to be truncated in the browser (or middle tier). So what course of action should we take here?

laurelnaiad commented 9 years ago

we just don't have a control right now to limit the number of text matches in one particular snippet match

Is this why the same text seems to be repeated over and over again with different snippets within it called out (i.e. within one section, the keyword is repeatedly appearing? Is this is how the server returns it? Or is this how the array in constructed in the middle tier?

grechaw commented 9 years ago

The middle tier isn't doing anything to the snippets -- Before the external bug was fixed, there was repetition, but now, you should only see the keyword repeated if there are multiple matches... I've not seen a case where text is reported more than once since the server-side snippet fix. ?

laurelnaiad commented 9 years ago

Ok, maybe I imagined it. I will check again.

grechaw commented 9 years ago

I think this is working as expected -- there was another issue to track highlighting, which is also checked in.

gghai commented 9 years ago

I am marking this as fixed. @popzip @yawitz Please raise a separate issue on snippet related improvement that is required post 8.0-1 . Thanks.

popzip commented 9 years ago

which issue tracks highlighting - @grechaw can you reference it here?

what issue is tracking inability to limit number of text matches in one particular snippet match - is this an external issue? something for next release?

yawitz commented 9 years ago

Moved https://github.com/marklogic/marklogic-samplestack/issues/371#issuecomment-70585614 to #511.