linagora / james-project

Mirror of Apache James Project
Apache License 2.0
71 stars 63 forks source link

[SearchSnippet] Highlight Opensearch implementation #5253

Open Arsnael opened 3 weeks ago

Arsnael commented 3 weeks ago

Add the highlight search in Opensearch implementation

In OpenSearchSearcher, modify the search function to have a highlight boolean value. If present, add the highlight in the SearchRequest.Builder based on the field text (that's where the highlighted value would be in the search)

In MessageSearchIndex, add an other search method with the highlight boolean and refactor OpenSearchListeningMessageSearchIndex

Add a SearchHighlighter interface in mailbox api with the following method :

Publisher<SearchSnippet> highlightSearch(MultimailboxesSearchQuery expression, MailboxSession session, long limit);

and have an implementation of it in mailbox store (StoreSearchHighlighter ?) that will inject MessageSearchIndex and implement the highlightSearch method, that will call the search with highlight from MessageSearchIndex

The message ids should be in the SearchQuery declared in the MultimailboxesSearchQuery param.

Add proper guice bindings

DoD: Add proper unit tests and integration tests

chibenwa commented 3 weeks ago

In MailboxManager, add an other search method with the highlight boolean and refactor StoreMailboxManager

Not sure about that bit.

I would prefer not adding yet-another responsibility to that class that has already too much.

I would prefer to have a dedicated SearchHighlighter interface part of the mailbox api.

Arsnael commented 3 weeks ago

Actually... might be better. Thinking about it again, we need the emailIds as well, which changes quite the search in itself that never took that in consideration in the first place.

Will give it more thoughts

chibenwa commented 3 weeks ago

Multi-mailbox search takes into account messageIds

Arsnael commented 3 weeks ago

Updated

quantranhong1999 commented 2 weeks ago
Publisher<MessageId> highlightSearch(MultimailboxesSearchQuery expression, MailboxSession session, long limit);

Should we return the Publisher<SearchSnippet> instead? How can we response the following response for example with only messageIds?

"list": [{
      "emailId": "M44200ec123de277c0c1ce69c",
      "subject": null,
      "preview": null
  }, {
      "emailId": "M7bcbcb0b58d7729686e83d99",
      "subject": "The <mark>Foo</mark>sball competition",
      "preview": "...year the <mark>foo</mark>sball competition will
        be held in the Stadium de ..."
  }, {
      "emailId": "M28d12783a0969584b6deaac0",
      "subject": null,
      "preview": "...the <mark>Foo</mark>/bar method results often
        returns &lt;1 widget rather than the complete..."
  }
Arsnael commented 2 weeks ago

@quantranhong1999 bad copy/paste, you are right, thanks!

vttranlina commented 2 weeks ago

If we using directly SearchHighlighter interface directly within the SearchSnippetGetMethod ( instead of using MailboxManager.search ) there should be no need to modify OpenSearchListeningMessageSearchIndex#search and MessageSearchIndex, right?

// as Quan mentioned above, the only MessageId is not enough, It must be to SearchSnippet, hence the implementing this change in SearchSnippet to OpenSearchListeningMessageSearchIndex, MessageSearchIndex is quiet complex

quantranhong1999 commented 2 weeks ago

If we using directly SearchHighlighter interface directly within the EmailQueryMethod

You mean SearchSnippetGetMethod?

vttranlina commented 2 weeks ago

If we using directly SearchHighlighter interface directly within the EmailQueryMethod

You mean SearchSnippetGetMethod?

yes

Arsnael commented 2 weeks ago

If we using directly SearchHighlighter interface directly within the SearchSnippetGetMethod ( instead of using MailboxManager.search ) there should be no need to modify OpenSearchListeningMessageSearchIndex#search and MessageSearchIndex, right?

could do I guess yes

Arsnael commented 2 weeks ago

Need to check mailbox controls to make sure we dont search mailboxes not belonging to user