linagora / james-project

Mirror of Apache James Project
Apache License 2.0

Search feature should support searching by word surrounded by underscores #5260

Open hungphan227 opened 2 months ago

hungphan227 commented 2 months ago

Assume that I have an email in the mailbox with the subject the_lord_of_rings. If I search for the word "lord", no results are returned.

It seems that when building the index, we do not split words on underscores. We should improve this.

quantranhong1999 commented 2 months ago

(Feedback from Linagora VN HR team.)

We would likely need to modify the analyzer if we want to go with this.

Arsnael commented 2 months ago

I think it's a valid concern, but it would be quite a change of behavior for the search analyzer, yes.

Maybe this could be a good occasion for you, @hungphan227, to start a discussion on the mailing list?

chibenwa commented 2 months ago

Piece-mealed here. I am dubious about the legitimacy of this request.

Why write "lord_of_the_ring" if it is asked to work as "lord of the ring"?

I pretty much bet that the next thing popping up will be "I search lord_of_chocolate, why does it match lord_of_the_ring?".

A lot of burden for a niche usage. I'd wait for convergent feedback before implementing it.
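
The over-matching concern above can be made concrete with a rough sketch. It is not the James or OpenSearch implementation: it assumes, hypothetically, that underscores are treated as separators, and it compares two possible matching rules (any token vs. all tokens), since this thread does not say which one the real search applies. The names `split_tokens`, `match_any` and `match_all` are made up for illustration.

```python
import re


def split_tokens(text):
    # Hypothetical tokenizer: runs of letters/digits only, so both
    # '_' and '-' act as separators. Not the real James analyzer.
    return set(re.findall(r"[^\W_]+", text.lower()))


def match_any(query, subject):
    # OR semantics (assumption): one shared token is enough to match.
    return bool(split_tokens(query) & split_tokens(subject))


def match_all(query, subject):
    # AND semantics (assumption): every query token must be present.
    return split_tokens(query) <= split_tokens(subject)


print(match_any("lord_of_chocolate", "lord_of_the_ring"))  # True
print(match_all("lord_of_chocolate", "lord_of_the_ring"))  # False
```

Under the any-token rule the two subjects match through the shared tokens "lord" and "of", which is the surprise described above; under the all-token rule they do not, because "chocolate" is missing.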

quantranhong1999 commented 2 months ago

FYI, today we support splitting words on the hyphen (-).

lord-of-chocolate => today we can search for lord, and the search will likely return lord-of-the-ring as well.

But we don't for the underscore (_).
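
The difference between the two separators can be illustrated with a small sketch. It is not the James or OpenSearch analyzer; it only assumes a tokenizer that keeps runs of regex "word" characters together, where the underscore counts as a word character and the hyphen does not. The names `tokenize` and `matches` and the `split_underscore` flag are hypothetical.

```python
import re


def tokenize(text, split_underscore=False):
    # Assumption: a token is a run of regex word characters. That class
    # includes '_', so by default '-' separates tokens but '_' does not.
    # With split_underscore=True, '_' is excluded and becomes a separator.
    pattern = r"[^\W_]+" if split_underscore else r"\w+"
    return re.findall(pattern, text.lower())


def matches(query, subject, split_underscore=False):
    # True if every query token appears among the subject tokens.
    subject_tokens = set(tokenize(subject, split_underscore))
    return all(t in subject_tokens for t in tokenize(query, split_underscore))


print(tokenize("lord-of-the-ring"))   # ['lord', 'of', 'the', 'ring']
print(tokenize("the_lord_of_rings"))  # ['the_lord_of_rings']
print(matches("lord", "the_lord_of_rings"))                         # False
print(matches("lord", "the_lord_of_rings", split_underscore=True))  # True
```

With the flag enabled, a subject such as CV_Nguyen Van A_2024 would also be split into cv, nguyen, van, a and 2024 in this sketch.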

> I am dubious about the legitimacy of this request.

The use case was searching for a candidate's name, e.g. in a mail with the subject CV_Nguyen Van A_2024.

I tested and Gmail supports this use case.

However, I am not sure if we should support it.

chibenwa commented 2 months ago

> However I am not sure if we should support it.

Let's wait for convergent requests ;-)