ICIJ / datashare

A self-hosted search engine for documents.
https://datashare.icij.org
GNU Affero General Public License v3.0

Add an option to use Elasticsearch default_operator=AND query syntax #1497

Open PapayaJackal opened 3 months ago

PapayaJackal commented 3 months ago

Is your feature request related to a problem? Please describe.

Hi. I've been using Datashare as an alternative to Aleph for a while now.

One major difference that I've found confuses people is that Datashare uses Elasticsearch's default default_operator of OR ("word1 word2" = "word1 OR word2"), which I personally find less intuitive than AND ("word1 word2" = "word1 AND word2").

It's the opposite of what people are used to from Aleph and pretty much any other search engine, e.g. Google.
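
To make the difference concrete, here is a minimal Python sketch of an Elasticsearch query_string request body with each operator. The default_operator parameter is part of Elasticsearch's query_string query; the helper name build_query is hypothetical, and the sketch assumes (without confirmation from this thread) that the search is sent as a query_string query.

```python
import json


def build_query(text, default_operator="OR"):
    """Build an Elasticsearch query_string request body (hypothetical helper)."""
    if default_operator not in ("OR", "AND"):
        raise ValueError("default_operator must be 'OR' or 'AND'")
    return {
        "query": {
            "query_string": {
                "query": text,
                # Controls how terms without an explicit operator are combined.
                "default_operator": default_operator,
            }
        }
    }


# Elasticsearch's default: "word1 word2" is read as "word1 OR word2".
or_body = build_query("word1 word2")

# Requested behaviour: "word1 word2" is read as "word1 AND word2".
and_body = build_query("word1 word2", default_operator="AND")

print(json.dumps(and_body, indent=2))
```

The two bodies differ only in the default_operator value, so an explicit "word1 OR word2" still works when AND is the default.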

Describe the solution you'd like

I know making AND the default_operator would be a major change, but perhaps it could be added as a configuration option?
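
As a rough illustration of what such an option could look like, the Python sketch below reads the operator from a setting and falls back to OR so that existing behaviour is unchanged. Everything here is an assumption for illustration only: the setting name EXAMPLE_DEFAULT_OPERATOR and the function default_operator_from_env are hypothetical and do not correspond to any existing Datashare option.

```python
import os

ALLOWED_OPERATORS = ("OR", "AND")


def default_operator_from_env(env=None, name="EXAMPLE_DEFAULT_OPERATOR"):
    """Read the operator from a hypothetical setting, falling back to OR."""
    env = os.environ if env is None else env
    # Missing setting keeps the current behaviour (OR).
    value = env.get(name, "OR").strip().upper()
    if value not in ALLOWED_OPERATORS:
        raise ValueError(f"{name} must be one of {ALLOWED_OPERATORS}, got {value!r}")
    return value


# Unset: existing deployments keep OR.
print(default_operator_from_env({}))
# Opt-in: an instance that prefers AND sets the (hypothetical) option.
print(default_operator_from_env({"EXAMPLE_DEFAULT_OPERATOR": "and"}))
```

Defaulting to OR keeps the change opt-in, which matches the concern that switching the default outright would be a major change.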

Describe alternatives you've considered

Forking the project

pirhoo commented 3 months ago

Hello @PapayaJackal,

It's a very good point indeed.

We have used the OR operator by default for almost 10 years with thousands of reporters, and I don't think they have ever complained about it. However, I agree it might be confusing for newcomers. We are currently redesigning Datashare completely, so we will discuss how to integrate this option into the search interface.

PapayaJackal commented 3 months ago

Hi @pirhoo,

Thank you for your response. Is there any documentation available regarding the redesign process? I would also like to know if there’s an opportunity to provide feedback.

For some context, I previously managed an Aleph instance for an NGO, and we encountered significant challenges with indexing, search performance, and data duplication. This led me to start developing my own search engine from scratch. However, we were then introduced to Datashare, which turned out to be closely aligned with what I was planning to build. We now operate a private Datashare instance with 25TB of data, and overall, we’ve been quite satisfied with it.

pirhoo commented 3 months ago

We would be very interested to know more about your usage of Datashare!

This new design is focused on accessibility and responsiveness. We can get in touch at the end of the month with @Soliine to organise a user interview if you'd like.

I can give you a sneak peek of the new design (including the default operator setting):

[Image: Screenshot 2024-08-05 at 14 40 32]

All new components are progressively being implemented here:

https://icij.github.io/datashare-client/

Soliine commented 3 months ago

That would be great!

Could you share your contact info with us at datashare@icij.org?

I will get in touch at the end of August.

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 40 days with no activity.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 20 days since being marked as stale.

PapayaJackal commented 1 month ago

I reached out via email regarding the user interview for Datashare that Soline mentioned in August. I apologize for not doing so sooner!