manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0
8.89k stars, 493 forks

Request for community: JSON data #36

Closed tomatolog closed 4 years ago

tomatolog commented 6 years ago

We are going to add more features related to JSON index attributes and need many test cases for that. So we ask you to share your large JSON data sets and how you use them: indexes with JSON attributes, source data, queries, query processing times, and index sizes.

Fil commented 6 years ago

Here's how we use JSON in the Indexer plugin for SPIP: https://zone.spip.org/trac/spip-zone/browser/_plugins_/indexer/trunk/Sources/SpipDocuments.php#L243

So basically we have a single JSON field, properties, which we fill with arrays of numbers and strings representing categories (by name and hash), authors, tags…
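As a rough illustration of that layout, here is a minimal Python sketch of what one document's properties value could look like. Everything in it is an assumption made for the example: the key names (other than lang, which appears in the query further down), the build_properties and name_hash helpers, and the use of CRC32 as the hash. The plugin's real structure is in the linked SpipDocuments.php.

```python
import json
import zlib


def name_hash(name: str) -> int:
    # Stand-in hash (CRC32). The hash the plugin really uses is not stated
    # in this thread, so this is only a placeholder.
    return zlib.crc32(name.encode("utf-8"))


def build_properties(lang, authors, tags, categories):
    """Build a hypothetical 'properties' value: arrays of strings and numbers."""
    return {
        "lang": lang,
        "authors": list(authors),
        "tags": list(tags),
        "categories": {
            "names": list(categories),
            "hashes": [name_hash(c) for c in categories],
        },
    }


doc = build_properties("fr", ["Jane Doe"], ["search"], ["Politics", "Economy"])
# The serialized string is what would be stored in the JSON attribute.
print(json.dumps(doc, ensure_ascii=False))
```

Storing both the name and a numeric hash lets a query filter on integers while still being able to display the label.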

This allows us to index several unrelated sites in indexes that can be queried together.

Queries are made with a template like this https://zone.spip.org/trac/spip-zone/browser/_plugins_/indexer/trunk/content/sphinx.html#L60

which "compiles" to a SQL query that you can easily guess from the template (say, if there is &lang=xx in the URL, the query becomes SELECT IN(properties.lang, "xx") AS f1 … WHERE f1 = 1).
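To make the "compilation" step concrete, here is a minimal Python sketch of how a URL parameter could be turned into that kind of query. It is not the plugin's code: the build_query and escape_literal helpers, the FILTERS mapping, and the index name spip_index are all hypothetical, and the sketch quotes string values with single quotes.

```python
from urllib.parse import parse_qs

# Hypothetical mapping from URL parameter name to JSON attribute path.
FILTERS = {"lang": "properties.lang"}


def escape_literal(value: str) -> str:
    """Escape backslashes and single quotes for use inside a quoted literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_query(query_string: str, index: str = "spip_index") -> str:
    """Turn URL parameters into a SELECT with one IN(...) alias per filter."""
    params = parse_qs(query_string)
    selects = ["*"]
    wheres = []
    for param, path in FILTERS.items():
        if param not in params:
            continue
        alias = f"f{len(wheres) + 1}"  # f1, f2, ... in order of applied filters
        value = escape_literal(params[param][0])
        selects.append(f"IN({path}, '{value}') AS {alias}")
        wheres.append(f"{alias} = 1")
    sql = f"SELECT {', '.join(selects)} FROM {index}"
    if wheres:
        sql += " WHERE " + " AND ".join(wheres)
    return sql


print(build_query("lang=fr"))
```

The filter is computed as an aliased expression in the SELECT list and then tested in WHERE, which mirrors the f1 pattern in the example above.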

This plugin is used in several medium-sized applications, for example a newspaper's search engine at https://www.monde-diplomatique.fr/ (60k articles).

airolg commented 6 years ago

Thank you for your help, we'll check it out.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Feel free to re-open the issue if it becomes relevant again.