manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0
8.93k stars 495 forks source link

Feature request: user defined constraints (unique key) #1015

Open darren-martz opened 1 year ago

darren-martz commented 1 year ago

Ingesting data into manticore based on a changing dataset is challenging with the ID field being the only unique data constraint. Constraints are a powerful tool for identifying conflicts for insert vs update actions. The REPLACE statement already exists for this purpose but only has the ID field to work from.

Adding the ability to identify any field as UNIQUE would allow the REPLACE statement to update a record based on user content.

Example: both mykey and specialnumber have unique value constraints

CREATE TABLE ( mykey TEXT UNIQUE, author TEXT, body TEXT, created TIMESTAMP, specialnumber INT UNIQUE )

Example: unique value constrain on mykey author and a second value constraint on specialnumber

CREATE TABLE ( mykey TEXT, author TEXT, body TEXT, created TIMESTAMP, specialnumber INT UNIQUE, UNIQUE(mykey, author) )

The REPLACE, INSERT, and UPDATE statements would then respect all unique value constraints.

Alternatives/Workarounds

  1. generate a 64-bit integer hash value from content and assign it to the built-in ID field.
  2. drop the entire table and reload

Additional context
As someone exploring to replace Elastic Search, having decent synchronizing tools available can go a long way to adoption.

https://forum.manticoresearch.com/t/contraints-like-unique-keys/1262/3

tomatolog commented 1 year ago

I think the full list of statements are REPLACE, INSERT, UPDATE, DELETE and for update and delete it not usual to use initial text field content as statement key

scruel commented 6 months ago

@tomatolog Could you explain more about your idea? I can't see the connection between what you said and this issue.

sanikolaev commented 6 months ago

@scruel the connection is:

The REPLACE, INSERT, and UPDATE statements would then respect all unique value constraints.

the full list of statements are REPLACE, INSERT, UPDATE, DELETE and for update and delete it not usual to use initial text field content as statement key

henzeb commented 4 days ago

anyone looking for a workaround, look into Cantor or Szudzik Pairing Functions. Basically, the idea is to condense 2 numbers into one single number. 2 and 3 makes 11, but 3 and 2 makes 14. It also allows you to reverse the process, as the order of the numbers matter. It should even be possible to do this with more than 2 numbers.

https://www.vertexfragment.com/ramblings/cantor-szudzik-pairing-functions/