The CRUD module allows you to perform CRUD operations on the cluster.
It also provides the crud-storage and crud-router roles for Tarantool Cartridge.
First, install Tarantool.
To try crud in your application, you may install it manually from the web with the tt rocks rock management tool.
tt rocks install crud
To use crud in your application, set it as a rockspec dependency.
package = 'myapp'
version = 'scm-1'
source = {
url = '/dev/null',
}
dependencies = {
'tarantool >= 3.1.0',
'crud == <the-latest-tag>-1',
}
build = {
type = 'none';
}
You can also clone the repository to explore crud and try it inside a sandbox.
git clone https://github.com/tarantool/crud.git
cd crud
tt rocks make
For Tarantool 3.x, enable crud roles on your application instances in a configuration (see Tarantool 3 roles section). Roles support Tarantool 3.0.2, Tarantool 3.1.0 and newer. Older versions are not supported due to tarantool/tarantool#9643 and tarantool/tarantool#9649 issues.
For Tarantool 1.10 and 2.x, add crud roles into dependencies of your roles (see Cartridge roles section).
For Tarantool 1.10, 2.x and 3.x you can also manually call the crud initialization code on VShard router and storage instances.
The repository provides a simple sandbox application with a test dataset on a single instance.
./doc/playground.lua
tarantool> crud.select('customers', {{'<=', 'age', 35}}, {first = 10})
tarantool> crud.select('developers', nil, {first = 6})
CRUD operations should be called from a router.
All VShard storages should call crud.init_storage() after vshard.storage.cfg() (or enable the roles.crud-storage role for Tarantool 3 or the crud-storage role for Cartridge) first to initialize the storage-side functions that are used to manipulate data across the cluster. The storage-side functions have the same access as the user calling crud.init_storage(). Therefore, if crud does not have enough access to modify some space, you need to grant access to that user.
You can call crud.init_storage{async = true} to bootstrap procedure grants asynchronously. This is useful if your application master instances may start in ro mode (for example, if you use Tarantool 3.x). By default, asynchronous bootstrap is used for Tarantool 3.x and synchronous bootstrap is used for Tarantool 1.10 and 2.x.
All VShard routers should call crud.init_router() after vshard.router.cfg() (or enable the roles.crud-router role for Tarantool 3 or the crud-router role for Cartridge) to make crud functions callable via net.box. If a user is allowed to execute crud functions on the router side, then the user does not need additional access on storages.
You can check out an example of the configuration for local development (a single instance that combines router and storage) in playground.lua.
All operations return a table that contains rows (tuples) and metadata (space format).
It can be used to convert received tuples to objects via the crud.unflatten_rows function.
For example:
res, err = crud.select('customers', nil, {first = 2})
res
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 12477, 'Elizabeth', 12]
- [2, 21401, 'David', 33]
...
crud.unflatten_rows(res.rows, res.metadata)
---
- - bucket_id: 12477
age: 12
name: Elizabeth
id: 1
- bucket_id: 21401
age: 33
name: David
id: 2
...
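To make the conversion concrete, here is a minimal pure-Lua sketch of what such an unflatten step does: it pairs each value in a row with the field name at the same position in the metadata. The helper name unflatten_sketch is hypothetical and is not part of the crud API; in an application, call crud.unflatten_rows instead.

```lua
-- Hypothetical helper for illustration only; use crud.unflatten_rows in real code.
local function unflatten_sketch(rows, metadata)
    local objects = {}
    for i, row in ipairs(rows) do
        local object = {}
        -- The n-th value of a row belongs to the n-th field of the space format.
        for fieldno, field in ipairs(metadata) do
            object[field.name] = row[fieldno]
        end
        objects[i] = object
    end
    return objects
end

-- Sample data copied from the example above.
local metadata = {
    {name = 'id', type = 'unsigned'},
    {name = 'bucket_id', type = 'unsigned'},
    {name = 'name', type = 'string'},
    {name = 'age', type = 'number'},
}
local rows = {
    {1, 12477, 'Elizabeth', 12},
    {2, 21401, 'David', 33},
}

local objects = unflatten_sketch(rows, metadata)
print(objects[1].name, objects[1].age)
```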
Notes:
Sharding key and bucket id calculation
A sharding key is a set of tuple field values used to calculate the bucket ID. A sharding key definition is a set of tuple field names that describes which tuple fields make up the sharding key. The bucket ID determines which replicaset stores certain data. The function used to calculate the bucket ID is called the sharding function.
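The relation between a sharding key definition and a sharding key can be illustrated with a small pure-Lua sketch. The function name extract_sharding_key and the sample object are hypothetical; they only illustrate the terms above and are not part of the crud API.

```lua
-- Hypothetical illustration: a sharding key definition is a list of field
-- names, and the sharding key is the list of the corresponding field values.
local function extract_sharding_key(object, sharding_key_def)
    local key = {}
    for i, field_name in ipairs(sharding_key_def) do
        key[i] = object[field_name]
    end
    return key
end

local customer = {id = 1, name = 'Elizabeth', age = 12}
local sharding_key = extract_sharding_key(customer, {'name', 'age'})
-- A sharding function would then map this list of values to a bucket ID.
```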
By default, CRUD calculates the bucket ID using the primary key and the function vshard.router.bucket_id_strcrc32(key). This happens automatically and doesn't require any action from the user. However, for operations that accept a tuple/object, the bucket ID can be specified as a tuple/object field as well as the opts.bucket_id value.
Starting from 0.10.0, users who don't want to use the primary key as a sharding key may set a custom sharding key definition as a part of the DDL schema or insert it manually into the space _ddl_sharding_key (for both cases, consult the DDL module documentation). As soon as the sharding key for a certain space is available in the _ddl_sharding_key space, CRUD will use it for bucket ID calculation automatically. Note that the CRUD methods delete(), get() and update() require the sharding key to be a part of the primary key.
Starting from 0.11.0, you can specify the sharding function used to calculate bucket_id with a sharding func definition as a part of the DDL schema, or insert it manually into the space _ddl_sharding_func.
Automatic sharding key and function reload is supported since version 0.11.0. Version 0.11.0 contains a critical bug that causes some CRUD methods to fail with a "Sharding hash mismatch" error if ddl is set and bucket_id is provided explicitly (#278). Please upgrade to 0.11.1 instead.
CRUD uses strcrc32 as the sharding function by default. Using strcrc32 is undesirable because this sharding function is not consistent for cdata numbers. In particular, it returns 3 different values for normal Lua numbers like 123, for unsigned long long cdata (like 123ULL, or ffi.cast('unsigned long long', 123)), and for signed long long cdata (like 123LL, or ffi.cast('long long', 123)).
We cannot change the default sharding function strcrc32 due to backward compatibility concerns, but please consider using better alternatives for the sharding function. mpcrc32 is one of them.
The table below describes which operations support a custom sharding key:
CRUD method | Sharding key support |
---|---|
get() | Yes |
insert() / insert_object() | Yes |
delete() | Yes |
replace() / replace_object() | Yes |
upsert() / upsert_object() | Yes |
select() / pairs() | Yes |
count() | Yes |
update() | Yes |
min() / max() | No (not required) |
cut_rows() / cut_objects() | No (not required) |
truncate() | No (not required) |
len() | No (not required) |
Current limitations for using custom sharding key:
- primary_index_fieldno_map is not cached, see #243.
tarantool> require('crud')._VERSION
---
- 1.1.0
...
Use the _VERSION handle to check the installed module version. The handle was introduced in 1.1.0. If installed from master, _VERSION shows the last tagged version.
-- Insert tuple
local result, err = crud.insert(space_name, tuple, opts)
-- Insert object
local result, err = crud.insert_object(space_name, object, opts)
where:
- space_name (string) - name of the space to insert an object
- tuple / object (table) - tuple/object to insert
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - bucket_id (?number|cdata) - bucket ID
  - fields (?table) - field names for getting only a subset of fields
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - skip_nullability_check_on_flatten (?boolean) - option for insert_object only. false by default. Set this parameter to true if you want to allow setting null values to non-nullable fields, which can be useful if a non-nullable field value is generated by a sequence. Warning: there is no native support for sequences in sharded systems since each replicaset has its own sequence. If a sequence field is a part of the sharding key (which is true by default), choosing the bucket id is the sole responsibility of the developer
  - noreturn (?boolean) - suppress successfully processed tuple (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array containing one inserted row, error.
Example:
crud.insert('customers', {1, box.NULL, 'Elizabeth', 23})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Elizabeth', 23]
...
crud.insert_object('customers', {
id = 2, name = 'Elizabeth', age = 24,
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [2, 401, 'Elizabeth', 24]
...
-- Insert batch of tuples
local result, err = crud.insert_many(space_name, tuples, opts)
-- Insert batch of objects
local result, err = crud.insert_object_many(space_name, objects, opts)
where:
- space_name (string) - name of the space to insert an object
- tuples / objects (table) - array of tuples/objects to insert (at least one)
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - fields (?table) - field names for getting only a subset of fields
  - stop_on_error (?boolean) - stop on the first error and report the error for the failed operation along with errors for the tuples that were not processed, default is false
  - rollback_on_error (?boolean) - any failed operation leads to a rollback on the storage where the operation failed, and errors are reported for the tuples that were rolled back, default is false
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - skip_nullability_check_on_flatten (?boolean) - option for insert_object_many only. false by default. Set this parameter to true if you want to allow setting null values to non-nullable fields, which can be useful if a non-nullable field value is generated by a sequence. Warning: there is no native support for sequences in sharded systems since each replicaset has its own sequence. If a sequence field is a part of the sharding key (which is true by default), choosing the bucket id is the sole responsibility of the developer
  - noreturn (?boolean) - suppress successfully processed tuples (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array with inserted rows, array of errors.
Each error object can contain the field operation_data. In the examples below, it holds the tuple that the error relates to.
Right now CRUD cannot provide batch insert with full consistency. CRUD offers batch insert with partial consistency. That means that full consistency can be provided only on a single replicaset using box transactions.
Example:
crud.insert_many('customers', {
{1, box.NULL, 'Elizabeth', 23},
{2, box.NULL, 'Anastasia', 22},
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Elizabeth', 23]
- [2, 401, 'Anastasia', 22]
...
crud.insert_object_many('customers', {
{id = 3, name = 'Elizabeth', age = 24},
{id = 10, name = 'Anastasia', age = 21},
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [3, 2804, 'Elizabeth', 24]
- [10, 569, 'Anastasia', 21]
-- Partial success
local res, errs = crud.insert_object_many('customers', {
{id = 22, name = 'Alex', age = 34},
{id = 3, name = 'Anastasia', age = 22},
{id = 5, name = 'Sergey', age = 25},
})
---
res
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [5, 1172, 'Sergey', 25],
- [22, 655, 'Alex', 34],
#errs -- 1
errs[1].class_name -- InsertManyError
errs[1].err -- 'Duplicate key exists <...>'
errs[1].operation_data -- {3, 2804, 'Anastasia', 22}
...
-- Partial success with stop and rollback on error
-- stop_on_error = true, rollback_on_error = true
-- Two errors occur on one storage with rollback:
-- inserts stop at the first error on this storage,
-- and inserts made before the error are rolled back
local res, errs = crud.insert_object_many('customers', {
{id = 6, name = 'Alex', age = 34},
{id = 92, name = 'Artur', age = 29},
{id = 3, name = 'Anastasia', age = 22},
{id = 4, name = 'Sergey', age = 25},
{id = 9, name = 'Anna', age = 30},
{id = 71, name = 'Oksana', age = 29},
}, {
stop_on_error = true,
rollback_on_error = true,
})
---
res
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [4, 1161, 'Sergey', 25],
- [6, 1064, 'Alex', 34],
#errs -- 4
errs[1].class_name -- InsertManyError
errs[1].err -- 'Duplicate key exists <...>'
errs[1].operation_data -- {3, 2804, 'Anastasia', 22}
errs[2].class_name -- NotPerformedError
errs[2].err -- 'Operation with tuple was not performed'
errs[2].operation_data -- {9, 1644, "Anna", 30}
errs[3].class_name -- NotPerformedError
errs[3].err -- 'Operation with tuple was not performed'
errs[3].operation_data -- {71, 1802, "Oksana", 29}
errs[4].class_name -- NotPerformedError
errs[4].err -- 'Operation with tuple was rollback'
errs[4].operation_data -- {92, 2040, "Artur", 29}
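When handling the result of a batch call, it can help to split the error array by class. The sketch below is plain Lua and relies only on the class_name, err and operation_data fields shown in the example above; the helper name group_errors and the sample errs table are hypothetical.

```lua
-- Hypothetical helper: group batch errors by their class_name field.
local function group_errors(errs)
    local groups = {}
    for _, e in ipairs(errs or {}) do
        local name = e.class_name or 'unknown'
        groups[name] = groups[name] or {}
        table.insert(groups[name], e)
    end
    return groups
end

-- Sample error array shaped like the output above.
local errs = {
    {class_name = 'InsertManyError', err = 'Duplicate key exists',
     operation_data = {3, 2804, 'Anastasia', 22}},
    {class_name = 'NotPerformedError', err = 'Operation with tuple was not performed',
     operation_data = {9, 1644, 'Anna', 30}},
    {class_name = 'NotPerformedError', err = 'Operation with tuple was not performed',
     operation_data = {71, 1802, 'Oksana', 29}},
}

local groups = group_errors(errs)
for name, list in pairs(groups) do
    print(name, #list)
end
```

In the example above, tuples reported with NotPerformedError were either skipped or rolled back, so an application might choose to resubmit them after fixing the tuple that caused the first error.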
local result, err = crud.get(space_name, key, opts)
where:
- space_name (string) - name of the space
- key (any) - primary key value
- opts:
  - fields (?table) - field names for getting only a subset of fields
  - bucket_id (?number|cdata) - bucket ID
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - mode (?string, read or write) - if write is specified then get is performed on master, default value is read
  - prefer_replica (?boolean) - if true then the preferred target is one of the replicas
  - balance (?boolean) - use replica according to vshard load balancing policy
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array containing one row, error.
Example:
crud.get('customers', 1)
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Elizabeth', 23]
...
local result, err = crud.update(space_name, key, operations, opts)
where:
- space_name (string) - name of the space
- key (any) - primary key value
- operations (table) - update operations
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - bucket_id (?number|cdata) - bucket ID
  - fields (?table) - field names for getting only a subset of fields
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - noreturn (?boolean) - suppress successfully processed tuple (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array containing one updated row, error.
Example:
crud.update('customers', 1, {{'+', 'age', 1}})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Elizabeth', 24]
...
local result, err = crud.delete(space_name, key, opts)
where:
- space_name (string) - name of the space
- key (any) - primary key value
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - bucket_id (?number|cdata) - bucket ID
  - fields (?table) - field names for getting only a subset of fields
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - noreturn (?boolean) - suppress successfully processed tuple (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array containing one deleted row (empty for vinyl), error.
Example:
crud.delete('customers', 1)
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Elizabeth', 24]
-- Replace tuple
local result, err = crud.replace(space_name, tuple, opts)
-- Replace object
local result, err = crud.replace_object(space_name, object, opts)
where:
- space_name (string) - name of the space
- tuple / object (table) - tuple/object to insert, or to replace an existing one with
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - bucket_id (?number|cdata) - bucket ID
  - fields (?table) - field names for getting only a subset of fields
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - skip_nullability_check_on_flatten (?boolean) - option for replace_object only. false by default. Set this parameter to true if you want to allow setting null values to non-nullable fields, which can be useful if a non-nullable field value is generated by a sequence. Warning: there is no native support for sequences in sharded systems since each replicaset has its own sequence. If a sequence field is a part of the sharding key (which is true by default), choosing the bucket id is the sole responsibility of the developer
  - noreturn (?boolean) - suppress successfully processed tuple (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns inserted or replaced rows and metadata, or nil with an error.
Example:
crud.replace('customers', {1, box.NULL, 'Alice', 22})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Alice', 22]
...
crud.replace_object('customers', {
id = 1, name = 'Alice', age = 22,
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Alice', 22]
...
-- Replace batch of tuples
local result, err = crud.replace_many(space_name, tuples, opts)
-- Replace batch of objects
local result, err = crud.replace_object_many(space_name, objects, opts)
where:
- space_name (string) - name of the space to insert/replace an object
- tuples / objects (table) - array of tuples/objects to replace (at least one)
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - fields (?table) - field names for getting only a subset of fields
  - stop_on_error (?boolean) - stop on the first error and report the error for the failed operation along with errors for the tuples that were not processed, default is false
  - rollback_on_error (?boolean) - any failed operation leads to a rollback on the storage where the operation failed, and errors are reported for the tuples that were rolled back, default is false
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - skip_nullability_check_on_flatten (?boolean) - option for replace_object_many only. false by default. Set this parameter to true if you want to allow setting null values to non-nullable fields, which can be useful if a non-nullable field value is generated by a sequence. Warning: there is no native support for sequences in sharded systems since each replicaset has its own sequence. If a sequence field is a part of the sharding key (which is true by default), choosing the bucket id is the sole responsibility of the developer
  - noreturn (?boolean) - suppress successfully processed tuples (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array with inserted/replaced rows, array of errors.
Each error object can contain the field operation_data. In the examples below, it holds the tuple that the error relates to.
Right now CRUD cannot provide batch replace with full consistency. CRUD offers batch replace with partial consistency. That means that full consistency can be provided only on a single replicaset using box transactions.
Example:
crud.replace_many('developers', {
{1, box.NULL, 'Elizabeth', 'lizaaa'},
{2, box.NULL, 'Anastasia', 'iamnewdeveloper'},
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'login', 'type': 'string'}
rows:
- [1, 477, 'Elizabeth', 'lizaaa']
- [2, 401, 'Anastasia', 'iamnewdeveloper']
...
crud.replace_object_many('developers', {
{id = 1, name = 'Inga', login = 'mylogin'},
{id = 10, name = 'Anastasia', login = 'qwerty'},
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'login', 'type': 'string'}
rows:
- [1, 477, 'Inga', 'mylogin']
- [10, 569, 'Anastasia', 'qwerty']
-- Partial success
-- Let's say login has unique secondary index
local res, errs = crud.replace_object_many('developers', {
{id = 22, name = 'Alex', login = 'pushkinn'},
{id = 3, name = 'Anastasia', login = 'qwerty'},
{id = 5, name = 'Sergey', login = 's.petrenko'},
})
---
res
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'login', 'type': 'string'}
rows:
- [5, 1172, 'Sergey', 's.petrenko'],
- [22, 655, 'Alex', 'pushkinn'],
#errs -- 1
errs[1].class_name -- ReplaceManyError
errs[1].err -- 'Duplicate key exists <...>'
errs[1].operation_data -- {3, 2804, 'Anastasia', 'qwerty'}
-- Partial success with stop and rollback on error
-- stop_on_error = true, rollback_on_error = true
-- Two errors occur on one storage with rollback:
-- inserts stop at the first error on this storage,
-- and inserts made before the error are rolled back
local res, errs = crud.replace_object_many('developers', {
{id = 6, name = 'Alex', login = 'alexpushkin'},
{id = 92, name = 'Artur', login = 'AGolden'},
{id = 11, name = 'Anastasia', login = 'qwerty'},
{id = 4, name = 'Sergey', login = 's.smirnov'},
{id = 9, name = 'Anna', login = 'AnnaBlack'},
{id = 17, name = 'Oksana', login = 'OKonov'},
}, {
stop_on_error = true,
rollback_on_error = true,
})
res
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'login', 'type': 'string'}
rows:
- [4, 1161, 'Sergey', 's.smirnov'],
- [6, 1064, 'Alex', 'alexpushkin'],
#errs -- 4
errs[1].class_name -- ReplaceManyError
errs[1].err -- 'Duplicate key exists <...>'
errs[1].operation_data -- {11, 2652, "Anastasia", "qwerty"}
errs[2].class_name -- NotPerformedError
errs[2].err -- 'Operation with tuple was not performed'
errs[2].operation_data -- {9, 1644, "Anna", "AnnaBlack"}
errs[3].class_name -- NotPerformedError
errs[3].err -- 'Operation with tuple was not performed'
errs[3].operation_data -- {17, 2900, "Oksana", "OKonov"}
errs[4].class_name -- NotPerformedError
errs[4].err -- 'Operation with tuple was rollback'
errs[4].operation_data -- {92, 2040, "Artur", "AGolden"}
...
-- Upsert tuple
local result, err = crud.upsert(space_name, tuple, operations, opts)
-- Upsert object
local result, err = crud.upsert_object(space_name, object, operations, opts)
where:
- space_name (string) - name of the space
- tuple / object (table) - tuple/object to insert if there is no existing tuple which matches the key fields
- operations (table) - update operations to apply if there is an existing tuple which matches the key fields of the tuple
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - bucket_id (?number|cdata) - bucket ID
  - fields (?table) - field names for getting only a subset of fields
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - noreturn (?boolean) - suppress successfully processed tuple (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an empty array of rows, or nil with an error.
Example:
crud.upsert('customers',
{1, box.NULL, 'Alice', 22},
{{'+', 'age', 1}})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows: []
...
crud.upsert_object('customers',
{id = 1, name = 'Alice', age = 22},
{{'+', 'age', 1}})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows: []
...
-- Upsert batch of tuples
local result, err = crud.upsert_many(space_name, tuples_operation_data, opts)
-- Upsert batch of objects
local result, err = crud.upsert_object_many(space_name, objects_operation_data, opts)
where:
- space_name (string) - name of the space to insert an object
- tuples_operation_data / objects_operation_data (table) - array of tuples/objects to insert and update operations in format {{tuple_1, operation_1}, ..., {tuple_n, operation_n}} (at least one). If there is a tuple with a duplicate key, the existing tuple will be updated with the update operations
- opts:
  - timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  - fields (?table) - field names for getting only a subset of fields
  - stop_on_error (?boolean) - stop on the first error and report the error for the failed operation along with errors for the tuples that were not processed, default is false
  - rollback_on_error (?boolean) - any failed operation leads to a rollback on the storage where the operation failed, and errors are reported for the tuples that were rolled back, default is false
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - noreturn (?boolean) - suppress successfully processed tuples (first return value is nil). false by default
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array of errors.
Each error object can contain the field operation_data. In the examples below, it holds the tuple that the error relates to.
Right now CRUD cannot provide batch upsert with full consistency. CRUD offers batch upsert with partial consistency. That means that full consistency can be provided only on a single replicaset using box transactions.
Example:
crud.upsert_many('customers', {
{{1, box.NULL, 'Elizabeth', 23}, {{'+', 'age', 1}}},
{{2, box.NULL, 'Anastasia', 22}, {{'+', 'age', 2}, {'=', 'name', 'Oleg'}}}
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
...
crud.upsert_object_many('customers', {
{{id = 3, name = 'Elizabeth', age = 24}, {{'+', 'age', 1}}},
{{id = 10, name = 'Anastasia', age = 21}, {{'+', 'age', 2}}}
})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
-- Partial success
local res, errs = crud.upsert_object_many('customers', {
{{id = 22, name = 'Alex', age = 34}, {{'+', 'age', 12}}},
{{id = 3, name = 'Anastasia', age = 22}, {{'=', 'age', 'invalid type'}}},
{{id = 5, name = 'Sergey', age = 25}, {{'+', 'age', 10}}}
})
---
res
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
#errs -- 1
errs[1].class_name -- UpsertManyError
errs[1].err -- 'Tuple field 4 (age) type does not match one required by operation <...>'
errs[1].operation_data -- {3, 2804, 'Anastasia', 22}
...
-- Partial success with stop and rollback on error
-- stop_on_error = true, rollback_on_error = true
-- Two errors occur on one storage with rollback:
-- upserts stop at the first error on this storage,
-- and upserts made before the error are rolled back
local res, errs = crud.upsert_object_many('customers', {
{{id = 6, name = 'Alex', age = 34}, {{'+', 'age', 1}}},
{{id = 92, name = 'Artur', age = 29}, {{'+', 'age', 2}}},
{{id = 3, name = 'Anastasia', age = 22}, {{'+', 'age', '3'}}},
{{id = 4, name = 'Sergey', age = 25}, {{'+', 'age', 4}}},
{{id = 9, name = 'Anna', age = 30}, {{'+', 'age', 5}}},
{{id = 71, name = 'Oksana', age = 29}, {{'+', 'age', '6'}}},
}, {
stop_on_error = true,
rollback_on_error = true,
})
res
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
#errs -- 4
errs[1].class_name -- UpsertManyError
errs[1].err -- 'Duplicate key exists <...>'
errs[1].operation_data -- {3, 2804, 'Anastasia', 22}
errs[2].class_name -- NotPerformedError
errs[2].err -- 'Operation with tuple was not performed'
errs[2].operation_data -- {9, 1644, "Anna", 30}
errs[3].class_name -- NotPerformedError
errs[3].err -- 'Operation with tuple was not performed'
errs[3].operation_data -- {71, 1802, "Oksana", 29}
errs[4].class_name -- NotPerformedError
errs[4].err -- 'Operation with tuple was rollback'
errs[4].operation_data -- {92, 2040, "Artur", 29}
CRUD supports multi-conditional selects, treating a cluster as a single space.
The conditions may include field names, as well as index names.
(Refer to #352 for field number.)
The recommended first condition is a TREE index; this helps reduce the number of tuples to scan. Otherwise a full scan is performed.
local result, err = crud.select(space_name, conditions, opts)
where:
- space_name (string) - name of the space
- conditions (?table) - array of select conditions
- opts:
  - first (?number) - the maximum count of the objects to return. If a negative value is specified, the objects preceding the after tuple are returned (the after option is required in this case). See pagination examples.
  - after (?table) - tuple after which objects should be selected
  - batch_size (?number) - number of tuples to process per one request to storage
  - bucket_id (?number|cdata) - bucket ID
  - force_map_call (?boolean) - if true then the map call is performed without any optimizations even if a full primary key equal condition is specified
  - timeout (?number) - vshard.call timeout (in seconds)
  - fields (?table) - field names for getting only a subset of fields
  - fullscan (?boolean) - if true then a critical log entry will be skipped on potentially long select, see avoiding full scan.
  - mode (?string, read or write) - if write is specified then select is performed on master, default value is read
  - prefer_replica (?boolean) - if true then the preferred target is one of the replicas
  - balance (?boolean) - use replica according to vshard load balancing policy
  - vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  - yield_every (?number) - number of tuples processed on storage to yield after, yield_every should be > 0, default value is 1000
  - fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
Returns metadata and an array of rows, error.
Select conditions are very similar to Tarantool update operations.
Each condition is a table {operator, field-identifier, value}, where the operator is one of = (or ==), >, >=, <, <=.
Example:
crud.select('customers', {{'<=', 'age', 35}}, {first = 10})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [5, 1172, 'Jack', 35]
- [3, 2804, 'David', 33]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- [1, 477, 'Elizabeth', 12]
...
Note: tuples are sorted by age because the space has an age index.
Otherwise, tuples are sorted by primary key.
See more examples of select queries here.
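The first and after options described above can be combined to walk a large result set page by page. The following is only a sketch: select_all and its select_fn parameter are hypothetical names introduced here, and select_fn is assumed to behave like crud.select (it returns a table with a rows field, or nil and an error).

```lua
-- Collect every matching row, one page at a time.
-- select_fn is a hypothetical parameter: pass crud.select on a router.
local function select_all(select_fn, space_name, conditions, page_size)
    local rows = {}
    local after = nil
    while true do
        local res, err = select_fn(space_name, conditions, {
            first = page_size,
            after = after,
        })
        if err ~= nil then
            return nil, err
        end
        for _, row in ipairs(res.rows) do
            table.insert(rows, row)
        end
        -- A page shorter than page_size means nothing is left to fetch.
        if #res.rows < page_size then
            break
        end
        -- The last tuple of this page is the cursor for the next request.
        after = res.rows[#res.rows]
    end
    return rows
end
```

On a router this could be called as select_all(crud.select, 'customers', {{'<=', 'age', 35}}, 2). Passing the select function as an argument keeps the helper independent of a running cluster.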
You can iterate across a distributed space using the crud.pairs function.
Its arguments are the same as crud.select arguments, except that fullscan is not available (crud.pairs does not generate a critical log entry on potentially long requests) and negative first values aren't allowed.
You can pass the use_tomap flag (false by default) to iterate over objects instead of flat tuples.
Example:
local tuples = {}
for _, tuple in crud.pairs('customers', {{'<=', 'age', 35}}, {use_tomap = false}) do
-- {5, 1172, 'Jack', 35}
table.insert(tuples, tuple)
end
local objects = {}
for _, object in crud.pairs('customers', {{'<=', 'age', 35}}, {use_tomap = true}) do
-- {id = 5, name = 'Jack', bucket_id = 1172, age = 35}
table.insert(objects, object)
end
See more examples of pairs queries here.
CRUD supports operations to get the minimum (maximum) object from the space index:
local result, err = crud.min(space_name, index_id, opts)
local result, err = crud.max(space_name, index_id, opts)
where:
space_name (string) - name of the space
index_id (?string|number) - index name or index id. Primary index by default
opts:
  timeout (?number) - vshard.call timeout (in seconds)
  fields (?table) - field names for getting only a subset of fields
  mode (?string, read or write) - if write is specified then select is performed on master, default value is read
  vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  fetch_latest_metadata (?boolean) - guarantees the up-to-date metadata (space format) in first return value, otherwise it may not take into account the latest migration of the data format. Performance overhead is up to 15%. false by default
-- Find the minimum value in the specified index
local result, err = crud.min('customers', 'age')
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [1, 477, 'Elizabeth', 12]
-- Find the maximum value in the specified index
local result, err = crud.max('customers', 'age')
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [5, 1172, 'Jack', 35]
You can use the crud.cut_rows function to cut off scan key and primary key values that were merged into the select/pairs partial result (select/pairs with the fields option).
local res, err = crud.cut_rows(rows, metadata, fields)
where:
rows (table) - array of tuples for cutting
metadata (?table) - metadata about rows fields
fields (table) - field names of fields that should be contained in the result
Returns metadata and array of rows, error.
See more examples of crud.cut_rows usage here and here.
If you use pairs with the use_tomap flag and need to cut off scan key and primary key values that were merged into the pairs partial result (pairs with the fields option), use crud.cut_objects.
local new_objects = crud.cut_objects(objects, fields)
where:
objects (table) - array of objects for cutting
fields (table) - field names of fields that should be contained in the result
Returns array of objects.
See more examples of crud.cut_objects usage here.
-- Truncate space
local result, err = crud.truncate(space_name, opts)
where:
space_name (string) - name of the space
opts:
  timeout (?number) - vshard.call timeout (in seconds)
  vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
Returns true or nil with error.
Example:
#crud.select('customers', {{'<=', 'age', 35}}, {first = 10}).rows
---
- 1
...
crud.truncate('customers', {timeout = 2})
---
- true
...
#crud.select('customers', {{'<=', 'age', 35}}, {first = 10}).rows
---
- 0
...
-- Calculates the number of tuples in the space for memtx engine
-- Calculates the maximum approximate number of tuples in the space for vinyl engine
local result, err = crud.len(space_name, opts)
where:
space_name (string) - name of the space
opts:
  timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
Returns number or nil with error.
Using space id instead of space name is also possible, but deprecated and will be removed in future releases.
Using space id in crud.len and custom vshard_router is not supported by statistics: space labels may be inconsistent.
Example:
Using memtx:
#crud.select('customers', nil, {fullscan = true}).rows
---
- 5
...
crud.len('customers', {timeout = 2})
---
- 5
...
Using vinyl:
crud.len('customers')
---
- 0
...
crud.delete('customers', 1)
---
...
crud.len('customers')
---
- 1
...
-- Get storages status
local result, err = crud.storage_info(opts)
where:
opts:
  timeout (?number) - maximum time (in seconds, default: 2) to wait for response from cluster instances.
  vshard_router (?string|table) - Cartridge vshard group name or vshard router instance.
Returns storages status table by instance UUID or nil with error. Status table fields:
status contains a string representing the status:
  "running" - storage is initialized and running.
  "uninitialized" - storage is not initialized or disabled.
  "error" - error getting the status from a storage. Connection error, for example.
is_master is true if an instance is a master, false otherwise.
message is nil unless a problem occurs with getting storage status.
Example:
crud.storage_info()
---
- fe1b5bd9-42d4-4955-816c-3aa015e0eb81:
    status: running
    is_master: true
  a1eefe51-9869-4c4c-9676-76431b08c97a:
    status: running
    is_master: true
  777415f4-d656-440e-8834-7124b7267b6d:
    status: uninitialized
    is_master: false
  e1b2e202-b0f7-49cd-b0a2-6b3a584f995e:
    status: error
    message: 'connect, called on fd 36, aka 127.0.0.1:49762: Connection refused'
    is_master: false
...
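A status table like the one above can be reduced to a single readiness check. This is a minimal sketch; all_storages_running is a hypothetical helper name, and it relies only on the status field documented above.

```lua
-- Returns true when every storage reports status "running".
-- The second return value lists the UUIDs that are not ready, sorted.
local function all_storages_running(info)
    local not_ready = {}
    for uuid, storage in pairs(info) do
        if storage.status ~= 'running' then
            table.insert(not_ready, uuid)
        end
    end
    table.sort(not_ready)
    return #not_ready == 0, not_ready
end
```

On a router the input would come from local info, err = crud.storage_info({timeout = 2}), with err checked before calling the helper.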
CRUD supports multi-conditional count, treating a cluster as a single space.
As with select(), the conditions may include field names or numbers,
as well as index names. The recommended first condition is a TREE index; this
helps to reduce the number of tuples to scan. Otherwise a full scan is performed.
Unlike len(), the count() method scans the entire space to count the
tuples that match the user conditions. This method yields, so the result may
be approximate. The number of tuples processed before the next yield() is
controlled by the yield_every option.
local result, err = crud.count(space_name, conditions, opts)
where:
space_name (string) - name of the space
conditions (?table) - array of conditions
opts:
  yield_every (?number) - number of tuples processed to yield after, yield_every should be > 0, default value is 1000
  timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  bucket_id (?number|cdata) - bucket ID
  force_map_call (?boolean) - if true then the map call is performed without any optimizations, default value is false
  fullscan (?boolean) - if true then a critical log entry will be skipped on potentially long count, see avoiding full scan.
  mode (?string, read or write) - if write is specified then count is performed on master, default value is read
  prefer_replica (?boolean) - if true then the preferred target is one of the replicas, default value is false
  balance (?boolean) - use replica according to vshard load balancing policy, default value is false
  vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
crud.count('customers', {{'==', 'age', 35}})
---
- 1
...
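Combinations of the mode, prefer_replica and balance options lead to:
if mode is write, the method is performed on master with the vshard call callrw
if mode is read, the vshard call depends on prefer_replica and balance:
| prefer_replica | balance | vshard call |
|----------------|---------|-------------|
| false          | false   | callro      |
| false          | true    | callbro     |
| true           | false   | callre      |
| true           | true    | callbre     |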
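The option combinations above can be written out as a small lookup. This is an illustrative sketch, not the code crud uses internally: vshard_call_name is a hypothetical helper, and the read-mode mapping assumes the usual vshard naming, where "b" stands for balancing and "e" for preferring a replica.

```lua
-- Map crud read options to the name of the vshard router call.
-- mode defaults to 'read'; prefer_replica and balance default to false.
local function vshard_call_name(opts)
    opts = opts or {}
    if opts.mode == 'write' then
        return 'callrw'
    end
    if opts.prefer_replica and opts.balance then
        return 'callbre'
    elseif opts.prefer_replica then
        return 'callre'
    elseif opts.balance then
        return 'callbro'
    end
    return 'callro'
end
```

For example, vshard_call_name({mode = 'read', balance = true}) yields 'callbro', while any combination with mode = 'write' yields 'callrw'.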
crud routers can provide statistics on called operations.
-- Enable statistics collect.
crud.cfg{ stats = true }
-- Returns table with statistics information.
crud.stats()
-- Returns table with statistics information for specific space.
crud.stats('my_space')
-- Disables statistics collect and destroys all collectors.
crud.cfg{ stats = false }
-- Destroys all statistics collectors and creates them again.
crud.reset_stats()
If metrics 0.10.0 or greater is found, metrics collectors are used by default
to store statistics instead of local collectors. Quantiles in metrics summary
collections are disabled by default. You can manually choose the driver and
enable quantiles.
-- Use simple local collectors (default if no required metrics version found).
crud.cfg{ stats = true, stats_driver = 'local' }
-- Use metrics collectors (default if metrics rock found).
crud.cfg{ stats = true, stats_driver = 'metrics' }
-- Use metrics collectors with 0.99 quantiles.
crud.cfg{ stats = true, stats_driver = 'metrics', stats_quantiles = true }
You can use crud.cfg to check the current stats state.
crud.cfg
---
- stats_quantiles: true
stats: true
stats_driver: metrics
...
Performance overhead is 3-10% in case of the local driver and
5-15% in case of the metrics driver, up to 20% for metrics with quantiles.
Beware that iterating through crud.cfg with pairs is not supported yet,
refer to tarantool/crud#265.
Format is as follows.
crud.stats()
---
- spaces:
    my_space:
      insert:
        ok:
          latency: 0.0015
          latency_average: 0.002
          latency_quantile_recent: 0.0015
          count: 19800
          time: 39.6
        error:
          latency: 0.0000008
          latency_average: 0.000001
          latency_quantile_recent: 0.0000008
          count: 4
          time: 0.000004
...
crud.stats('my_space')
---
- insert:
    ok:
      latency: 0.0015
      latency_average: 0.002
      latency_quantile_recent: 0.0015
      count: 19800
      time: 39.6
    error:
      latency: 0.0000008
      latency_average: 0.000001
      latency_quantile_recent: 0.0000008
      count: 4
      time: 0.000004
...
The spaces section contains statistics for each observed space.
If an operation has never been called for a space, the corresponding
field will be empty. If no requests have been called for a
space, it will not be represented. Space data is based on
client requests rather than storages schema, so requests
for non-existing spaces are also collected.
Possible statistics operation labels are
insert (for insert and insert_object calls),
get, replace (for replace and replace_object calls), update,
upsert (for upsert and upsert_object calls), delete,
select (for select and pairs calls), truncate, len, count
and borders (for min and max calls).
Each operation section consists of different collectors
for success calls and error (both error throw and nil, err) returns.
count is the total requests count since instance start or stats restart.
time is the total time of requests execution.
latency_average is time / count.
latency_quantile_recent is the 0.99 quantile of request execution
time for a recent period (see metrics summary API).
It is computed only if the metrics driver is used and quantiles are
enabled. The latency_quantile_recent value may be -nan if there
weren't any observations for several ages, see tarantool/metrics#303.
latency is latency_quantile_recent if the metrics driver is used
and quantiles are enabled, otherwise it's latency_average.
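The relationship between these fields can be spelled out in a few lines. This is only a sketch of the rules stated above, not the collector implementation: derive_latency and its quantiles_enabled argument are hypothetical names.

```lua
-- Derive latency_average and latency from a collector's raw fields.
-- collector has count and time, and optionally latency_quantile_recent.
local function derive_latency(collector, quantiles_enabled)
    local average = 0
    if collector.count > 0 then
        average = collector.time / collector.count
    end
    -- latency falls back to the average unless a quantile is available.
    local latency = average
    if quantiles_enabled and collector.latency_quantile_recent ~= nil then
        latency = collector.latency_quantile_recent
    end
    return {latency_average = average, latency = latency}
end
```

With count = 19800 and time = 39.6, as in the sample output, the average comes to roughly 0.002 seconds per request.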
In the metrics registry, statistics are stored as tnt_crud_stats metrics
with operation, status and name labels.
metrics:collect()
---
- - label_pairs:
      status: ok
      operation: insert
      name: customers
    value: 221411
    metric_name: tnt_crud_stats_count
  - label_pairs:
      status: ok
      operation: insert
      name: customers
    value: 10.49834896344692
    metric_name: tnt_crud_stats_sum
  - label_pairs:
      status: ok
      operation: insert
      name: customers
      quantile: 0.99
    value: 0.00023606420935973
    metric_name: tnt_crud_stats
...
If you see a -Inf value in quantile metrics, try to decrease the tolerated error:
crud.cfg{stats_quantile_tolerated_error = 1e-4}
See tarantool/metrics#189 for
details about the issue.
You can also configure quantile age_buckets_count (default: 2) and
max_age_time (in seconds, default: 60):
crud.cfg{
    stats_quantile_age_buckets_count = 3,
    stats_quantile_max_age_time = 30,
}
See metrics summary API for details. These parameters can be used to smooth
the time window move or reduce the amount of -nan gaps for low request
frequency applications.
The select section additionally contains details collectors.
crud.stats('my_space').select.details
---
- map_reduces: 4
  tuples_fetched: 10500
  tuples_lookup: 238000
...
map_reduces is the count of planned map reduces (including those not
executed successfully). tuples_fetched is the count of tuples fetched
from storages during execution, tuples_lookup is the count of tuples
looked up on storages while collecting responses for calls (including
scrolls for multibatch requests). Details data is updated as part of
the request process, so you may get new details before the select/pairs
call is finished and observed with count, latency and time collectors.
In the metrics registry they are stored as tnt_crud_map_reduces,
tnt_crud_tuples_fetched and tnt_crud_tuples_lookup metrics
with { operation = 'select', name = space_name } labels.
Since pairs request behavior differs from any other crud request, its
statistics collection also has specific behavior. Statistics (the select
section) are updated after the pairs cycle is finished: either you
have iterated through all records or an error was thrown.
If your pairs cycle was interrupted with break, statistics will
be collected when the pairs objects are cleaned up by the Lua garbage
collector.
Statistics are preserved between package reloads. Statistics are also preserved
between Tarantool Cartridge role reloads if you use CRUD Cartridge roles.
Beware that metrics 0.12.0 and below do not support preserving stats between
role reloads (see tarantool/metrics#334), thus this feature is unsupported
for the metrics driver.
A read view is an in-memory snapshot of data on an instance that isn’t affected by future data modifications. Read views allow you to retrieve data using the read_view_object:select() and read_view_object:pairs() operations.
Read views can be used to make complex analytical queries. This reduces the load on the main database and improves RPS for a single Tarantool instance.
Read views have the following limitations:
To create a read view, call the crud.readview() function.
local rv = crud.readview(opts)
where:
opts:
  name (?string) - name of the read view
  timeout (?number) - vshard.call timeout (in seconds)
Example:
local rv = crud.readview({name = 'foo', timeout = 3})
When a read view is no longer needed, close it using the read_view_object:close() method because a read view may consume a substantial amount of memory.
local rv = crud.readview()
rv:close(opts)
where:
opts:
  timeout (?number) - vshard.call timeout (in seconds)
A read view is also closed implicitly when the read view object is collected by the Lua garbage collector.
Example:
local rv = crud.readview()
rv:close({timeout = 3})
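Because an unclosed read view keeps consuming memory until it is garbage collected, it can help to wrap its use so that close() runs even when the work fails. This is a minimal sketch: with_readview is a hypothetical helper, the crud module is passed in as an argument, and crud.readview is assumed to return nil plus an error on failure, like the other crud calls.

```lua
-- Open a read view, run fn(rv), and always close the read view afterwards.
-- crud_module is the crud module table (hypothetical parameter name).
local function with_readview(crud_module, opts, fn)
    local rv, err = crud_module.readview(opts)
    if rv == nil then
        return nil, err
    end
    -- pcall keeps an error inside fn from skipping the close call.
    local ok, res = pcall(fn, rv)
    rv:close()
    if not ok then
        return nil, res
    end
    return res
end
```

A call could look like with_readview(crud, {name = 'foo'}, function(rv) return rv:select('customers', nil, {fullscan = true}) end).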
read_view_object:select() supports multi-conditional selects, treating a cluster as a single space, same as crud.select.
local rv = crud.readview()
local result, err = rv:select(space_name, conditions, opts)
rv:close()
Opts are the same as select opts, except that balance, prefer_replica and mode are not supported.
Returns metadata and array of rows, error.
Example:
local rv = crud.readview()
rv:select('customers', nil, {batch_size=1, fullscan=true})
---
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
rows:
- [1, 477, 'Elizabeth', 12]
- [2, 401, 'Mary', 46]
- [3, 2804, 'David', 33]
- [4, 1161, 'William', 81]
- [5, 1172, 'Jack', 35]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- null
...
crud.insert('customers', {8, box.NULL, 'Elizabeth', 23})
---
- rows:
- [8, 185, 'Elizabeth', 23]
metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
- null
...
rv:select('customers', nil, {batch_size=1, fullscan=true})
---
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
rows:
- [1, 477, 'Elizabeth', 12]
- [2, 401, 'Mary', 46]
- [3, 2804, 'David', 33]
- [4, 1161, 'William', 81]
- [5, 1172, 'Jack', 35]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- null
...
rv:close()
Select conditions for read_view_object:select() are the same as select conditions for crud.select.
Example:
rv = crud.readview()
rv:select('customers', {{'<=', 'age', 35}}, {first = 10})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [5, 1172, 'Jack', 35]
- [3, 2804, 'David', 33]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- [1, 477, 'Elizabeth', 12]
...
rv:close()
You can iterate across a distributed space using the read_view_object:pairs() method.
Its arguments are the same as read_view_object:select() arguments, except that fullscan is not available (pairs does not generate a critical log entry on potentially long requests) and negative first values aren't allowed.
You can pass the use_tomap flag (false by default) to iterate over objects instead of flat tuples.
Example:
rv = crud.readview()
local tuples = {}
for _, tuple in rv:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = false}) do
-- {5, 1172, 'Jack', 35}
table.insert(tuples, tuple)
end
local objects = {}
for _, object in rv:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = true}) do
-- {id = 5, name = 'Jack', bucket_id = 1172, age = 35}
table.insert(objects, object)
end
rv:close()
crud routers provide an API to introspect the schema of spaces.
local schema, err = crud.schema(space_name, opts)
where:
space_name (?string) - name of the space (if nil, provides info for all spaces)
opts:
  timeout (?number) - vshard.call timeout and vshard master discovery timeout (in seconds), default value is 2
  vshard_router (?string|table) - Cartridge vshard group name or vshard router instance. Set this parameter if your space is not a part of the default vshard cluster
  cached (?boolean) - if false, reloads storages schema on call; if true, returns the last known schema; default value is false. Beware that consecutive calls with cached=true do not guarantee the same result if the schema has changed, since net.box connections may still perform a reload on internal ping or any other request
Returns space schema (or spaces schema map), error.
Beware that schema info is not exactly the same as the underlying storage spaces schema.
The reason is that crud generates bucket_id if it isn't provided,
so this field is actually nullable for a crud user. We also do not expose
bucket_id index info since it's a vshard utility and is not related
to application logic.
Example:
crud.schema('customers')
---
- format:
  - name: id
    type: unsigned
  - name: bucket_id
    type: unsigned
    is_nullable: true
  - name: name
    type: string
  - name: age
    type: number
  indexes:
    0:
      unique: true
      parts:
      - fieldno: 1
        type: unsigned
        exclude_null: false
        is_nullable: false
      id: 0
      type: TREE
      name: primary_index
    2:
      unique: false
      parts:
      - fieldno: 4
        type: number
        exclude_null: false
        is_nullable: false
      id: 2
      type: TREE
      name: age
...
crud.schema()
---
- customers:
    format: ...
    indexes: ...
  shops:
    format: ...
    indexes: ...
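The schema table can be processed like any other Lua table, for instance to list the index names available as select conditions. This is a sketch under the assumption that indexes is keyed by numeric index id, as the output above suggests; index_names is a hypothetical helper.

```lua
-- Return the index names of one space schema, ordered by index id.
local function index_names(space_schema)
    local ids = {}
    for id in pairs(space_schema.indexes) do
        table.insert(ids, id)
    end
    -- Ids are not contiguous (bucket_id index is hidden), so sort them.
    table.sort(ids)
    local names = {}
    for _, id in ipairs(ids) do
        table.insert(names, space_schema.indexes[id].name)
    end
    return names
end
```

On a router the argument would be the first value returned by crud.schema('customers'), after checking the error.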
roles.crud-storage is a Tarantool 3 role that initializes functions that
are used on the storage side to perform CRUD operations. The role must be enabled
on sharding storages.
roles.crud-router is a role that exposes public crud functions
to the global scope so that you can call them via net.box or with connectors.
The role must be enabled on sharding routers.
Roles support Tarantool 3.0.2, Tarantool 3.1.0 and newer. Older versions are not supported due to tarantool/tarantool#9643 and tarantool/tarantool#9649 issues.
Add crud to dependencies in the project rockspec.
Note: it's better to use a tagged version than scm-1.
Check the latest available release tag and use it.
-- <project-name>-scm-1.rockspec
dependencies = {
...
'crud == <the-latest-tag>-1',
...
}
Add crud roles to your application configuration.
The application must be a sharded one.
It is required that roles.crud-storage is enabled on each sharding storage.
groups:
  routers:
    sharding:
      roles:
        - router
    roles:
      - roles.crud-router
    replicasets:
      router:
  storages:
    sharding:
      roles:
        - storage
    roles:
      - roles.crud-storage
    replicasets:
      s-1:
      s-2:
Bootstrap vshard routers (for example, through the app.module section
in Tarantool 3 routers configuration).
-- myrouter.lua
local clock = require('clock')
local fiber = require('fiber')
local log = require('log')
local vshard = require('vshard')
local TIMEOUT = 60
local DELAY = 0.1
local start = clock.monotonic()
while clock.monotonic() - start < TIMEOUT do
    local ok, err = vshard.router.bootstrap({
        if_not_bootstrapped = true,
    })
    if ok then
        break
    end
    log.info(('Router bootstrap error: %s'):format(err))
    fiber.sleep(DELAY)
end
Set up your schema on storages (for example, through the app.module section
in Tarantool 3 storages configuration).
-- mystorage.lua
-- Schema setup is idempotent.
box.watch('box.status', function()
    if box.info.ro then
        return
    end
    local customers_space = box.schema.space.create('customers', {
        format = {
            {name = 'id', type = 'unsigned'},
            {name = 'bucket_id', type = 'unsigned'},
            {name = 'name', type = 'string'},
            {name = 'age', type = 'number'},
        },
        if_not_exists = true,
    })
    customers_space:create_index('id', {
        parts = { {field ='id', is_nullable = false} },
        if_not_exists = true,
    })
    customers_space:create_index('bucket_id', {
        parts = { {field ='bucket_id', is_nullable = false} },
        if_not_exists = true,
    })
    customers_space:create_index('age', {
        parts = { {field ='age'} },
        unique = false,
        if_not_exists = true,
    })
end)
Start the application cluster. You can check whether asynchronous bootstrap
has finished through crud.storage_info() calls on the router.
Configure the statistics with roles configuration
(see crud.cfg options in statistics section):
roles:
- roles.crud-router
roles_cfg:
  roles.crud-router:
    stats: true
    stats_driver: metrics
    stats_quantiles: true
    stats_quantile_tolerated_error: 0.001
    stats_quantile_age_buckets_count: 5
    stats_quantile_max_age_time: 180
Now your cluster contains storages that are configured to be used for CRUD-operations. You can simply call CRUD functions on the router to insert, select, and update data across the cluster.
cartridge.roles.crud-storage is a Tarantool Cartridge role that depends on the
vshard-storage role, but also initializes functions that
are used on the storage side to perform CRUD operations.
cartridge.roles.crud-router is a role that depends on the
vshard-router role, but also exposes public crud functions in the global
scope, so that you can call them via net.box.
Add crud to dependencies in the project rockspec.
Note: it's better to use a tagged version than scm-1.
Check the latest available release tag and use it.
-- <project-name>-scm-1.rockspec
dependencies = {
...
'crud == <the-latest-tag>-1',
...
}
Create the role that stores your data and depends on crud-storage.
-- app.roles.customers-storage.lua
local cartridge = require('cartridge')
return {
    role_name = 'customers-storage',
    init = function()
        local customers_space = box.schema.space.create('customers', {
            format = {
                {name = 'id', type = 'unsigned'},
                {name = 'bucket_id', type = 'unsigned'},
                {name = 'name', type = 'string'},
                {name = 'age', type = 'number'},
            },
            if_not_exists = true,
        })
        customers_space:create_index('id', {
            parts = { {field ='id', is_nullable = false} },
            if_not_exists = true,
        })
        customers_space:create_index('bucket_id', {
            parts = { {field ='bucket_id', is_nullable = false} },
            if_not_exists = true,
        })
        customers_space:create_index('age', {
            parts = { {field ='age'} },
            unique = false,
            if_not_exists = true,
        })
    end,
    dependencies = {'cartridge.roles.crud-storage'},
}
-- app.roles.customers-router.lua
local cartridge = require('cartridge')
return {
    role_name = 'customers-router',
    dependencies = {'cartridge.roles.crud-router'},
}
Start the application and create customers-storage and
customers-router replica sets.
Don't forget to bootstrap vshard.
Configure the statistics with clusterwide configuration
(see crud.cfg options in statistics section):
crud:
  stats: true
  stats_driver: metrics
  stats_quantiles: true
  stats_quantile_tolerated_error: 0.001
  stats_quantile_age_buckets_count: 5
  stats_quantile_max_age_time: 180
Now your cluster contains storages that are configured to be used for CRUD-operations. You can simply call CRUD functions on the router to insert, select, and update data across the cluster.
BSD-2-Clause. See the LICENSE file.