Open sanikolaev opened 1 week ago
The HTTP client request carries the header `Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=`, where `dXNlcm5hbWU6cGFzc3dvcmQ=` is the base64-encoded `username:password`.
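As an illustration of how such a header could be decoded on the receiving side, here is a minimal Python sketch. The function name `parse_basic_auth` is hypothetical and is not part of the daemon; it only shows the decoding step under the assumption that credentials are UTF-8.

```python
import base64
import binascii


def parse_basic_auth(header_value):
    """Return (username, password) from an Authorization header value, or None."""
    scheme, _, payload = header_value.partition(" ")
    if scheme.lower() != "basic" or not payload:
        return None
    try:
        decoded = base64.b64decode(payload.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    # Split on the first colon only: the password itself may contain colons.
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password


print(parse_basic_auth("Basic dXNlcm5hbWU6cGFzc3dvcmQ="))
```

Any header that is not well-formed Basic auth yields `None`, so the caller can reject the request before looking up the user.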
The MySQL client supports basic authorization prior to version 8. The MySQL interface code already has some basic auth code and needs a password check added. The HTTP interface has no such code, so it needs to be implemented.
The `auth_passed` flag, along with the `user`, will be stored:
User login request > the daemon replies with the access token. The access token (or session token) allows the daemon to identify the user.
After the user logs out, all tokens are invalidated. Tokens are also invalidated after a period of time.
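The token lifecycle described above (issue on login, invalidate on logout, expire after a period of time) could be sketched as follows. This is only an illustration: the `TokenStore` class, the one-hour default lifetime, and the choice to invalidate only the tokens of the user who logs out are all assumptions, not decisions taken in this issue.

```python
import secrets
import time


class TokenStore:
    """In-memory token -> user map with logout and time-based invalidation."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds          # assumed lifetime, not from the proposal
        self._clock = clock              # injectable so expiry can be tested
        self._tokens = {}                # token -> (user, issued_at)

    def login(self, user):
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user, self._clock())
        return token

    def user_for(self, token):
        """Return the user owning a live token, or None if unknown or expired."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user, issued_at = entry
        if self._clock() - issued_at >= self._ttl:
            del self._tokens[token]
            return None
        return user

    def logout(self, user):
        """Drop every token that belongs to the given user."""
        for token in [t for t, (u, _) in self._tokens.items() if u == user]:
            del self._tokens[token]
```

A request carrying a token would call `user_for(token)` first and continue to the rule checks only when a user is returned.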
After login, the HTTP client request carries the header `Authorization: Bearer <token>`.
The `token` is stored:
All requests from all interfaces, SphinxQL (SSL) \ HTTPS, should map into a pair of (request type, table name).
Request types are, for example: `read` (select, meta, call), `write` (insert \ replace \ update \ delete \ bulk), `management` (create table, drop table, set var).
The `user` has an allowed list of pairs `(request types, table names)` that is checked for every request via direct matching or RE2 rules.
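To make the mapping concrete, here is a small Python sketch that turns a statement name and a table name into such a pair. The statement names come from the list above; the helper names `request_type` and `request_pair` and the lowercase normalization are assumptions made for illustration.

```python
# Request types and their statements, as listed in the proposal.
REQUEST_TYPES = {
    "read": {"select", "meta", "call"},
    "write": {"insert", "replace", "update", "delete", "bulk"},
    "management": {"create table", "drop table", "set var"},
}


def request_type(statement):
    """Return the request type for a statement name, or None if it is unknown."""
    normalized = statement.strip().lower()
    for req_type, statements in REQUEST_TYPES.items():
        if normalized in statements:
            return req_type
    return None


def request_pair(statement, table):
    """Return the (request type, table name) pair, or None for unknown statements."""
    req_type = request_type(statement)
    if req_type is None:
        return None
    return (req_type, table)


print(request_pair("SELECT", "products"))
```

Returning `None` for unknown statements lets the caller reject anything that cannot be classified instead of letting it through by default.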
The flow after the daemon gets a client request with basic authorization:
1) `req` > the daemon checks for `user:password` in the `req` > `users[user]` gets the `user` and checks that `user.password` matches `req.password`
2) `req` > the daemon gets the `(request type, table name)` pair
3) `user_rules_map[user]` gets the allowed list, which is checked against the `(request type, table name)` pair with direct comparison or RE2 rule matching, such that:
3.1) any match allows the request to be processed further as usual
3.2) if no entry matches, the request is rejected with a proper error message and error code
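The steps above can be sketched in Python as follows. This is a rough illustration under several assumptions: Python's `re` module stands in for RE2, passwords are kept in plain text only for brevity, the sample users and rules are made up, and the 401 \ 403 codes are placeholders for whatever error codes the daemon ends up using.

```python
import hmac
import re

# Hypothetical sample data; names mirror users[user] and user_rules_map[user].
USERS = {"alice": {"password": "s3cret"}}
USER_RULES_MAP = {"alice": [("read", r".*"), ("write", r"logs_.*")]}


class AuthError(Exception):
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def authorize_basic(user, password, req_type, table):
    """Raise AuthError unless the user is authenticated and the pair is allowed."""
    # Step 1: look up the user and compare passwords in constant time.
    record = USERS.get(user)
    expected = record["password"] if record else ""
    if record is None or not hmac.compare_digest(expected.encode(), password.encode()):
        raise AuthError(401, "authentication failed")
    # Steps 2-3: check the (request type, table name) pair against the allowed list.
    for allowed_type, table_pattern in USER_RULES_MAP.get(user, []):
        if allowed_type in ("*", req_type) and re.fullmatch(table_pattern, table):
            return  # 3.1: any match lets the request proceed
    # 3.2: nothing matched, reject with an error message and code
    raise AuthError(403, f"user '{user}' is not allowed to {req_type} table '{table}'")
```

For example, `authorize_basic("alice", "s3cret", "write", "logs_2024")` returns normally, while the same call for table `products` raises an `AuthError` with code 403.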
The flow after the daemon gets a client request with token authorization:
1) `req` > the daemon checks for the `token` in the `req` > `active_users_map[token]` gets the `user` and checks that the `(user, token)` pair is still within its validity window, i.e. its age is less than the invalidation time
2) `req` > the daemon gets the `(request type, table name)` pair
3) `user_rules_map[user]` gets the allowed list, which is checked against the `(request type, table name)` pair with direct comparison or RE2 rule matching, such that:
3.1) any match allows the request to be processed further as usual
3.2) if no entry matches, the request is rejected with a proper error message and error code
I suggest rejecting all requests without SSL support, such as HTTP \ SphinxQL \ API requests, if user management is enabled at the daemon, or asking the user to keep them behind a firewall \ NAT along with the Galera interface.
We could let requests through without any checks via the `_vip` interfaces.
The daemon could let Buddy requests through:
- as a user with `{"request type":"*", "table name":"*"}` and an SSL-generated token, checking that the token passed into Buddy on Buddy start matches that user
- by the `user-agent:Manticore Buddy` header; however, any client sending such a header would bypass all checks
Concerns:
It is not clear how to authorize Buddy or master - agent API requests:
- `searchd.int_ssl_*` options to make sure that master - agent and Buddy requests are encrypted with SSL and replies can be decrypted via the same key, and allow the user to change it.
- `user:password` along with `searchd.int_user`, `searchd.int_password` options, passed into Buddy on Buddy start so that Buddy uses it for all requests into the daemon; use the same for master - agent communication and allow the user to change it.
It is not clear how to allow Buddy to perform all requests but keep the user away from certain tables, i.e. with the rule `(?!system_table$).*` the user could query all tables but not the `system_table`.
The query to the daemon `select * from system_table` will fail due to the failed table access rule. However, if the user adds an option, as in `select * from system_table option fuzzy=1`, query parsing fails at the daemon and the raw query text gets routed to Buddy; Buddy could fix the query, issue it, and return the result to the daemon, and the daemon then returns the result to the client.
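The exclusion rule `(?!system_table$).*` mentioned above relies on a negative lookahead. The sketch below checks its behaviour with Python's `re` module, which is used here only as a stand-in. Note that RE2 itself does not support lookahead assertions, so with RE2 the same effect would need something like the separate allow \ deny pair shown in the second helper; that helper and its parameter names are assumptions.

```python
import re

RULE = re.compile(r"(?!system_table$).*")


def table_allowed(name):
    """Full-match the table name against the negative-lookahead rule."""
    return RULE.fullmatch(name) is not None


def table_allowed_without_lookahead(name, allow=r".*", deny=r"system_table"):
    """Equivalent check using an allow pattern plus a deny pattern (no lookahead)."""
    return re.fullmatch(allow, name) is not None and re.fullmatch(deny, name) is None


for table in ("products", "system_table", "system_table2"):
    print(table, table_allowed(table), table_allowed_without_lookahead(table))
```

Both helpers allow `products` and `system_table2` and reject only the exact name `system_table`.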
I think the hash of `user, password, allowed request types` should be stored in `manticore.conf` for static config or in `manticore.json` for RT mode. All changes to that hash (adding or deleting a user, role, or rule) should be flushed to `manticore.json`.
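One possible way to flush such a user map to a JSON file on every change is sketched below. It is only an illustration: the key names in the sample data are hypothetical, and the sketch writes a standalone file rather than the real `manticore.json`, which holds other settings as well. The temporary file plus `os.replace` is used so that a crash in the middle of a flush does not leave a half-written file.

```python
import json
import os
import tempfile


def flush_auth(path, auth):
    """Write the auth map to `path` by replacing the file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(auth, tmp, indent=2, sort_keys=True)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if writing or replacing failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_auth(path):
    """Read the auth map back from `path`."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A change such as adding a user would update the in-memory map first and then call `flush_auth` with the whole map.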
I pushed the branch `req_regex`, which adds matching of 100 regex patterns for every search query once it is enabled via
mysql -h 127.0.0.1 -P 9306 -e "set global regex=1"
and I see that the loop matching every regex pattern against the SphinxQL statement text adds about 1 ms on the first invocation and about 0.1 ms on all subsequent invocations in this mode.
I tested short queries up to 128 bytes
mysql -h0 -P 9306 -e "SELECT id, uuid_short() as i101, uuid_short() as i102, uuid_short() as i103, uuid_short() as i104 from name order by id asc;"
along with large queries up to 1kb
mysql -h0 -P 9306 -e "SELECT id, uuid_short() as i201, uuid_short() as i202, uuid_short() as i203, uuid_short() as i204
, uuid_short() as i101, uuid_short() as i102, uuid_short() as i103, uuid_short() as i104
, uuid_short() as i111, uuid_short() as i112, uuid_short() as i113, uuid_short() as i114
, uuid_short() as i121, uuid_short() as i122, uuid_short() as i123, uuid_short() as i124
, uuid_short() as i131, uuid_short() as i132, uuid_short() as i133, uuid_short() as i134
, uuid_short() as i141, uuid_short() as i142, uuid_short() as i143, uuid_short() as i144
, uuid_short() as i151, uuid_short() as i152, uuid_short() as i153, uuid_short() as i154
, uuid_short() as i161, uuid_short() as i162, uuid_short() as i163, uuid_short() as i164
, uuid_short() as i171, uuid_short() as i172, uuid_short() as i173, uuid_short() as i174
, uuid_short() as i181, uuid_short() as i182, uuid_short() as i183, uuid_short() as i184
from name order by id asc; "
The timing is logged into `searchd.log` after the search finishes, as:
[Mon Nov 18 16:35:54.245 2024] [2552564] regex patterns check: 100, took: 1.054 ms
[Mon Nov 18 16:35:54.245 2024] [2552564] regex matched
[Mon Nov 18 16:38:33.945 2024] [2552579] regex patterns check: 100, took: 0.153 ms
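For readers who want to reproduce the shape of this measurement outside the daemon, here is a rough Python sketch that times a loop of 100 compiled patterns against one statement text. It is not the code from the `req_regex` branch: the patterns are invented, Python's `re` is used instead of the daemon's regex engine, and the timings it prints are not comparable with the log lines above.

```python
import re
import time


def check_patterns(patterns, statement):
    """Match every pattern against the statement; return (matched, elapsed_ms)."""
    start = time.perf_counter()
    matched = False
    for pattern in patterns:
        if pattern.search(statement):
            matched = True
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return matched, elapsed_ms


# 100 made-up patterns, one per alias name i100..i199.
patterns = [re.compile(rf"\bi{n}\b") for n in range(100, 200)]
statement = (
    "SELECT id, uuid_short() as i101, uuid_short() as i102, "
    "uuid_short() as i103, uuid_short() as i104 from name order by id asc;"
)

matched, elapsed_ms = check_patterns(patterns, statement)
print(f"regex patterns check: {len(patterns)}, took: {elapsed_ms:.3f} ms")
if matched:
    print("regex matched")
```

Running the check several times in a row shows whether the first invocation is slower than the following ones on a given machine.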
For the users auth replication:
-- we could keep all rules in the `system.users` table and replicate it from the donor node to a new node on user request, or when a new node first joins any cluster. However, this means that all nodes will have the same `system.users` and the admin cannot set per-node rules for users.
Another concern is that a new node with no user roles could join the cluster at a node that has user roles set. A client request reaches the new node via the SphinxQL interface, then the new node communicates with the donor via the API interface; since I don't plan to add user auth to the API interface, the new node could bypass the auth at the donor node.
It may be worth thinking about how to prevent this.
As discussed, let’s carefully consider these items:
Privileges and Groups: We decided to avoid regex for performance reasons. Let's define which specific privileges (e.g., insert, replace, delete, update) and privilege groups (e.g., write = insert + replace + delete + update) make sense for Manticore.
We could allow setting groups \ aliases in the config from the SphinxQL statement names:
searchd.users_group_read = select,show_meta,call,desc,show_profile
searchd.users_group_write_tnx = insert,replace,delete,set
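As an illustration, the two config lines above could be expanded into statement sets and checked by plain set membership, without any regex. The sketch below is hypothetical: the helper names and the idea of stripping a `users_group_` prefix to get the group name are assumptions, not an agreed config format.

```python
# The two options from the example above, keyed without the "searchd." section.
CONFIG = {
    "users_group_read": "select,show_meta,call,desc,show_profile",
    "users_group_write_tnx": "insert,replace,delete,set",
}


def parse_groups(config, prefix="users_group_"):
    """Build {group name: set of statement names} from the config options."""
    groups = {}
    for key, value in config.items():
        if key.startswith(prefix):
            names = (s.strip() for s in value.split(","))
            groups[key[len(prefix):]] = frozenset(s for s in names if s)
    return groups


def statement_allowed(statement, user_groups, groups):
    """True if the statement belongs to any group granted to the user."""
    return any(statement in groups.get(name, frozenset()) for name in user_groups)


groups = parse_groups(CONFIG)
print(statement_allowed("insert", ["read"], groups))
print(statement_allowed("insert", ["read", "write_tnx"], groups))
```

Because each check is a set lookup, the cost per request does not grow with the number of statements in a group.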
The `read` and `write_tnx` group names are then used to check the statements allowed for the user.
Authentication for Queries from Buddy: Think how to ensure secure communication with Buddy, preventing anyone from impersonating Buddy to bypass authentication.
On Buddy start, via the Buddy CLI, we could pass either a special Buddy user name with a generated password, so that Buddy sends HTTP requests to the daemon with `Authorization: Basic ...`, or a generated auth token, so that Buddy sends HTTP requests to the daemon with `Authorization: Bearer <token>`.
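The token variant could look roughly like the Python sketch below: the daemon generates a random token before starting Buddy and later compares it with the bearer token on incoming requests. The function names are hypothetical, and how the token actually reaches Buddy on start is deliberately left out, since that interface is not defined here.

```python
import hmac
import secrets


def generate_buddy_token():
    """Create a fresh random token for one Buddy process lifetime."""
    return secrets.token_urlsafe(32)


def is_buddy_request(authorization_header, buddy_token):
    """True only if the header is 'Bearer <token>' with the token given to Buddy."""
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(presented.strip().encode(), buddy_token.encode())


token = generate_buddy_token()
print(is_buddy_request(f"Bearer {token}", token))
print(is_buddy_request("Bearer guessed-token", token))
```

Unlike a check on the `user-agent` header, a client cannot pass this check without knowing the generated token.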
SQL Parsing Issue: Regarding the possible issue with "SELECT * FROM t OPTION blabla=1," evaluate if we can identify it as a SELECT query before routing it to Buddy.
If a SphinxQL query fails to parse, it still has the statement type set most of the time. But the parser cannot parse the index name if the select list of the query has an error and parsing fails there.
Another approach is to pass the user to Buddy; Buddy then routes that user back to the daemon with every fixed request, so that the daemon authorizes all Buddy requests as that user's requests. Maybe there is some standard proxy request scheme that uses its own auth but also forwards the user auth with the original request.
(size_M) login and rejection via the MySQL client with a hardcoded user at the daemon
(size_M) load users from the plain config `searchd.users` and authenticate these users from the config instead of the hardcoded one
(size_S) login via HTTP(S) with basic auth
(size_M) generate a password \ token for the Buddy CLI and authenticate Buddy's own requests. Need someone from the Buddy team to advise on how to check the original user's permissions for requests from Buddy
(size_M) add groups \ aliases as user privileges for the plain config `searchd.groups` and authorize MySQL and HTTP user requests against the user \ privilege pair via raw matching \ comparison (without regex)
(size_L) add RT mode SphinxQL statements for auth management and store \ load this data in \ from `manticore.json`
(size_L) add user \ password and checks to all binary API communications between master and agent
(size_L) add user \ password and checks to all binary API communications between the daemon and various API clients
(size_L) replicate auth table(s) across nodes. Fix user cluster join requests \ code to join the auth cluster first and use auth to validate user requests
Proposal:
The task is to design optimal architecture for adding authentication features to Manticore.
Checklist:
To be completed by the assignee. Check off tasks that have been completed or are not applicable.