Open kallaballa opened 6 years ago
@ipapapa stated on gitter that he (nor anyone else i guess) hasn't looked into it. I will try to find out what it takes to implement kyoto tycoon support. Btw. the main protocol (it also support others) is a TSV-RPC implementation: http://fallabs.com/kyototycoon/spex.html#protocol
Some good information can also be found in #310. I specifically describe the Mongo work which can be used as a driver for this work. Adding a parser first requires adding a protocol parser in proto. Currently, there is Memcached and Redis, with Redis being the one we use in production.
Ok.. I did read through the whole wiki and i tried to wrap my had around code in proto. But there is something i really don't understand and that is the msg struct. Could you eloborate on it?
@kallaballa The msg struct comes as part of the fork from twemproxy. The best way to analyze the struct is to look at it under GDB to see how it's accessed/modified during a query.
Some of the important fields are: struct mhdr mhdr; This is the 'mbuf' header and is initialized here: https://github.com/Netflix/dynomite/blob/dev/src/dyn_mbuf.h#L47 It contains a list of 'mbuf'(s) which are the buffer(s) that contain the request/response strings.
c_tqe, s_tqe and m_tqe are entries in a linked list and an instance of the msg struct can be a part of any one of those (depending on client/server, etc.)
'parser' is the function called on the query to determine how to parse it according to the underlying storage layer used. This is where the parser is set according to the storage layer: https://github.com/Netflix/dynomite/blob/dev/src/dyn_message.c#L414-L440
'rsp_handler' executes some logic when a response is received. For example, under DC_SAFE_QUORUM, it makes sure that the checksums of the data from different replicas match: https://github.com/Netflix/dynomite/blob/dev/src/dyn_client.c#L1025-L1039
Hope this helps.
Note: Over time, we will be adding more code documentation, and be simplifying the structs to make the code easier to understand.
I started on implementing the parsing code: https://github.com/kallaballa/dynomite/commit/8b20830834f65c1a339f8edc2054a8fd70abf536
The code compiles but it is completely untested. Also please note that I'm a C++ dev and don't know my way around C well. Am i going somewhere with this?
You also need to add data_store
number so that one can choose your proto and I assume there are more changes to be made. In the end, it would be nice to expand the testing framework to add some unit/functional testing.
Ok. i've got it to the point where dynomite is able to forward messages between a single node and a single client: https://github.com/Netflix/dynomite/compare/dev...kallaballa:dev But the parser code is error prone and cumbersome. Would it be acceptable if i used https://github.com/h2o/picohttpparser for parsing?
I completed the parser code and it works fine. dynomite is forwarding messages correctly. how do i proceed? e.g.: how do i make dynomite aware of keys.
I am not following your question. If Dynomite is doing what it is supposed to do, what do you mean by Dynomite being aware of the keys?
I thought that following functionality needs dynomite to be aware of what keys are present in which node.
A client can connect to any node on a Dynomite cluster when sending write traffic. If the Dynomite node happens to own the data based on its token, then the data is written to the local datastore server process and asynchronously replicated to other racks in the cluster across all data centers. If the node does not own the data, it acts as a coordinator and sends the write to the node owning the data in the same rack. It also replicates the writes to the corresponding nodes in other racks and DCs.
Let me phrase it more precise. At the moment all the kyoto tycoon code does is parse the message and discard keys and values. I can see that the redis code and the memcached code does more than that. e.g. they manipulate msg->keys and msg->token. What do those fields do. Are there other fields i have to setup?
@kallaballa 'keys' basically holds the array of all keys mentioned in the query. Eg: The Redis query 'smembers myset', would have mean that the 'keys' array has one element called 'myset'.
It's then used in other places for additional verification. Eg: We check the 'keys' array here to make sure that all the keys accessed for an EVAL query are on the same node: https://github.com/Netflix/dynomite/blob/22b5022c1cb55c7d2530689691823d92a25078bc/src/proto/dyn_redis.c#L3139-L3159
The 'token' field is used internally for pointing the the current token being processed. For example, the above query from Redis, looks like the following over the wire when we start parsing the request: (gdb) print r->token $1 = (uint8_t ) 0x5555557e6860 "2\r\n$8\r\nsmembers\r\n$5\r\nmyset\r\n"
As the query is being parsed, the token field points to the next token after the first one is parsed: (gdb) print r->token $4 = (uint8_t *) 0x5555557e6864 "$8\r\nsmembers\r\n$5\r\nmyset\r\n"
(gdb) print r->token $10 = (uint8_t *) 0x5555557e6872 "$5\r\nmyset\r\n"
... and so on. (You can refer to this doc to understand the redis wire protocol if you want to draw parallels to Kyoto Tycoon: https://redis.io/topics/protocol)
thank you so much. I think i got enough to continue on.
is msg->pos always a null terminated string or is there another way to determine its length?
I found the answer, the length can be determined by: b->last - b->pos. e.g: https://github.com/kallaballa/dynomite/blob/aab424f44f8b43d51083ad735834d1f4dd513128/src/proto/dyn_tycoon.c#L283
I've successfully implemented parsing of set, get and remove command. Additionally keys are parsed and added to the msg. Now i have 2 questions:
For (1): Do you have a functional prototype? It would drive you through what is needed. You can deploy a single node Dynomite with a Kyoto Tycoon backend to figure out what is missing. Eventually, we might want to add some more testing scripts for Kyoto Tycoon so changes made at Dynomite are properly supported by the data stores (we usually try not to break other storage engines but having some test coverage is a good methodology). (2) Redis SCAN command (and its variants) is a cursor-based command. You might want to start by looking at that.
I have implemented all commands my application needs (including cursor based commands) and it works in a single node setup if i turn the dynomite verbosity level to 8 or more. Else it breaks. Any idea what might cause this?
OK, i found the bug that broke the parser, though i still have no clue why behavior changed with verbosity level. How would i test a mutli node setup? e.g. How do i test if conflict resolution and sharding does work?
I created a 3 nodes in one rack setup by using following conf files: https://github.com/kallaballa/dynomite/blob/dev/conf/tycoon_rack1_node1.yml https://github.com/kallaballa/dynomite/blob/dev/conf/tycoon_rack1_node2.yml https://github.com/kallaballa/dynomite/blob/dev/conf/tycoon_rack1_node3.yml
Replication doesn't seem to work. When i set a key on node1 that key doesn't exist on node2 and node3. Any ideas? Could you maybe have a look at the code and find what's missing? https://github.com/Netflix/dynomite/compare/dev...kallaballa:dev
Hi @kallaballa . It may be the case that some debug/higher verbosity statements may have some implicit behavior changing effects which are definitely bugs, and if you find them, including a fix for that would be a great idea as well.
The best way to simulate a cluster is to look at the script in test/cluster_generator.py. That is a test script that sets up a simulated multi-node Dynomite backed by Redis. Maybe you could try something similar with KyotoTycoon and verifhy its working.
Well i think we are not there yet. The code I've written is full of guesses since all I did is trying to mimic what other parsers do. I'm pretty sure the code is missing out on a lot that goes on in dynomite. It would be really nice if someone could look at the code and guide me. I will still look into creating a cluster generator.
@kallaballa I can have a look at it, but my review might be slow since I don't have too much context on Kyoto Tycoon yet. I would suggest that if you make improvements, you can continue to add them. Additionally, documenting non-trivial parts of the code would be very helpful.
@kallaballa how far are you in filling a PR?
I'd like to ask if there are any plans to support kyoto tycoon as a backend. And if not, I'd like to find out what it takes to implement it, since i might do it.