dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

improve our pytest framework #199

Closed · romange closed this issue 2 years ago

romange commented 2 years ago

Currently our pytest framework in tests/pytest is very naive: it assumes that a dragonfly process is already running.

  1. We should introduce a fixture that assumes the binary exists under gitroot/build_dbg (if the DFLY_BIN env variable is undefined). This fixture should start a dragonfly process at the beginning of the pytest run and be shared by multiple tests. Each test should get an empty db (after flushall).

  2. We should be able to parametrize this fixture in order to run the dragonfly process with different flags. Tests would define which configuration they need. For example, some tests may want to exercise the memcache protocol and access the dragonfly process with memcached_port=... defined.
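A parametrized fixture along these lines could look roughly like the sketch below. This is an illustration only: the `dragonfly_cmdline` helper, the fallback path, and the indirect-parametrization pattern are assumptions, not the actual implementation.

```python
import os
import subprocess
import time

import pytest


def dragonfly_cmdline(flags=()):
    """Build the argv for a dragonfly process: the binary path comes from
    the DFLY_BIN env variable, falling back to a build-dbg relative path."""
    path = os.environ.get("DFLY_BIN", "../../build-dbg/dragonfly")
    return [path, *list(flags)]


@pytest.fixture(scope="module")
def df_server(request):
    """Start dragonfly once per module. Tests choose the flags via
    indirect parametrization, e.g.:

      @pytest.mark.parametrize(
          "df_server", [["--memcached_port=11211"]], indirect=True)
    """
    flags = getattr(request, "param", [])
    proc = subprocess.Popen(dragonfly_cmdline(flags))
    time.sleep(0.1)
    if proc.poll() is not None:
        pytest.exit("Failed to start DragonflyDB")
    yield proc
    proc.terminate()
    proc.communicate(timeout=15)
```

With `indirect=True`, pytest passes each parameter set to the fixture through `request.param` instead of directly to the test, so the same fixture body can serve many flag configurations.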

romange commented 2 years ago
  1. This is very high level; I am not very familiar with pytest, but based on my limited knowledge of it, this can be done there easily.
  2. Finally, I want README under tests/ that explains how to run the tests under pytest.
shmulik-klein commented 2 years ago

@romange If not assigned, I'll take this one.

  1. Is DFLY_BIN already used? Can it be more explicit, such as DRAGONFLY_HOME or DRAGONFLY_BIN?
  2. In the fallback, did you mean gitroot/build-opt?
romange commented 2 years ago

go for it and yes

shmulik-klein commented 2 years ago

Breaking down this issue into subtasks, here is the current status:

romange commented 2 years ago

here is a test you could do with another configuration: run the process as dragonfly --proactor_threads=1 --max_memory=2097152,

and within the test start adding string entries with blobs, say of 1K. After ~2000 keys you should start getting an "out of memory" error at some point, and this is what the test could check. This way you will have a useful example of running different configurations of dragonfly. @shmulik-klein btw, Dragonfly rejects writes when it goes out of memory, unlike Redis, which just crashes.

Nike682631 commented 2 years ago

@romange Is this issue available to take up?

romange commented 2 years ago

Yes, @Nike682631 - that would be helpful

Nike682631 commented 2 years ago

> Yes, @Nike682631 - that would be helpful

Please assign

Nike682631 commented 2 years ago

@romange I have tried to go through the docs. I understand what dragonflydb is about, but I need some help on how to dive into the codebase. Can you give me some suggestions on how to get started on this?

romange commented 2 years ago

Did you understand what needs to be done in this issue @Nike682631 ?

Nike682631 commented 2 years ago

@romange I believe the two items below are the requirements for completing this issue.

  1. Parse the tests' requirements and pass them as flags to the executable
  2. Add a README with guidelines for running the pytest tests

For 1, I believe the test requirements need to be scanned and the appropriate flags passed to the commands that execute the tests, then the expected results checked. For 2, I believe some guidelines need to be added for executing these tests.
romange commented 2 years ago

@Nike682631 I am sorry, but this task requires someone who is knowledgeable in Python, the pytest framework, and Redis.

Nike682631 commented 2 years ago

@romange Well, I can study the required stuff to work on the issue.

romange commented 2 years ago

@Nike682631 I am sorry, but this task requires someone who is knowledgeable in Python, the pytest framework, and Redis. I suspect you are not this person. Unfortunately, I do not have the capacity to teach these things.

Nike682631 commented 2 years ago

@romange Don't worry, I don't require you to teach me. I'll take a look at them myself. I will try to take a hack at it and report here if I make any progress.

Nike682631 commented 2 years ago

Hi @romange, I've been going through test_dragonfly.py trying to understand what exactly is going on in the code. I believe the snippet below implements the following requirement:

We should introduce the fixture that assumes the binary exists under gitroot/build_dbg (if DFLY_BIN env is undefined). this fixture should start a dragonfly process at the beginning of pytest run and be used by multiple tests. Each test should get an empty db (after flushall).

@pytest.fixture(scope="module")
def df_server():
    """ Starts a single DragonflyDB process, runs only once. """
    dragonfly_path = os.environ.get("DRAGONFLY_HOME", os.path.join(
        SCRIPT_DIR, '../../build-dbg/dragonfly'))
    print("Starting DragonflyDB [{}]".format(dragonfly_path))
    # TODO: parse arguments and pass them over
    p = subprocess.Popen([dragonfly_path])
    time.sleep(0.1)
    return_code = p.poll()
    if return_code is not None:
        pytest.exit("Failed to start DragonflyDB [{}]".format(dragonfly_path))

    yield

    print("Terminating DragonflyDB process [{}]".format(p.pid))
    try:
        p.terminate()
        outs, errs = p.communicate(timeout=15)
    except subprocess.TimeoutExpired:
        print("Unable to terminate DragonflyDB gracefully, it was killed")
        outs, errs = p.communicate()
        print(outs)
        print(errs)

So to implement

We should be able to parametrize this fixture in order to run dragonfly process with different flags. Tests would define which configuration they need. For example, some tests may want to test memcache protocol and access dragonfly process with memcached_port=... defined.

I believe I need to pass some sort of parameter for the flags into this df_server fixture, and based on those flags the tests should be able to run dragonfly accordingly. There are still some parts of the code I'm trying to understand, but feel free to correct me if I'm going in the wrong direction. Will keep brainstorming until then.

romange commented 2 years ago

For simplicity's sake I would introduce an additional fixture that runs the df server with another set of flags (see my example for the test I suggest adding) and use that fixture in the test.
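One way to follow that advice without duplicating the start/stop logic in two fixtures is a small shared context manager, sketched below. The helper and fixture names are illustrative; the flag values are the ones suggested earlier in the thread.

```python
import contextlib
import os
import subprocess
import time

import pytest

DF_PATH = os.environ.get("DFLY_BIN", "../../build-dbg/dragonfly")


@contextlib.contextmanager
def running_server(path, *flags):
    """Start a server process with the given flags and guarantee that it
    is terminated when the caller is done with it."""
    proc = subprocess.Popen([path, *flags])
    time.sleep(0.1)
    if proc.poll() is not None:
        pytest.exit(f"Failed to start DragonflyDB [{path}]")
    try:
        yield proc
    finally:
        proc.terminate()
        with contextlib.suppress(subprocess.TimeoutExpired):
            proc.communicate(timeout=15)


@pytest.fixture(scope="module")
def df_server():
    # Default configuration, no extra flags.
    with running_server(DF_PATH) as proc:
        yield proc


@pytest.fixture(scope="module")
def df_server_mem_capped():
    # Memory-capped configuration for out-of-memory tests.
    with running_server(DF_PATH, "--proactor_threads=1",
                        "--max_memory=2097152") as proc:
        yield proc
```

Each fixture then reduces to a one-line `with` block, and the startup check and teardown live in a single place.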

Nike682631 commented 2 years ago

@romange Is it possible to build dragonfly from source on Arch Linux?

romange commented 2 years ago

I believe it's possible, but I've never used Arch Linux and I do not know how to run it on VirtualBox.

romange commented 2 years ago

But if you use Arch Linux on your host and do not know how to build dragonfly there, you may run Ubuntu 20.04 via VirtualBox (you would need to learn how to do that; I do not know myself) and then build dragonfly using the instructions in our contributing guide.

Nike682631 commented 2 years ago

@romange So I was able to build dragonfly on Arch, but I had to make some changes.

Steps

Nike682631 commented 2 years ago

Will add this to documentation once I'm able to fix this issue.

Nike682631 commented 2 years ago

Hi @romange, so I added the code below to the existing test file.

@pytest.fixture(scope="module")
def df_server_with_flag():
    """ Starts a single DragonflyDB process, runs only once. """
    dragonfly_path = os.environ.get("DRAGONFLY_HOME", os.path.join(
        SCRIPT_DIR, '../../build-opt/dragonfly'))
    print("Starting DragonflyDB [{}]".format(dragonfly_path))
    # Pass the extra flags directly in argv (no leading spaces inside the
    # strings, otherwise they are not parsed as flags).
    p = subprocess.Popen([dragonfly_path, "--proactor_threads=1", "--max_memory=2097152"])
    time.sleep(0.1)
    return_code = p.poll()
    if return_code is not None:
        pytest.exit("Failed to start DragonflyDB [{}]".format(dragonfly_path))

    yield

    print("Terminating DragonflyDB process [{}]".format(p.pid))
    try:
        p.terminate()
        outs, errs = p.communicate(timeout=15)
    except subprocess.TimeoutExpired:
        print("Unable to terminate DragonflyDB gracefully, it was killed")
        outs, errs = p.communicate()
        print(outs)
        print(errs)

@pytest.fixture(scope="module")
def connection_with_flag(df_server_with_flag):
    pool = redis.ConnectionPool(decode_responses=True)
    client_with_flag = redis.Redis(connection_pool=pool)
    return client_with_flag

@pytest.fixture
def client_with_flag(connection_with_flag):
    """ Flushes all the records, runs before each test. """
    connection_with_flag.flushall()
    return connection_with_flag

class BLPopWorkerThreadFlagged:
    def __init__(self):
        self.result = None
        self.thread = None

    def async_blpop(self, client_with_flag: redis.Redis):
        self.result = None

        def blpop_task(self, client_with_flag):
            self.result = client_with_flag.blpop(
                ['list1{t}', 'list2{t}', 'list2{t}', 'list1{t}'], 0.5)

        self.thread = Thread(target=blpop_task, args=(self, client_with_flag))
        self.thread.start()

    def wait(self, timeout):
        self.thread.join(timeout)
        return not self.thread.is_alive()

@pytest.mark.parametrize('index', range(1000))
def test_blpop_multiple_keys_with_flag(client_with_flag: redis.Redis, index):
    wt_blpop = BLPopWorkerThreadFlagged()
    wt_blpop.async_blpop(client_with_flag)

    client_with_flag.lpush('list1{t}', 'a')
    assert wt_blpop.wait(2)
    assert wt_blpop.result[1] == 'a'
    watched = client_with_flag.execute_command('DEBUG WATCHED')
    assert watched == []

    wt_blpop.async_blpop(client_with_flag)
    client_with_flag.lpush('list2{t}', 'b')
    assert wt_blpop.wait(2)
    assert wt_blpop.result[1] == 'b'

It seems to be passing!

I know I might've added too much code. For the time being I was not able to figure out how to introduce a new fixture without changing other functions. I think I should be able to do it by passing some sort of parameters. I'll try to see how to make this better.

Will keep brainstorming!!

Nike682631 commented 2 years ago

Hi @romange , Just wanted to confirm with you. The last two messages that I have posted. Do you think I'm going in the right direction? Happy to know any feedback you might have!!

romange commented 2 years ago

Yes, if you add a new fixture you will need to adapt the functions that depend on it. And yes, it's too much code; some refactoring is required. Also, you should improve the naming: rename df_server_with_flag to df_server_mem_capped or something similar.

The intent is not to run the same test test_blpop_multiple_keys, but to introduce a different test that actually requires df_server_mem_capped to work. For example, a test that adds string entries with blobs, say of 1K. At some point (after approximately 2000 keys) it should start getting "out of memory" errors, and that is what the test could check.

romange commented 2 years ago

@braydnm I am assigning to you

Nike682631 commented 2 years ago

@romange I was working on this but got a bit delayed because of end-of-semester exams. Is this issue already resolved? I can see some commits from @braydnm, but I'm not sure.

romange commented 2 years ago

It seems that @braydnm has cracked it.

Nike682631 commented 2 years ago

I see. Sure I'll work on other issues.