Add benchmark scripts adopted from uvloop

messense commented 7 years ago

Tried running echoserver and echoclient locally.

Start server:

python echoserver.py --tokio

Memory usage before running benchmark:

Start client:

python echoclient.py --addr 127.0.0.1:25000

Memory usage after running benchmark:

Possible memory leak? I tried running echoserver with the asyncio event loop, and it has no such problem.

cc #43
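For readers without the scripts at hand, the round trip that the server and client commands above exercise looks roughly like this. It is a minimal sketch using only the standard asyncio API, not the actual echoserver.py or echoclient.py; `EchoProtocol` and `echo_roundtrip` are hypothetical names, and binding to port 0 (an OS-chosen free port) is an assumption that replaces the fixed 127.0.0.1:25000 address.

```python
import asyncio


class EchoProtocol(asyncio.Protocol):
    """Hypothetical protocol: write back whatever bytes arrive."""

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(data)


async def echo_roundtrip(payload: bytes) -> bytes:
    """Start an echo server, send one payload, return the echoed bytes."""
    loop = asyncio.get_running_loop()
    # Port 0 lets the OS pick a free port (assumption, not the benchmark's address).
    server = await loop.create_server(EchoProtocol, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(payload)
    await writer.drain()
    echoed = await reader.readexactly(len(payload))

    writer.close()
    await writer.wait_closed()
    server.close()
    return echoed
```

Running `asyncio.run(echo_roundtrip(b"hello"))` performs a single round trip; the benchmark repeats this kind of exchange 200000 times per connection.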

fafhrd91 commented 7 years ago

I know about the memory leak. I just haven't had time for it.

fafhrd91 commented 7 years ago

There is also a performance problem compared to the 0.1 implementation. Possibly related to how tasks get scheduled.

fafhrd91 commented 7 years ago

GC is disabled for PyFuture. Might that be the source of the memory leak?
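To illustrate why missing GC support can look like a leak, here is a pure-Python analogue. It is only a sketch and not the pyo3 code: `FakeFuture` is a hypothetical stand-in for a future that stores callbacks, and disabling the collector with `gc.disable()` is used to imitate an object the cycle collector cannot handle. A future whose callback list refers back to the future forms a reference cycle, which reference counting alone never frees.

```python
import gc
import weakref


class FakeFuture:
    """Hypothetical stand-in for a future holding a list of callbacks."""

    def __init__(self):
        self.callbacks = []

    def on_done(self):
        pass


def cycle_survives_without_gc():
    """Return (alive_before_collect, alive_after_collect) for a cyclic future."""
    was_enabled = gc.isenabled()
    gc.disable()  # imitate an object the cycle collector does not process
    try:
        fut = FakeFuture()
        # The bound method references fut, so fut -> callbacks -> method -> fut.
        fut.callbacks.append(fut.on_done)
        probe = weakref.ref(fut)
        del fut
        alive_before = probe() is not None  # refcount never reached zero
        gc.collect()                        # an explicit collection breaks the cycle
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()
```

For an extension type, the equivalent of the explicit collection only works if the type implements traverse and clear, which is what the commented-out `PyGCProtocol` block would provide.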

messense commented 7 years ago

Some naive results on my MacBook Pro (Retina, 15-inch, Mid 2015):

tokio loop:

will connect to: ('127.0.0.1', 25000)
Sending 200000 messages
Sending 200000 messages
Sending 200000 messages
600000 in 24.256758213043213
24735.374559547337 requests/sec

asyncio loop:

will connect to: ('127.0.0.1', 25000)
Sending 200000 messages
Sending 200000 messages
Sending 200000 messages
600000 in 37.17765283584595
16138.727279240502 requests/sec

uvloop loop:

will connect to: ('127.0.0.1', 25000)
Sending 200000 messages
Sending 200000 messages
Sending 200000 messages
600000 in 12.358658790588379
48548.95746914903 requests/sec

fafhrd91 commented 7 years ago

What's the number for uvloop?

I noticed libuv is significantly faster at scheduling the next loop iteration (the event_loop.call_soon method).

messense commented 7 years ago

@fafhrd91 Added uvloop benchmark result in the comment above.

fafhrd91 commented 7 years ago

Could you also test async-tokio 0.1?

messense commented 7 years ago

tokio 0.1.0 loop:

Sending 200000 messages
Sending 200000 messages
Sending 200000 messages
600000 in 21.79446029663086
27529.931543785566 requests/sec

No memory leak issue.

fafhrd91 commented 7 years ago

OK, good. There should be some obvious code path that makes tokio 0.2 slower.

I have a simple script:

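import tokio

STOP = False  # when True, callbacks stop rescheduling themselves
Count = 0     # total number of callbacks executed

def cb(loop):
    # Count this invocation, then reschedule for the next loop iteration.
    global Count, STOP
    Count += 1

    if not STOP:
        loop.call_soon(cb, loop)

def stop():
    # Not called in this script; the loop is stopped via loop.stop below.
    global STOP
    STOP = True

def main(loop):
    # Seed the loop with 10 self-rescheduling callbacks.
    for _ in range(10):
        loop.call_soon(cb, loop)

    # Stop the loop after 5 seconds and print how many callbacks ran.
    loop.call_later(5.0, loop.stop)
    loop.run_forever()
    print(Count)

loop = tokio.new_event_loop()

main(loop)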

It is 15% faster with pyo3 than with rust-cpython, but it is still 3x slower than uvloop.
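The same measurement can be made loop-agnostic so that different event loops are compared with identical code. This is a sketch under assumptions, not the script used for the numbers above: `count_call_soon` is a hypothetical helper, and it relies only on the standard `call_soon`, `call_later`, `stop` and `run_forever` methods, so any asyncio-compatible loop object should be usable.

```python
import asyncio


def count_call_soon(loop, duration=5.0, fanout=10):
    """Count how many call_soon callbacks `loop` runs in `duration` seconds."""
    count = 0

    def cb():
        nonlocal count
        count += 1
        loop.call_soon(cb)  # reschedule for the next loop iteration

    # Seed the loop with `fanout` self-rescheduling callbacks.
    for _ in range(fanout):
        loop.call_soon(cb)

    # Stop the loop once the measurement window has elapsed.
    loop.call_later(duration, loop.stop)
    loop.run_forever()
    return count
```

For example, `count_call_soon(asyncio.new_event_loop())` gives the stdlib baseline, and passing a loop from `tokio.new_event_loop()` or `uvloop.new_event_loop()` would give the figures to compare against it, assuming those packages are installed.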

fafhrd91 commented 7 years ago

I'd like to release pyo3 first before doing any work on async-tokio. You are free to optimize it.

messense commented 7 years ago

Enabling GC for PyFuture and PyTask doesn't change much; it still consumes a lot of memory.

diff --git a/src/pyfuture.rs b/src/pyfuture.rs
index 1739ec8..83e8d91 100644
--- a/src/pyfuture.rs
+++ b/src/pyfuture.rs
@@ -749,12 +749,12 @@ impl PyFuture {
     }
 }

-/*#[py::proto]
+#[py::proto]
 impl PyGCProtocol for PyFuture {
     //
     // Python GC support
     //
-    fn __traverse__(&self, _py: Python, visit: PyVisit) -> Result<(), PyTraverseError> {
+    fn __traverse__(&self, visit: PyVisit) -> Result<(), PyTraverseError> {
         if let Some(ref callbacks) = self.fut.callbacks {
             for callback in callbacks.iter() {
                 visit.call(callback)?;
@@ -763,15 +763,15 @@ impl PyGCProtocol for PyFuture {
         Ok(())
     }

-    fn __clear__(&mut self, py: Python) {
-        let callbacks = mem::replace(&mut self.fut.callbacks, None);
+    fn __clear__(&mut self) {
+        let callbacks = std::mem::replace(&mut self.fut.callbacks, None);
         if let Some(callbacks) = callbacks {
             for cb in callbacks {
-                py.release(cb);
+                self.py().release(cb);
             }
         }
     }
-}*/
+}

 #[py::proto]
 impl PyAsyncProtocol for PyFuture {
diff --git a/src/pytask.rs b/src/pytask.rs
index 17a63aa..a9f52f4 100644
--- a/src/pytask.rs
+++ b/src/pytask.rs
@@ -254,12 +254,12 @@ impl PyTask {
     }
 }

-/*#[py::proto]
+#[py::proto]
 impl PyGCProtocol for PyTask {
     //
     // Python GC support
     //
-    fn __traverse__(&self, _py: Python, visit: PyVisit) -> Result<(), PyTraverseError> {
+    fn __traverse__(&self, visit: PyVisit) -> Result<(), PyTraverseError> {
         if let Some(ref callbacks) = self.fut.callbacks {
             for callback in callbacks.iter() {
                 let _ = visit.call(callback);
@@ -268,15 +268,15 @@ impl PyGCProtocol for PyTask {
         Ok(())
     }

-    fn __clear__(&mut self, py: Python) {
-        let callbacks = mem::replace(&mut self.fut.callbacks, None);
+    fn __clear__(&mut self) {
+        let callbacks = std::mem::replace(&mut self.fut.callbacks, None);
         if let Some(callbacks) = callbacks {
             for cb in callbacks {
-                py.release(cb);
+                self.py().release(cb);
             }
         }
     }
-}*/
+}

 #[py::proto]
 impl PyObjectProtocol for PyTask {

fafhrd91 commented 7 years ago

I found one leak in pyo3, but there is another in tokio.

fafhrd91 commented 7 years ago

btw did you build async-tokio in release mode?

fafhrd91 commented 7 years ago

PyFuture leaks

fafhrd91 commented 7 years ago

memory leak fixed

messense commented 7 years ago

> btw did you build async-tokio in release mode?

Yes.

> memory leak fixed

Great!

messense commented 7 years ago

Updated async-tokio benchmark result at current master (4a6dde427865c8f2c91c8abbc03793e693148364)

will connect to: ('127.0.0.1', 25000)
Sending 200000 messages
Sending 200000 messages
Sending 200000 messages
600000 in 18.8026340007782
31910.422761787915 requests/sec

fafhrd91 commented 7 years ago

Could you benchmark again?

messense commented 7 years ago

f7db658

will connect to: ('127.0.0.1', 25000)
Sending 200000 messages
Sending 200000 messages
Sending 200000 messages
600000 in 18.51114511489868
32412.905645534076 requests/sec

fafhrd91 commented 7 years ago

Could you run the benchmark again?

messense commented 7 years ago

1178b3205c7ea1dffa27e85d9ebb5f58d1398253

will connect to: ('127.0.0.1', 25000)
Sending 200000 messages
Sending 200000 messages
Sending 200000 messages
600000 in 17.38876962661743
34505.02898615465 requests/sec
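To put the runs in this thread side by side, the elapsed times reported above can be converted to throughput and relative speed. This is a small arithmetic sketch; the timings are copied from the comments, while the dictionary labels and the helper names `requests_per_sec` and `speedup` are mine.

```python
TOTAL_MESSAGES = 600_000  # 3 x 200000 messages per run

# Elapsed seconds as reported in the comments above (labels are illustrative).
elapsed = {
    "tokio (first run)": 24.256758213043213,
    "asyncio": 37.17765283584595,
    "uvloop": 12.358658790588379,
    "tokio 0.1.0": 21.79446029663086,
    "master 4a6dde4": 18.8026340007782,
    "f7db658": 18.51114511489868,
    "1178b32": 17.38876962661743,
}


def requests_per_sec(seconds, total=TOTAL_MESSAGES):
    """Throughput for one run."""
    return total / seconds


def speedup(label, baseline):
    """How many times faster `label` is than `baseline` (above 1 means faster)."""
    return elapsed[baseline] / elapsed[label]


for label, seconds in elapsed.items():
    print(f"{label:20s} {requests_per_sec(seconds):9.0f} req/s "
          f"{speedup(label, 'asyncio'):.2f}x asyncio")
```

By these numbers the last run is roughly 1.4x the first tokio run and a bit over 2x asyncio, but still only about 0.7x uvloop.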

PyO3 / tokio

Add benchmark scripts adopted from uvloop #44