ccp-project / eval-scripts

ISC License
5 stars 4 forks source link

Panic when running BBR examples #2

Closed charles-typ closed 4 years ago

charles-typ commented 4 years ago

Hi

I was trying to run the BBR example in README.md but encountered some problems. I launched the vagrant container and followed the steps below:

Load ccp-kernel: cd ccp-kernel && sudo ./ccp_kernel_load ipc=0 Start BBR: cd bbr && sudo ./target/release/bbr --ipc=netlink Start an iperf server: iperf -s -p 5000 Start an iperf client with ccp inside mahimahi and with logging: mm-delay 10 mm-link --cbr 48M 48M --uplink-queue="droptail" --downlink-queue="droptail" --uplink-queue-args="packets=160" --downlink-queue-args="packets=160" iperf -c $MAHIMAHI_BASE -p 5000 -t 30 -i 1 -Z ccp Graph the result: mm-graph mahimahi.log 20

When I start the iperf request, the CCP caught a panic here:

vagrant@ubuntu1710:/ccp$ cd bbr && sudo ./target/release/bbr --ipc=netlink
│Feb 17 05:03:20.981 INFO configured BBR, probe_rtt_interval: Duration { secs: 10, nanos: 0 }, ipc: netlink
│Feb 17 05:03:20.987 INFO starting CCP, ipc: netlink, algorithm: bbr
│Feb 17 05:05:15.032 INFO new_flow
│Feb 17 05:05:15.032 INFO switching to PROBE_BWthread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("Cannot up
│date field: \"Report.minrtt\"")', , min rtt (us): 22957, Rate (5/4): /ccp/bbr/src/lib.rs:132:9
│note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
│6.262, bottle rate (Mbps): 5.0096, Rate (3/4): 3.7572, cwnd: 28751

Is this something expected? Thank you so much for your time, and I really appreciate your help!

akshayknarayan commented 4 years ago

Hi, Could you try using the latest ccp-kernel commit (https://github.com/ccp-project/ccp-kernel/commit/da796529b5e89bde9a58c433a8c87c05b29a7fbc), and running the latest commit of ccp-bbr (https://github.com/ccp-project/bbr/commit/3b2752a4b50d92fc1e46c3559a9f28c0e1878014)?

Thanks, Akshay

charles-typ commented 4 years ago

Hi Akshay,

Thank you so much the the quick response!

I've updated the ccp-kernel and ccp-bbr to the latest commit in the vagrant container, but the ccp-kernel failed to compile. Please kindly check out the error logs below:

vagrant@ubuntu1710:/ccp/ccp-kernel$ make
make -C /lib/modules/4.13.0-46-generic/build M=/ccp/ccp-kernel modules
make[1]: Entering directory '/usr/src/linux-headers-4.13.0-46-generic'
  CC [M]  /ccp/ccp-kernel/tcp_ccp.o
/ccp/ccp-kernel/tcp_ccp.c: In function ‘tcp_ccp_init’:
/ccp/ccp-kernel/tcp_ccp.c:337:17: error: too many arguments to function ‘ccp_connection_start’
     cpl->conn = ccp_connection_start(kernel_datapath, (void *) sk, &dp_info);
                 ^~~~~~~~~~~~~~~~~~~~
In file included from /ccp/ccp-kernel/tcp_ccp.h:6:0,
                 from /ccp/ccp-kernel/tcp_ccp.c:1:
/ccp/ccp-kernel/libccp/ccp.h:188:24: note: declared here
 struct ccp_connection *ccp_connection_start(void *impl, struct ccp_datapath_info *flow_info);
                        ^~~~~~~~~~~~~~~~~~~~
/ccp/ccp-kernel/tcp_ccp.c: In function ‘tcp_ccp_release’:
/ccp/ccp-kernel/tcp_ccp.c:357:29: warning: passing argument 1 of ‘ccp_connection_free’ makes integer from pointer without a cast [-Wint-conversion]
         ccp_connection_free(kernel_datapath, cpl->conn->index);
                             ^~~~~~~~~~~~~~~
In file included from /ccp/ccp-kernel/tcp_ccp.h:6:0,
                 from /ccp/ccp-kernel/tcp_ccp.c:1:
/ccp/ccp-kernel/libccp/ccp.h:193:6: note: expected ‘u16 {aka short unsigned int}’ but argument is of type ‘struct ccp_datapath *’
 void ccp_connection_free(u16 sid);
      ^~~~~~~~~~~~~~~~~~~
/ccp/ccp-kernel/tcp_ccp.c:357:9: error: too many arguments to function ‘ccp_connection_free’
         ccp_connection_free(kernel_datapath, cpl->conn->index);
         ^~~~~~~~~~~~~~~~~~~
In file included from /ccp/ccp-kernel/tcp_ccp.h:6:0,
                 from /ccp/ccp-kernel/tcp_ccp.c:1:
/ccp/ccp-kernel/libccp/ccp.h:193:6: note: declared here
 void ccp_connection_free(u16 sid);
      ^~~~~~~~~~~~~~~~~~~
/ccp/ccp-kernel/tcp_ccp.c: At top level:
/ccp/ccp-kernel/tcp_ccp.c:383:44: warning: ‘enum ccp_log_level’ declared inside parameter list will not be visible outside of this definition or declaration
 void ccp_log(struct ccp_datapath *dp, enum ccp_log_level level, const char* msg, int msg_size) {
                                            ^~~~~~~~~~~~~
/ccp/ccp-kernel/tcp_ccp.c:383:58: error: parameter 2 (‘level’) has incomplete type
 void ccp_log(struct ccp_datapath *dp, enum ccp_log_level level, const char* msg, int msg_size) {
                                                          ^~~~~
/ccp/ccp-kernel/tcp_ccp.c:383:6: error: function declaration isn’t a prototype [-Werror=strict-prototypes]
 void ccp_log(struct ccp_datapath *dp, enum ccp_log_level level, const char* msg, int msg_size) {
      ^~~~~~~
/ccp/ccp-kernel/tcp_ccp.c: In function ‘ccp_log’:
/ccp/ccp-kernel/tcp_ccp.c:385:10: error: ‘ERROR’ undeclared (first use in this function); did you mean ‘ERR_PTR’?
     case ERROR:
          ^~~~~
          ERR_PTR
/ccp/ccp-kernel/tcp_ccp.c:385:10: note: each undeclared identifier is reported only once for each function it appears in
/ccp/ccp-kernel/tcp_ccp.c:386:10: error: ‘WARN’ undeclared (first use in this function)
     case WARN:
          ^~~~
/ccp/ccp-kernel/tcp_ccp.c:387:10: error: ‘INFO’ undeclared (first use in this function); did you mean ‘NR_INFO’?
     case INFO:
          ^~~~
          NR_INFO
/ccp/ccp-kernel/tcp_ccp.c:388:10: error: ‘DEBUG’ undeclared (first use in this function); did you mean ‘BUG’?
     case DEBUG:
          ^~~~~
          BUG
/ccp/ccp-kernel/tcp_ccp.c:389:10: error: ‘TRACE’ undeclared (first use in this function)
     case TRACE:
          ^~~~~
/ccp/ccp-kernel/tcp_ccp.c: In function ‘tcp_ccp_register’:
/ccp/ccp-kernel/tcp_ccp.c:415:20: error: ‘struct ccp_datapath’ has no member named ‘max_connections’
     kernel_datapath->max_connections = MAX_ACTIVE_FLOWS;
                    ^~
/ccp/ccp-kernel/tcp_ccp.c:416:20: error: ‘struct ccp_datapath’ has no member named ‘ccp_active_connections’
     kernel_datapath->ccp_active_connections =
                    ^~
/ccp/ccp-kernel/tcp_ccp.c:418:24: error: ‘struct ccp_datapath’ has no member named ‘ccp_active_connections’
     if(!kernel_datapath->ccp_active_connections) {
                        ^~
/ccp/ccp-kernel/tcp_ccp.c:423:20: error: ‘struct ccp_datapath’ has no member named ‘max_programs’
     kernel_datapath->max_programs = MAX_DATAPATH_PROGRAMS;
                    ^~
/ccp/ccp-kernel/tcp_ccp.c:424:31: error: assignment from incompatible pointer type [-Werror=incompatible-pointer-types]
     kernel_datapath->set_cwnd = &do_set_cwnd;
                               ^
/ccp/ccp-kernel/tcp_ccp.c:425:35: error: assignment from incompatible pointer type [-Werror=incompatible-pointer-types]
     kernel_datapath->set_rate_abs = &do_set_rate_abs;
                                   ^
/ccp/ccp-kernel/tcp_ccp.c:429:20: error: ‘struct ccp_datapath’ has no member named ‘log’
     kernel_datapath->log = &ccp_log;
                    ^~
/ccp/ccp-kernel/tcp_ccp.c:431:20: error: passing argument 1 of ‘ccp_nl_sk’ from incompatible pointer type [-Werror=incompatible-pointer-types]
     ok = ccp_nl_sk(&ccp_read_msg);
                    ^
In file included from /ccp/ccp-kernel/tcp_ccp.c:14:0:
/ccp/ccp-kernel/ccp_nl.h:17:5: note: expected ‘ccp_nl_recv_handler {aka int (*)(struct ccp_datapath *, char *, int)}’ but argument is of type ‘int (*)(char *, int)’
 int ccp_nl_sk(ccp_nl_recv_handler msg);
     ^~~~~~~~~
/ccp/ccp-kernel/tcp_ccp.c:436:31: error: assignment from incompatible pointer type [-Werror=incompatible-pointer-types]
     kernel_datapath->send_msg = &nl_sendmsg;
                               ^
/ccp/ccp-kernel/tcp_ccp.c: In function ‘tcp_ccp_unregister’:
/ccp/ccp-kernel/tcp_ccp.c:467:26: error: ‘struct ccp_datapath’ has no member named ‘ccp_active_connections’
     kfree(kernel_datapath->ccp_active_connections);
                          ^~
cc1: some warnings being treated as errors
scripts/Makefile.build:316: recipe for target '/ccp/ccp-kernel/tcp_ccp.o' failed
make[2]: *** [/ccp/ccp-kernel/tcp_ccp.o] Error 1
Makefile:1550: recipe for target '_module_/ccp/ccp-kernel' failed
make[1]: *** [_module_/ccp/ccp-kernel] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-4.13.0-46-generic'
Makefile:36: recipe for target 'all' failed
make: *** [all] Error 2

From the errors, it seems there are some part of the code incompatible with each other.

Hope you can find some minutes on this, really appreciate your reply!

akshayknarayan commented 4 years ago

Yes, that appears to be the case. ccp-kernel relies on the libccp submodule to communicate with userspace. Could you try git submodule update --init --recursive?

Thanks for your patience :)

charles-typ commented 4 years ago

Thanks for the immediate reply!

The ccp-kernel succeed in compiling but by running the BBR example, the problem still exists.

vagrant@ubuntu1710:/ccp/bbr$ sudo ./target/release/bbr --ipc=netlink                                                                  Feb 18 01:29:32.962 INFO configured BBR, probe_rtt_interval: Duration { secs: 10, nanos: 0 }, ipc: netlink
Feb 18 01:29:32.963 INFO starting CCP, ipc: netlink, algorithm: bbr                                                                   Feb 18 01:29:35.535 INFO new_flow                                                                                                
Feb 18 01:29:35.535 INFO switching to PROBE_BW, min rtt (us): 24536, Rate (5/4): 7.060368, bottle rate (Mbps): 5.648296, Rate (3/4): 4.236216, cwnd: 34646
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("Cannot update field: \"Report.minrtt\"")', /ccp/bbr/src/lib.rs:132:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace  

I also printed out the backtrace, hope that helps:

vagrant@ubuntu1710:/ccp/bbr$ sudo RUST_BACKTRACE=1 ./target/release/bbr --ipc=netlink
Feb 18 01:39:56.612 INFO configured BBR, probe_rtt_interval: Duration { secs: 10, nanos: 0 }, ipc: netlink
Feb 18 01:39:56.614 INFO starting CCP, ipc: netlink, algorithm: bbr
Feb 18 01:40:08.612 INFO new_flow
Feb 18 01:40:08.612 INFO switching to PROBE_BW, min rtt (us): 26282, Rate (5/4): thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("Cannot update field: \"Report.minrtt\"")', 10.913496, /ccp/bbr/src/lib.rs:132:9
stack backtrace:
bottle rate (Mbps): 8.7308, Rate (3/4): 6.548096, cwnd: 57365
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/libunwind.rs:86
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:78
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1052
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1428
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:204
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:224
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:470
  11: rust_begin_unwind
             at src/libstd/panicking.rs:378
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:85
  13: core::option::expect_none_failed
             at src/libcore/option.rs:1211
  14: ccp_bbr::Bbr<T>::install_probe_bw
  15: <ccp_bbr::Bbr<T> as portus::Flow>::on_report
  16: portus::run
  17: bbr::main
  18: std::rt::lang_start::{{closure}}
  19: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  20: std::panicking::try::do_call
             at src/libstd/panicking.rs:303
  21: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:86
  22: std::panicking::try
             at src/libstd/panicking.rs:281
  23: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  24: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  25: main
  26: __libc_start_main
  27: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Also a verbose one:

vagrant@ubuntu1710:/ccp/bbr$ sudo RUST_BACKTRACE=full ./target/release/bbr --ipc=netlink
Feb 18 01:42:26.271 INFO configured BBR, probe_rtt_interval: Duration { secs: 10, nanos: 0 }, ipc: netlink
Feb 18 01:42:26.274 INFO starting CCP, ipc: netlink, algorithm: bbr
Feb 18 01:42:33.028 INFO new_flow
Feb 18 01:42:33.028thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("Cannot update field: \"Report.minrtt\"")',  INFO switching to PROBE_BW/ccp/bbr/src/lib.rs:132:9
stack backtrace:
, min rtt (us): 24098, Rate (5/4): 13.465464, bottle rate (Mbps): 10.772376, Rate (3/4): 8.07928, cwnd: 64898
   0:     0x558bea99c694 - backtrace::backtrace::libunwind::trace::hf701e1fe67353251
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/libunwind.rs:86
   1:     0x558bea99c694 - backtrace::backtrace::trace_unsynchronized::h67d44c0b9d74674f
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/mod.rs:66
   2:     0x558bea99c694 - std::sys_common::backtrace::_print_fmt::h2542830aaba9f388
                               at src/libstd/sys_common/backtrace.rs:78
   3:     0x558bea99c694 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he38bdf65fa8bb4b8
                               at src/libstd/sys_common/backtrace.rs:59
   4:     0x558bea9be69c - core::fmt::write::h90aed719578296ee
                               at src/libcore/fmt/mod.rs:1052
   5:     0x558bea999e47 - std::io::Write::write_fmt::h3f2c4b55e7cb509d
                               at src/libstd/io/mod.rs:1428
   6:     0x558bea99edc5 - std::sys_common::backtrace::_print::h6250ecdc61f909fc
                               at src/libstd/sys_common/backtrace.rs:62
   7:     0x558bea99edc5 - std::sys_common::backtrace::print::h339d80780750ff89
                               at src/libstd/sys_common/backtrace.rs:49
   8:     0x558bea99edc5 - std::panicking::default_hook::{{closure}}::hb6175398809dfc97
                               at src/libstd/panicking.rs:204
   9:     0x558bea99eb06 - std::panicking::default_hook::h1aa1ecfdcb0163c6
                               at src/libstd/panicking.rs:224
  10:     0x558bea99f422 - std::panicking::rust_panic_with_hook::h9e0c581ae75353b2
                               at src/libstd/panicking.rs:470
  11:     0x558bea99f00b - rust_begin_unwind
                               at src/libstd/panicking.rs:378
  12:     0x558bea9bc5c1 - core::panicking::panic_fmt::he06a214bb0a20aa5
                               at src/libcore/panicking.rs:85
  13:     0x558bea9bc3e3 - core::option::expect_none_failed::h5907a4a47a09b743
                               at src/libcore/option.rs:1211
  14:     0x558bea8d8674 - ccp_bbr::Bbr<T>::install_probe_bw::h4a078e191ec5ec71
  15:     0x558bea8d628a - <ccp_bbr::Bbr<T> as portus::Flow>::on_report::hfca416854d8bc70b
  16:     0x558bea8c8ef3 - portus::run::h91bc2ec587ec1b59
  17:     0x558bea8cfb1c - bbr::main::h6d3ee7db621f431c
  18:     0x558bea8d00b3 - std::rt::lang_start::{{closure}}::hf655ff105abee568
  19:     0x558bea99eee3 - std::rt::lang_start_internal::{{closure}}::hea4ba1844515ce5c
                               at src/libstd/rt.rs:52
  20:     0x558bea99eee3 - std::panicking::try::do_call::hcc8e372ded674185
                               at src/libstd/panicking.rs:303
  21:     0x558bea9a2fc7 - __rust_maybe_catch_panic
                               at src/libpanic_unwind/lib.rs:86
  22:     0x558bea99f8c9 - std::panicking::try::hb7ae90427b8077c5
                               at src/libstd/panicking.rs:281
  23:     0x558bea99f8c9 - std::panic::catch_unwind::hdc1b2c66100e9b20
                               at src/libstd/panic.rs:394
  24:     0x558bea99f8c9 - std::rt::lang_start_internal::hb9488e5a79b5a928
                               at src/libstd/rt.rs:51
  25:     0x558bea8d00a2 - main
  26:     0x7f15b53db1c1 - __libc_start_main
  27:     0x558bea8bdfea - _start
  28:                0x0 - <unknown>
akshayknarayan commented 4 years ago

It seems I am unable to reproduce this:

➜ sudo ./target/release/bbr --ipc=netlink
Feb 17 21:59:30.873 INFO configured BBR, probe_rtt_interval: Duration { secs: 10, nanos: 0 }, ipc: netlink
Feb 17 21:59:30.873 INFO starting CCP, ipc: netlink, algorithm: bbr
Feb 17 22:01:46.408 INFO probe_bw, bottle rate (Mbps): 1, rate (Mbps): 146997491871598.5, elapsed: 0.046
Feb 17 22:01:46.409 INFO new min_rtt, bottle rate: 1, min_rtt (us): 22989
Feb 17 22:01:46.409 INFO new_flow
Feb 17 22:01:46.409 INFO switching to PROBE_BW, min rtt (us): 22989, Rate (5/4): 0, bottle rate (Mbps): 146997491871598.5, Rate (3/4): 0, cwnd: 2336627200
Feb 17 22:01:46.432 INFO probe_bw, bottle rate (Mbps): 146997491871598.5, rate (Mbps): 146997491871598.5, elapsed: 0.07
Feb 17 22:01:46.433 INFO new min_rtt, bottle rate: 146997491871598.5, min_rtt (us): 22201
Feb 17 22:01:46.456 INFO probe_bw, bottle rate (Mbps): 146997491871598.5, rate (Mbps): 146997491871598.5, elapsed: 0.094

I ran the latest ccp-kernel commit with the following commands:

➜ mm-delay 10  mm-link mm-traces/bw48.mahi mm-traces/bw48.mahi --uplink-queue="droptail" --downlink-queue="droptail" --uplink-queue-args="packets=160" --downlink-queue-args="packets=160"
➜ iperf -c $MAHIMAHI_BASE -p 5000 -t 30 -i 1 -Z ccp

Could you perhaps try the following:

  1. re-cloning a fresh version of bbr and compiling it on stable rust (i.e., using rustup default stable) (the latest commit adds support for stable rust)
  2. Seeing if you can reproduce this issue on Ubuntu 18.04. tcp_probe won't work for reproducibility, however.
charles-typ commented 4 years ago

Thank you so much! It's working now.

Another quick thing to ask: after I run cargo build, there is only a debug folder in target/, how can I get the release/ folder?

Apologies that I'm totally new to rust

akshayknarayan commented 4 years ago

Great!

target/debug corresponds to building in debug mode, analogous to -g -O0 in gcc. To build in release mode, you can use cargo build --release, which will create binaries in target/release.

You can find more information here: https://doc.rust-lang.org/book/ch01-03-hello-cargo.html#building-for-release