libos-nuse / linux-libos-tools

userspace tools for linux libos
GNU General Public License v2.0

Problem with using NUSE with netmap #15

Open thehajime opened 9 years ago

thehajime commented 9 years ago

From @slfmessi on May 4, 2015 10:37

Hi, I am working with LibOS and cannot get NUSE to run with netmap. Whenever I test NUSE, it dumps core right after printing the vif log, like this:

create vif eth0
  address = 192.168.108.136
  netmask = 255.255.255.0
  macaddr = 00:0c:29:70:93:12
  type    = 1
Segmentation fault (core dumped)

My test configuration is:

interface eth0
    address 192.168.108.136
    netmask 255.255.255.0
    macaddr 00:0c:29:70:93:12
#   if macaddr is not specified, random mac addr is used.
    viftype NETMAP

route
    network 192.168.108.0
    netmask 255.255.255.0
    gateway 192.168.108.2
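
One quick sanity check for a configuration like this is whether the gateway sits in the same subnet as the interface address. The following is a minimal C sketch of that check, offered only as an illustration: `same_subnet` is a hypothetical helper and not part of NUSE, and it relies only on the standard `inet_pton` call.

```c
#include <arpa/inet.h>
#include <assert.h>

/*
 * Hypothetical helper (not part of NUSE): returns 1 if addr and gw fall in
 * the same IPv4 network under mask, 0 if they do not, -1 on a parse error.
 */
int same_subnet(const char *addr, const char *mask, const char *gw)
{
    struct in_addr a, m, g;

    assert(addr && mask && gw);

    /* inet_pton returns 1 only when the string is a valid dotted quad. */
    if (inet_pton(AF_INET, addr, &a) != 1 ||
        inet_pton(AF_INET, mask, &m) != 1 ||
        inet_pton(AF_INET, gw, &g) != 1)
        return -1;

    /* All three values are in network byte order, so masking is consistent. */
    return (a.s_addr & m.s_addr) == (g.s_addr & m.s_addr);
}
```

With the values from the configuration above, `same_subnet("192.168.108.136", "255.255.255.0", "192.168.108.2")` returns 1, so the route entry itself looks consistent.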

I have already loaded the netmap.ko module and its e1000.ko driver:

sunlifei@sunlifei-VPC:~/Documents/net-next-nuse/arch/lib/tools$ lsmod | grep netmap
netmap                114576  1 e1000

I can't tell what is wrong with my environment. Could it be related to the kernel version I am using?

sunlifei@sunlifei-VPC:~/Documents/net-next-nuse/arch/lib/tools$ uname -a
Linux sunlifei-VPC 3.13.0-45-generic #1 SMP Mon Apr 27 22:00:27 CST 2015 x86_64 x86_64 x86_64 GNU/Linux

Would this matter?

Copied from original issue: libos-nuse/net-next-nuse#43

thehajime commented 9 years ago

could you please provide a stack trace by gdb with your dumped core?

% gdb -c core (your application binary)
(gdb) bt

should give a stack trace.

thehajime commented 9 years ago

From @slfmessi on May 4, 2015 11:35

@thehajime Well, I have done something like that. The application I use to test is ping, with the command

sudo NUSECONF=nuse.conf ./nuse ping <my gateway>

The problem is that I can't see the functions in the stack trace, like this:

sunlifei@sunlifei-VPC:~/Documents/net-next-nuse/arch/lib/tools$ sudo gdb -c core.74460 ping 192.168.108.133
Excess command line arguments ignored. (192.168.108.133)
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ping...(no debugging symbols found)...done.
warning: core file may not match specified executable file.
[New LWP 74460]
[New LWP 74463]
[New LWP 74464]
[New LWP 74461]
[New LWP 74462]
[New LWP 74465]
[New LWP 74466]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `ping 192.168.108.133'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f5763c88bee in nuse_vif_create (type=NUSE_VIF_NETMAP, 
    ifname=0x13708a0 "eth0") at nuse-vif.c:69
69      return impl->create(ifname);
(gdb) bt
#0  0x00007f5763c88bee in nuse_vif_create (type=NUSE_VIF_NETMAP, 
    ifname=0x13708a0 "eth0") at nuse-vif.c:69
#1  0x00007f5763c8e35a in nuse_netdev_create (vifcf=0x13708a0) at nuse.c:367
#2  0x00007f5763c8eb6a in nuse_init () at nuse.c:524
#3  0x00007f57646f513a in call_init (l=<optimized out>, argc=argc@entry=2,
    argv=argv@entry=0x7fffb61640d8, env=env@entry=0x7fffb61640f0)
    at dl-init.c:78
#4  0x00007f57646f5223 in call_init (env=<optimized out>,
    argv=<optimized out>, argc=<optimized out>, l=<optimized out>)
    at dl-init.c:36
#5  _dl_init (main_map=0x7f57649091c8, argc=2, argv=0x7fffb61640d8, 
    env=0x7fffb61640f0) at dl-init.c:126
#6  0x00007f57646e630a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#7  0x0000000000000002 in ?? ()
#8  0x00007fffb61657e6 in ?? ()
#9  0x00007fffb61657eb in ?? ()
#10 0x0000000000000000 in ?? ()
(gdb) 
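
The faulting line, `return impl->create(ifname);`, dereferences `impl`, so one plausible reading of this trace (not confirmed anywhere in this thread) is that no implementation was registered for the requested vif type, for example because netmap support was not compiled in. Below is a minimal sketch of a guarded dispatch. All type and function names in it are hypothetical stand-ins, not the actual contents of nuse-vif.c, and the enum values are assumptions except that the log above prints `type = 1` for the netmap vif.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real NUSE types; names are illustrative. */
enum vif_type { VIF_RAWSOCK = 0, VIF_NETMAP = 1, VIF_TYPE_MAX };

struct vif_impl {
    void *(*create)(const char *ifname);
};

static int dummy_handle;

/* Toy backend: returns a non-NULL handle to signal success. */
static void *rawsock_create(const char *ifname)
{
    (void)ifname;
    return &dummy_handle;
}

static struct vif_impl rawsock_impl = { .create = rawsock_create };

/* A backend that was not compiled in leaves its slot NULL. */
static struct vif_impl *vif_impls[VIF_TYPE_MAX] = {
    [VIF_RAWSOCK] = &rawsock_impl,
    /* [VIF_NETMAP] is deliberately left NULL in this sketch. */
};

/* Look up the backend and refuse to call through a missing one. */
void *vif_create_checked(enum vif_type type, const char *ifname)
{
    assert(ifname != NULL);

    if ((int)type < 0 || (int)type >= VIF_TYPE_MAX)
        return NULL;

    struct vif_impl *impl = vif_impls[type];
    if (impl == NULL || impl->create == NULL) {
        fprintf(stderr, "vif type %d is not available in this build\n",
                (int)type);
        return NULL;
    }
    return impl->create(ifname);
}
```

In this sketch, a caller that requests `VIF_NETMAP` gets `NULL` and an error message instead of a segmentation fault, which makes a missing backend easier to diagnose.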

thehajime commented 9 years ago

please try to pull the latest linux-libos-tools: the commit fced07aa92de779ebadc987ef599ccc69568ab29 should fix this issue.

slfmessi commented 9 years ago

Actually, after pulling the latest repo, I can run the ping test program, but no packets get through. The program just waits after this output:

sunlifei@sunlifei-VPC:~/Documents/net-next-nuse/arch/lib/tools$ sudo NUSECONF=nuse.conf ./nuse ping 192.168.108.2
[sudo] password for sunlifei: 
<5>Linux version 4.0.0+ (sunlifei@sunlifei-VPC) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #0 Wed Apr 29 01:14:15 CST 2015
<6>NET: Registered protocol family 16
<6>NET: Registered protocol family 2
<6>TCP established hash table entries: 512 (order: 0, 4096 bytes)
<6>TCP bind hash table entries: 512 (order: 0, 4096 bytes)
<6>TCP: Hash tables configured (established 512 bind 512)
<6>UDP hash table entries: 128 (order: 0, 4096 bytes)
<6>UDP-Lite hash table entries: 128 (order: 0, 4096 bytes)
<6>NET: Registered protocol family 1
<6>Netfilter messages via NETLINK v0.30.
<6>nfnl_acct: registering with nfnetlink.
<6>nf_conntrack version 0.5.0 (32 buckets, 128 max)
<6>nf_tables: (c) 2007-2009 Patrick McHardy 
<6>ip_set: protocol 6
<6>ipip: IPv4 over IPv4 tunneling driver
<6>nsc: GRE over IPv4 demultiplexor driver
<6>nsc: GRE over IPv4 tunneling driver
<6>nsc: (C) 2000-2006 Netfilter Core Team
<6>Initializing XFRM netlink socket
<6>NET: Registered protocol family 10
<6>nsc: Mobile IPv6
<6>nsc: IPv6 over IPv4 tunneling driver
<6>NET: Registered protocol family 17
<6>NET: Registered protocol family 15
<6>DCCP: Activated CCID 2 (TCP-like)
<6>DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
<6>nsc: Hash tables configured (established 512 bind 512)
create vif eth0
  address = 192.168.108.136
  netmask = 255.255.255.0
  macaddr = 00:00:00:00:00:00
  type    = 1
mac address for eth0 is randomized 02:00:1e:53:a5:42
nuse syscall proxy start at unix:///tmp/rump-server-nuse.4922
PING 192.168.108.2 (192.168.108.2) 56(84) bytes of data.

When I press Ctrl-C to abort, it shows:

PING 192.168.108.2 (192.168.108.2) 56(84) bytes of data.
^C
--- 192.168.108.2 ping statistics ---
50 packets transmitted, 0 received, 100% packet loss, time 49187ms
pipe 4
finishing NUSE
rump_server finishing.

Could this be a problem in how NUSE uses netmap?

I also tried to debug it.

root       5139  0.0  0.0  71240  2184 pts/1    S+   09:29   0:00 sudo NUSECONF=nuse.conf ./nuse ping 192.1
root       5140  0.0  0.0   4444   656 pts/1    S+   09:29   0:00 /bin/sh ./nuse ping 192.168.108.2
root       5148 18.5  0.1 688860  4172 pts/1    Sl+  09:29   0:00 ping 192.168.108.2
root       5156  0.0  0.0      0     0 ?        S    09:29   0:00 [kworker/0:0]
sunlifei   5161  0.0  0.0  22644  1324 pts/6    R+   09:29   0:00 ps -aux

I attached gdb to pid 5148 and used bt to show the stack:

sunlifei@sunlifei-VPC:~$ sudo gdb
[sudo] password for sunlifei: 
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) attach 5148
Attaching to process 5148
Reading symbols from /bin/ping...(no debugging symbols found)...done.
warning: Could not load shared library symbols for 3 libraries, e.g. ../../../liblinux.so.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
Reading symbols from /lib/x86_64-linux-gnu/libcap.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libcap.so.2
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libpthread-2.19.so...done.
done.
[New LWP 5158]
[New LWP 5157]
[New LWP 5154]
[New LWP 5153]
[New LWP 5152]
[New LWP 5151]
[New LWP 5150]
[New LWP 5149]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/librt-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.19.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185 ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f54fdc12e5d in ?? ()
#2  0x0000000000b0a7a0 in ?? ()
#3  0x0000000000b0ae60 in ?? ()
#4  0x00007fff8564ee40 in ?? ()
#5  0x0000000000b0ae60 in ?? ()
#6  0x00007fff8564e570 in ?? ()
#7  0x00007f54fdc187c6 in ?? ()
#8  0x0000000000000000 in ?? ()
(gdb) 

It seems like the problem is that there is no pthread_cond_wait.S ...

thehajime commented 9 years ago

I have no netmap machine right now. will investigate it later.

slfmessi commented 9 years ago

Actually, I need to move quickly since I have little time. Do you have any ideas about this problem? It would be nice if you could give me some guidance, and I am glad to do my best on this project. It would be great if you could check it yourself; maybe you could set up the environment in virtual machines...