SmallLonlyWolf / embox

Automatically exported from code.google.com/p/embox

smp possible deadlock when typing (no-kvm) #683

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Check out r12805 and use the x86/zrv-smp-security template.
2. Turn it into a no-graphics template by patching mods.config:
Index: templates/x86/zrv-smp-security/mods.config
===================================================================
--- templates/x86/zrv-smp-security/mods.config  (revision 12805)
+++ templates/x86/zrv-smp-security/mods.config  (working copy)
@@ -42,9 +42,7 @@
    @Runlevel(2) include embox.driver.clock.pit(irq_num=2)
    @Runlevel(2) include embox.driver.terminal
    @Runlevel(2) include embox.driver.net.e1000
-   @Runlevel(2) include embox.driver.console.vc.vc
-   /*@Runlevel(2) include embox.driver.diag(impl="embox__driver__serial__i8250")*/
-   @Runlevel(2) include embox.driver.diag(impl="embox__driver__console__vc__vga")
+   @Runlevel(2) include embox.driver.diag(impl="embox__driver__serial__i8250")
    @Runlevel(2) include embox.driver.serial.i8250(baud_rate=38400)
    @Runlevel(2) include embox.driver.net.loopback
    @Runlevel(2) include embox.driver.virtual.null
@@ -71,8 +69,6 @@
    @Runlevel(2) include embox.fs.driver.initfs
    @Runlevel(2) include embox.fs.driver.nfs
    @Runlevel(2) include embox.fs.driver.tmpfs
-   @Runlevel(2) include embox.fs.driver.ext3
-   @Runlevel(2) include embox.fs.driver.ext4
    /*@Runlevel(2) include embox.fs.driver.cifs*/
    @Runlevel(2) include embox.fs.driver.ramfs
    @Runlevel(2) include embox.fs.driver.ffs
@@ -82,7 +78,7 @@
    include embox.compat.posix.util.utsname(system="zrv",hostname="zrv-host",release="0.1")
    /*@Runlevel(3) include embox.cmd.shell(prompt="ZaryaRV>")*/
    @Runlevel(3) include embox.cmd.sh.tish(prompt="ZaryaRV>", rich_prompt_support=0,builtin_commands="cd export smac_adm")
-   @Runlevel(3) include embox.init.start_script(shell_name="tish",tty_dev="vc",shell_start=1)
+   @Runlevel(3) include embox.init.start_script(shell_name="tish",tty_dev="ttyS0",shell_start=1)

    include embox.cmd.mpstat
    include embox.cmd.proc.kill
3. Type a long string (more than 80 characters) and delete it with backspace; repeat 2-5 times.

What is the expected output? What do you see instead?
$ ./scripts/qemu/auto_qemu -smp 2 -no-kvm
> smac_adm -R service       -o default   -a rwx 
> smac_adm -R unclassified      -o default   -a rwx 
> smac_adm -R _         -o default   -a rwx 
> login 
login: adsasdasdadasdasdasdasdasdasdasdasdasQEMU: Terminated

Please use labels and text to provide additional information.
gcc 4.8.1, qemu 1.7.50

Original issue reported on code.google.com by ki.stfu on 26 Feb 2014 at 12:33

GoogleCodeExporter commented 9 years ago
Or like this:

> smac_adm -R unclassified      -o default   -a rwx 
> smac_adm -R _         -o default   -a rwx 
> login 
login: aksdfjlasjdflkajsdlfjasldfkladjsflkjasdf
password: 
login: kljasdflkdjaslfjaslfkjlasdfjlasjfldasjfldkasjflasdjflkjasdf
password: 
login: aldsfjdkasjfldkjasflkdasjflkdjasflkdajsflkajsdlfkjasdlfkjalsdkfdasf
password: 
login: asjdfkljaslfjdladksQEMU: Terminated

Original comment by ki.stfu on 26 Feb 2014 at 12:33

GoogleCodeExporter commented 9 years ago
Could you check with r12644 reverted?

Original comment by drakon.m...@gmail.com on 27 Feb 2014 at 7:32

GoogleCodeExporter commented 9 years ago
When I run it with auto_qemu, it prints
$ sudo qemu-system-i386 -enable-kvm -kernel ../../build/base/bin/embox -m 128 -net nic,model=e1000,macaddr=AA:BB:CC:DD:EE:02 -net tap,name=tap0,script=start_script,downscript=stop_script -nographic -no-kvm -smp 2
and it really does stop working.

If I delete the first -enable-kvm from the arguments, it works. Please confirm whether it is the same for you.

Original comment by drakon.m...@gmail.com on 27 Feb 2014 at 10:07

GoogleCodeExporter commented 9 years ago
Test without any changes:

issue683 $ ./scripts/qemu/auto_qemu -smp 2 -no-kvm
$ sudo qemu-system-i386 -enable-kvm -kernel ../../build/base/bin/embox  -m 128 -net nic,model=e1000,macaddr=AA:BB:CC:DD:EE:02 -net tap,name=tap0,script=start_script,downscript=stop_script -nographic -smp 2 -no-kvm
ioctl(TUNSETIFF): Device or resource busy
Enable IP Forwarding for wlan0
net.ipv4.ip_forward = 1

Embox kernel start
runlevel: init level is 0
    unit: initializing embox.driver.interrupt.ioapic: done
    unit: initializing embox.driver.clock.pit: done
    unit: initializing embox.kernel.time.jiffies: done
    unit: initializing embox.kernel.time.timer: done
    unit: initializing embox.kernel.time.kernel_time: done
    unit: initializing embox.kernel.task.kernel_task: done
    unit: initializing embox.mem.static_heap: done
    unit: initializing embox.mem.heap_bm: done
    unit: initializing embox.kernel.thread.core: done
    unit: initializing embox.mem.phymem: start=0x05127000, end=0x08100000, size=50171904
done
    unit: initializing embox.fs.buffer_cache: done
    unit: initializing embox.driver.block: done
    unit: initializing embox.driver.ide: done
runlevel: init level is 1
    unit: initializing embox.arch.x86.kernel.smp: done
    unit: initializing embox.kernel.task.multi: done
    unit: initializing embox.driver.pci:    unit: initializing embox.net.dev: done
    unit: initializing embox.net.net_entry: done
    pci: loading embox.driver.net.e1000: done
done
    unit: initializing embox.fs.node: done
    unit: initializing embox.fs.driver.repo: done
    unit: initializing embox.security.smac: done
    unit: initializing embox.fs.rootfs: initfs_mount: unpack initinitfs at 0x001bb8f4 into /
done
    unit: initializing embox.kernel.work: done
    unit: initializing embox.driver.serial.i8250: done
    unit: initializing embox.driver.net.loopback: done
    unit: initializing embox.net.tcp: done
    unit: initializing embox.fs.driver.tmpfs: done
    unit: initializing embox.fs.driver.ramfs: done
    unit: initializing embox.net.neighbour: done
    unit: initializing embox.kernel.time.timekeeper: done
    unit: initializing embox.mem.slab: done
    unit: initializing embox.init.diag_index_desc: done
runlevel: init level is 2
    unit: initializing embox.init.start_script: 
Started shell [tish] on device [ttyS0]
loading start script:
> ifconfig lo 127.0.0.1 netmask 255.0.0.0 up 
> route add 127.0.0.0 netmask 255.0.0.0 lo 
> ifconfig eth0 10.0.2.16 netmask 255.255.255.0 hw ether AA:BB:CC:DD:EE:02 up 
> route add 10.0.2.0 netmask 255.255.255.0 eth0 
> route add default gw 10.0.2.10 eth0 
> mkdir /mandatory_test 
> mount -t ext2 /dev/hda /mandatory_test 
mount: Command returned with code -19: No such device
> smac_adm -R high_label -o low_label  -a r 
> smac_adm -R low_label  -o high_label -a w 
> smac_adm -R _     -o smac_admin -a rw 
> smac_adm -R confidentially -o unclassified  -a r 
> smac_adm -R confidentially -o service  -a r 
> smac_adm -R confidentially -o secret  -a r 
> smac_adm -R secret -o unclassified  -a r 
> smac_adm -R secret -o service  -a r 
> smac_adm -R service -o unclassified  -a r 
> smac_adm -R high_label        -o default   -a rwx 
> smac_adm -R low_label     -o default   -a rwx 
> smac_adm -R confidentially    -o default   -a rwx 
> smac_adm -R secret        -o default   -a rwx 
> smac_adm -R service       -o default   -a rwx 
> smac_adm -R unclassified      -o default   -a rwx 
> smac_adm -R _         -o default   -a rwx 
> login 
login: asdasfadasfsdasasdsdQEMU: Terminated
ioctl(TUNSETIFF): Device or resource busy
stop_script: could not launch network script

Original comment by ki.stfu on 27 Feb 2014 at 11:06

GoogleCodeExporter commented 9 years ago
With the first -enable-kvm deleted from the arguments:

issue683 $ ./scripts/qemu/auto_qemu -smp 2 -no-kvm
$ sudo qemu-system-i386  -kernel ../../build/base/bin/embox  -m 128 -net nic,model=e1000,macaddr=AA:BB:CC:DD:EE:02 -net tap,name=tap0,script=start_script,downscript=stop_script -nographic -smp 2 -no-kvm
ioctl(TUNSETIFF): Device or resource busy
Enable IP Forwarding for wlan0
net.ipv4.ip_forward = 1

Embox kernel start
runlevel: init level is 0
    unit: initializing embox.driver.interrupt.ioapic: done
    unit: initializing embox.driver.clock.pit: done
    unit: initializing embox.kernel.time.jiffies: done
    unit: initializing embox.kernel.time.timer: done
    unit: initializing embox.kernel.time.kernel_time: done
    unit: initializing embox.kernel.task.kernel_task: done
    unit: initializing embox.mem.static_heap: done
    unit: initializing embox.mem.heap_bm: done
    unit: initializing embox.kernel.thread.core: done
    unit: initializing embox.mem.phymem: start=0x05127000, end=0x08100000, size=50171904
done
    unit: initializing embox.fs.buffer_cache: done
    unit: initializing embox.driver.block: done
    unit: initializing embox.driver.ide: done
runlevel: init level is 1
    unit: initializing embox.arch.x86.kernel.smp: done
    unit: initializing embox.kernel.task.multi: done
    unit: initializing embox.driver.pci:    unit: initializing embox.net.dev: done
    unit: initializing embox.net.net_entry: done
    pci: loading embox.driver.net.e1000: done
done
    unit: initializing embox.fs.node: done
    unit: initializing embox.fs.driver.repo: done
    unit: initializing embox.security.smac: done
    unit: initializing embox.fs.rootfs: initfs_mount: unpack initinitfs at 0x001bb8f4 into /
done
    unit: initializing embox.kernel.work: done
    unit: initializing embox.driver.serial.i8250: done
    unit: initializing embox.driver.net.loopback: done
    unit: initializing embox.net.tcp: done
    unit: initializing embox.fs.driver.tmpfs: done
    unit: initializing embox.fs.driver.ramfs: done
    unit: initializing embox.net.neighbour: done
    unit: initializing embox.kernel.time.timekeeper: done
    unit: initializing embox.mem.slab: done
    unit: initializing embox.init.diag_index_desc: done
runlevel: init level is 2
    unit: initializing embox.init.start_script: 
Started shell [tish] on device [ttyS0]
loading start script:
> ifconfig lo 127.0.0.1 netmask 255.0.0.0 up 
> route add 127.0.0.0 netmask 255.0.0.0 lo 
> ifconfig eth0 10.0.2.16 netmask 255.255.255.0 hw ether AA:BB:CC:DD:EE:02 up 
> route add 10.0.2.0 netmask 255.255.255.0 eth0 
> route add default gw 10.0.2.10 eth0 
> mkdir /mandatory_test 
> mount -t ext2 /dev/hda /mandatory_test 
mount: Command returned with code -19: No such device
> smac_adm -R high_label -o low_label  -a r 
> smac_adm -R low_label  -o high_label -a w 
> smac_adm -R _     -o smac_admin -a rw 
> smac_adm -R confidentially -o unclassified  -a r 
> smac_adm -R confidentially -o service  -a r 
> smac_adm -R confidentially -o secret  -a r 
> smac_adm -R secret -o unclassified  -a r 
> smac_adm -R secret -o service  -a r 
> smac_adm -R service -o unclassified  -a r 
> smac_adm -R high_label        -o default   -a rwx 
> smac_adm -R low_label     -o default   -a rwx 
> smac_adm -R confidentially    -o default   -a rwx 
> smac_adm -R secret        -o default   -a rwx 
> smac_adm -R service       -o default   -a rwx 
> smac_adm -R unclassified      -o default   -a rwx 
> smac_adm -R _         -o default   -a rwx 
> login 
login: asdaskdjasasadaaasasaasdaQEMU: Terminated
ioctl(TUNSETIFF): Device or resource busy
stop_script: could not launch network script

Original comment by ki.stfu on 27 Feb 2014 at 11:07

GoogleCodeExporter commented 9 years ago
With changes made in r12644:
> login 
login: qkl;wejqkl;wjel;kqwjelkqwjelqw;elkqwj
password: 
login: jaskl;djlkwjqlekjaklsdnasdjqwneqjkwndjkasndkjasndkanskdjaskjdnaskjdnaskj
password: 
login: dqwkjedasjdlkajwqjenajdnjasndjqwnjkeQEMU: Terminated

> login 
login: aklsdjaskljdqwkjekljqwdkjasldjaskljd
password: 
login: qjkwenqwjkenajkdnQEMU: Terminated <-- deadlock

> login 
login: askjdaklsjdkljwqkledklajdlkjasldkjas;kldjaslkdj
password: 
login: aksdkasdkamsaskdmQEMU: Terminated <-- deadlock

> login 
login: askljdaskjdaksdkalsjdkasd;klasjd;kasj
password: 
login: aksdmaskldmqwlkQEMU: Terminated <-- deadlock

-----------------------------------------------------------------------
Without changes made in r12644:

> login 
login: kladsjfkasdjfklasjdfkasjdfkasdfl;kasdkflasdl;fkajsdlfkj
password: 
login: awklerj;klwejrlkjafksdlkfjal;skjfl;sdjflasdkjfl;askjdfl;asdjf
password: 
login: a;ksdfj;laksjdflkjwereqwjnjknsdfkajsnfkjqnwkjnwkjenkajsndkjasnd
password: 
login: a;skdjfaskdfkjqwjnejnafjknasdlfjnasdfnasdlkfjasndfkljasndlfkjasn
password: 
login: jklansdfkjnaskjdfnkqjwnekjnkjandjknsdaskdmqwklmelkmlkadmlkasmdlka
password: 
login: jqwnekjandkjansjklnwqiueiadyauishdjqwnedjasnjdnaskjdasjkdnqwjkdkaj
password: 
login: kdqjwldjkjaslkdjalskdjlqwkdasjdnkjwnqkjenaksjndajksndkjasndkansdja
password: 
login: qwkkladansjdnqwkjeuasdyausidnjqwdnajsdnasjkndajsdjkasndkjansdkja
password: 
login: aksdjaskljdklqwjdklajsdashdqwhbdhqwbdjbasjdhasjdbajbdjashbdjasbdjasbh
password: 
login: ajshdasjkhdkjqwjdknasdashdbqhwbdhabsjdhbjhdbajshdbjashbdjahbd
password: 
login: ajsdnasjkndjkqwndjknasjkdnasjdnqkwjndjkqwnkdjanskdnaskjd
password: 
login: ajksdnaksjndjkwqjedqwedashduiqwgdasdhgqhwegqwjhgdajsgdasdas
password: 
login: jdhasjdhqjwdajsdhabsdhbqwhdbqwhjbdqwjhbdqjwhdbqjwdbqwjhdbkqwjh
password: QEMU: Terminated <-- All ok, I think it works

Original comment by ki.stfu on 27 Feb 2014 at 11:21

GoogleCodeExporter commented 9 years ago
Alright, the answer is in r12644, so let's look at it.

The real change is that at the end we now call ipl_restore instead of ipl_enable. If this happens only with SMP, then we may be applying the IPL saved on the old CPU to the new one. What else could it be?

Original comment by drakon.m...@gmail.com on 27 Feb 2014 at 12:19

GoogleCodeExporter commented 9 years ago
I can't reproduce this. I suggest that someone else fix it.

Original comment by drakon.m...@gmail.com on 28 Feb 2014 at 2:16

GoogleCodeExporter commented 9 years ago
It seems that r12644 somehow broke __sched_wakeup_smp_inactive.
Place a breakpoint on debug_excpt, which is introduced in the patch.

After debug_excpt fires, the thread of interest sleeps forever with active=1, ready=0, waiting=TW_SMP_WAKING. You can call whereami() from gdb to check it.

The patch introduces a memory barrier implementation. I'm not sure whether the lack of barriers is the cause or something else is.

Original comment by drakon.m...@gmail.com on 28 Feb 2014 at 4:04

Attachments:

GoogleCodeExporter commented 9 years ago
Check this

Original comment by drakon.m...@gmail.com on 28 Feb 2014 at 4:43

Attachments:

GoogleCodeExporter commented 9 years ago
Yeah, it is. It was introduced by r12010.

Original comment by ki.stfu on 1 Mar 2014 at 7:34

GoogleCodeExporter commented 9 years ago
fixed in r12873

Original comment by Vita.Loginova on 3 Mar 2014 at 8:47