Closed GoogleCodeExporter closed 9 years ago
2339 loading start script:
2340 > ifconfig lo 127.0.0.1 netmask 255.0.0.0 up
2341 t->lock 0x00554008 -- &lock in thread_init
2342 wa->lock 0x00554918 -- &lock in waitq_init
2343 wa->lock 0x0055426c -- &lock in waitq_init
2344 trylock 0x00554008 -- thread lock
2345 trylock 0x0024ab3c -- runq lock
2346 trylock 0x0024ab3c -- runq lock
2347 trylock 0x002360bc -- start_script waitq lock
2348 trylock 0x00554918 -- waitq lock (see 2342)
2349 trylock 0x00554008 -- thread lock
2350 trylock 0x0024ab3c -- runq lock
2351 trylock 0x0024ab3c -- runq lock (usually it's the last trylock
operation in this thread and after that the next start_script command starts)
2352 trylock 0xf000fe7e -- ??? <- assert there
2353 trylock 0xf000fe7e -- ??? (duplicate of 2352)
2354 trylock 0x00238f34 -- ???
Original comment by ki.stfu
on 22 Jan 2014 at 10:51
Attachments:
That looks weird. I'll try to investigate it soon hopefully.
Original comment by Eldar.Abusalimov
on 23 Jan 2014 at 3:35
I think this is due to thread_terminate/sched_freeze/task_exit/thread_exit.
Some places seem wrong, for example:
__sched_freeze:
t->active = false;
t->waiting = false;
thread_exit:
current->waiting = true
Original comment by ki.stfu
on 23 Jan 2014 at 3:50
> export PWD=/
> export HOME=/
root@embox:/#
root@embox:/#
root@embox:/#mount -t nfs 10.0.2.10:/home/hak1r/data /tmp
10.0.2.10: RPC: Remote system error - Connection timed out
tish: mount: Command returned with code -1: Unknown error code. PLEASE REPORT
ISSUE 519!
root@embox:/#
root@embox:/#
root@embox:/#mount -t nfs 10.0.2.10:/home/hak1r/data /tmp
root@embox:/#umount /tmp
root@embox:/#mount -t nfs 10.0.2.10:/home/hak1r/data /tmp
root@embox:/#umount /tmp
root@embox:/#mount -t nfs 10.0.2.10:/home/hak1r/data /tmp
root@embox:/#umount /tmp
______
| ____| __ __
| |___ _ __ ___ ____ ____ ____ _____/ / _____ / /
| ___|| '_ ` _ \ / __ \/ __ \/ __ \/ ___/ / |_____| |
| |____| | | | | |_ _ _ / /_/ / /_/ / /_/ (__ )_/ |_____| |
|______|_| |_| |_(_|_|_) \____/\____/ .___/____(_) | |
/_/ \_\
ASSERTION FAILED on CPU 0
at ./src/include/kernel/spinlock.h:92
in function __spin_trylock
lock->owner != cpu_id
(Recursive lock of a spin owned by this CPU)
-- 01000000 * A R thread 1 task 0 --------------------------------------
run 0x0010b924 <boot_stub+0x0> src/kernel/thread/boot_thread.c:20
-- 01000000 R thread 0 task 0 --------------------------------------
1 0x0013f6df <__schedule+0x14b> src/kernel/sched/sched.c:379
run 0x0010b8a8 <idle_run+0x0> src/kernel/thread/idle_thread.c:21
Original comment by ki.stfu
on 31 Jan 2014 at 1:21
Is there a backtrace available? At least where is the spin_lock() called from?
Original comment by Eldar.Abusalimov
on 31 Jan 2014 at 1:26
32 void waitq_add(struct waitq *wq, struct waitq_link *wql) {
33 ipl_t ipl;
34
35 assert(wq && wql);
36
37 ipl = spin_lock_ipl(&wq->lock); <-- there
38 __waitq_add(wq, wql);
39 spin_unlock_ipl(&wq->lock, ipl);
40 }
41
Original comment by ki.stfu
on 1 Feb 2014 at 3:28
Attachments:
17 int task_waitpid(pid_t pid) {
18 struct task *task, *parent;
19 struct waitq_link wql;
20 int ret = 0;
21
22 waitq_link_init(&wql);
23
24 sched_lock();
25 {
26 task = task_table_get(pid);
28 parent = task->parent;
29 if (!task) {
30 ret = -ECHILD;
31 goto out;
32 }
33
34 waitq_wait_prepare(task->waitq, &wql);<--- there
35
36 sched_wait();
37
38 ret = parent->child_err;
39
40 /* no cleanup since task is dead already */
41 }
42 out:
43 sched_unlock();
44
45 return ret;
46 }
47
Original comment by ki.stfu
on 1 Feb 2014 at 3:43
Attachments:
Here occurs something strange. After checking task for NULL it doesn't jump to
out..
17 int task_waitpid(pid_t pid) {
18 struct task *task, *parent;
19 struct waitq_link wql;
20 int ret = 0;
21
22 waitq_link_init(&wql);
23
24 sched_lock();
25 {
26 task = task_table_get(pid);
27 printk("waitpid %p %u\n", task, pid);
28 parent = task->parent;
29 if (!task) {
30 ret = -ECHILD;
31 goto out;
32 }
33
34 printk("waitpid prepare wq %p\n", task->waitq);
35 waitq_wait_prepare(task->waitq, &wql,423);
36
37 sched_wait();
38
39 ret = parent->child_err;
40
41 /* no cleanup since task is dead already */
42 }
43 out:
44 sched_unlock();
45
46 return ret;
47 }
> route add 127.0.0.0 netmask 255.0.0.0 lo
t->lock 0x00550008
wa->lock 0x00550918
wa->lock 0x0055026c
trylock 0x00550008
trylock 0x0024ab3c
trylock 0x0024ab3c
trylock 0x002360bc
trylock 0x00550918
trylock 0x00550008
trylock 0x0024ab3c
trylock 0x0024ab3c
waitpid 0x00000000 1
waitpid prepare wq 0xf000fe6e <--- jump was before it
trylock 0xf000fe7e
place 423
QEMU: Terminated
Original comment by ki.stfu
on 1 Feb 2014 at 3:45
Attachments:
It's gcc optimization. After line 28 gcc assume that task not NULL, so the next
IF statement is not checked and is considered as FALSE.
Original comment by ki.stfu
on 1 Feb 2014 at 4:07
man gcc:
-fdelete-null-pointer-checks
Assume that programs cannot safely dereference null
pointers, and that no code or data element resides there.
This enables simple constant folding optimizations at all
optimization levels. In addition, other optimization
passes in GCC use this flag to control global dataflow
analyses that eliminate useless checks for null pointers;
these assume that if a pointer is checked after it has
already been dereferenced, it cannot be null.
Original comment by ki.stfu
on 1 Feb 2014 at 4:16
[deleted comment]
fixed by r12543
Original comment by ki.stfu
on 1 Feb 2014 at 4:20
Original issue reported on code.google.com by
ki.stfu
on 19 Jan 2014 at 12:48Attachments: