lab11 / raspberrypi-cc2520

Code, hardware, and instructions to use the TI CC2520 with the Raspberry Pi.

Border Router App Freezes #5

Closed bradjc closed 11 years ago

bradjc commented 11 years ago

After adding a condition variable to give sleep() something to wait on, the border router app now works for a while and then locks up and stops running. I'm guessing it's deadlocked somewhere, but I'm not sure what is causing it.

anroOfCode commented 11 years ago

I took a quick look through that change set.

Alarms shouldn't post their events as a Task. That's (one of?) the big distinctions between them and the timer.

Timers are designed to give approximate timing guarantees; alarms are interrupt driven.

Alarms shouldn't wake up the main task loop, and posting alarm events might be causing the behavior you're seeing.

I think you should modify SchedulerBasicP.nc.

Notice how it provides TaskBasic? That's the interface that TinyOS uses when you write something like post someTask(); in your code. If you inject your code to wake the main thread up there, we'll be good.

To be honest, for people who created interfaces and abstractions for so many ridiculous edge cases, they failed to consider this scenario very well. Most microcontrollers enter the wake state again upon receiving an interrupt. Ours doesn't, and needs an explicit signal to wake up.

bradjc commented 11 years ago

Still happening, and it always stops right after print_message() in ReceiveP.

It seems like the receive task isn't running, so the received packet is waiting for the buffer to free up (which of course it never does) and everything is stopped. I'm guessing the timer isn't getting reset, so that isn't waking up the main thread either.

I can't figure out how a task is getting on the queue without the main thread executing it. A quick fix, I think, is to put a timeout on the condition variable wait, but this shouldn't be necessary.
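For reference, a timed wait of the kind suggested here could look roughly like the following with POSIX threads. This is only a sketch under assumptions: sleep_lock, sleep_cond, wake_requested, sleep_with_timeout() and request_wake() are hypothetical names, not taken from the repository.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-ins for the sleep condition variable and its mutex. */
static pthread_mutex_t sleep_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sleep_cond = PTHREAD_COND_INITIALIZER;
static int wake_requested = 0;

/* Ask the sleeping thread to wake up. */
void request_wake(void)
{
    pthread_mutex_lock(&sleep_lock);
    wake_requested = 1;
    pthread_cond_signal(&sleep_cond);
    pthread_mutex_unlock(&sleep_lock);
}

/* Block until request_wake() is called or timeout_ms elapses.
 * Returns 0 if woken by a request, ETIMEDOUT if the timeout expired. */
int sleep_with_timeout(long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&sleep_lock);
    /* Loop to tolerate spurious wakeups; stop on a request or a timeout. */
    while (!wake_requested && rc == 0) {
        rc = pthread_cond_timedwait(&sleep_cond, &sleep_lock, &deadline);
    }
    if (wake_requested) {
        wake_requested = 0;
        rc = 0;
    }
    pthread_mutex_unlock(&sleep_lock);
    return rc;
}
```

A timeout like this only bounds how long the sleeper stays blocked on the condition variable itself. It cannot help if some other thread is stuck waiting for a lock that the sleeper still holds.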

bradjc commented 11 years ago

It seems to get blocked on post receive_task() in ReceiveP. That makes it feel like the main thread is blocking inside an atomic block and holding the atomic lock, which prevents the receive thread from being able to post to the queue. Adding a timeout to the condition variable wait didn't help, so I don't think the main thread is waiting on a condition variable that is never signalled.

I did a check with the busy loop as the sleep function and things seemed to work, so I think it's the addition of the sleep condition variable that throws it for a loop.

anroOfCode commented 11 years ago

Sorry I've been a little too busy to play around myself.

I think that this might be the problem:

  async command error_t TaskBasic.postTask[uint8_t id]()
  {
    bool ret;
    atomic {
      ret = pushTask(id);
    }
    // printf("task posted %i.\n", id);
    call ThreadWait.signalThread();
    return (ret) ? SUCCESS : EBUSY;
  }

Interacting with

  command void Scheduler.taskLoop()
  {
    for (;;)
    {
      uint8_t nextTask;

      atomic
      {
        while ((nextTask = popTask()) == NO_TASK)
        {
          call McuSleep.sleep();
        }
      }
      signal TaskBasic.runTask[nextTask]();
    }
  }

taskLoop still holds the lock on the atomic's thread_lock variable while it's waiting on that condition variable. postTask attempts to acquire that lock to enter its own atomic section to post the task, and deadlock occurs.

The ThreadWait condition variable should be attached to the thread_lock mutex, not its own mutex. This way the system will unlock the mutex while waiting on it, which is the desired behavior.
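A minimal pthreads sketch of that idea, assuming the atomic sections are implemented with a single mutex. Only the name thread_lock comes from the discussion above; task_ready, pending_tasks, post_task(), wait_for_task() and run_demo() are made up for illustration and are not the repository's code.

```c
#include <assert.h>
#include <pthread.h>

/* thread_lock plays the role of the atomic lock; task_ready stands in
 * for the ThreadWait condition variable (hypothetical names). */
static pthread_mutex_t thread_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  task_ready  = PTHREAD_COND_INITIALIZER;
static int pending_tasks = 0;

/* Poster side, roughly what postTask does: push under the lock, then signal. */
static void post_task(void)
{
    pthread_mutex_lock(&thread_lock);    /* enter "atomic" */
    pending_tasks++;
    pthread_cond_signal(&task_ready);
    pthread_mutex_unlock(&thread_lock);  /* leave "atomic" */
}

/* Main-loop side, roughly what taskLoop does while the queue is empty. */
static int wait_for_task(void)
{
    pthread_mutex_lock(&thread_lock);
    while (pending_tasks == 0) {
        /* Releases thread_lock while blocked and reacquires it before
         * returning, so post_task() can get in and push a task. */
        pthread_cond_wait(&task_ready, &thread_lock);
    }
    pending_tasks--;
    pthread_mutex_unlock(&thread_lock);
    return 1;
}

static void *poster_thread(void *arg)
{
    (void)arg;
    post_task();
    return NULL;
}

/* Returns 1 if the waiter received the task posted by the other thread. */
int run_demo(void)
{
    pthread_t poster;
    int got;

    if (pthread_create(&poster, NULL, poster_thread, NULL) != 0) {
        return -1;
    }
    got = wait_for_task();
    pthread_join(poster, NULL);
    return got;
}
```

Because the wait and the queue check share one mutex, a task posted just before the waiter blocks is still seen by the while loop, and the poster is never locked out while the waiter sleeps.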

TinyOS does something similar with interrupts inside McuSleep I think.

bradjc commented 11 years ago

I thought this was the problem too, so I modified ThreadWait.wait() to "enable interrupts" before blocking on the cond variable and to disable them after, like the msp code.

bradjc commented 11 years ago

With the current method (using select) this can't happen.
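The thread does not show how select is used, so the following is only a guess at the general pattern: a pipe whose read end the main loop selects on, with posters writing a byte to wake it. All names here (wake_pipe, wake_init(), wake_signal(), wake_wait()) are hypothetical.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical wake-up pipe: [0] is the read end, [1] the write end. */
static int wake_pipe[2] = { -1, -1 };

/* Create the pipe. Returns 0 on success, -1 on error. */
int wake_init(void)
{
    return pipe(wake_pipe);
}

/* Called by whoever posts a task: leave one byte in the pipe. */
int wake_signal(void)
{
    char b = 1;
    return write(wake_pipe[1], &b, 1) == 1 ? 0 : -1;
}

/* Sleep until a wake-up byte is available or timeout_ms elapses.
 * Returns 1 if woken, 0 on timeout, -1 on error. */
int wake_wait(int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    char b;
    int n;

    FD_ZERO(&readfds);
    FD_SET(wake_pipe[0], &readfds);
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(wake_pipe[0] + 1, &readfds, NULL, NULL, &tv);
    if (n <= 0) {
        return n;  /* 0 = timeout, -1 = error */
    }
    /* Consume the byte so the next wait blocks again. */
    return read(wake_pipe[0], &b, 1) == 1 ? 1 : -1;
}
```

In this pattern no lock is held while sleeping, and a wake-up sent before the wait begins is not lost because the byte stays in the pipe until it is read.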