Closed minhdb closed 2 years ago
Thanks for PR! I'll take a look at code and your comments/questions in upcoming days.
It seems new ctl tests hang on Linux. Also, there are some warnings and the code is not formatted.
BTW, you can run CI builds locally: https://roc-streaming.org/toolkit/docs/development/continuous_integration.html#run-locally
Though I didn't try it on Mac.
To format code, run scons fmt
(you'll need clang-format installed).
ctl::TaskQueue thread relies on core::Timer wakeuptimer
Stupid me, I forgot this! That's a problem!
core::Timer implicitly uses real clock. We pass absolute timestamps to it (deadline
), and it uses core::Semaphore to wait until that time comes on real system clock.
So to make tests independent on real clock, it's not enough to replace core::timestamp() call, we should replace core::Timer calls too.
I suggest the following:
Thanks for the hint!
My schedule's a bit tight toward the end of the year but I will look into it asap.
:umbrella: The latest upstream change (presumably these) made this pull request unmergeable. Please resolve the merge conflicts.
School was a bit crazy in the previous months. If this issue is still open, I still have some questions and would like to discuss possible solution.
Is core::TaskQueue ever instantiated? Or we only instantiate its derived classes (ControlLoop, TestTaskQueue, etc) ?
in TaskQueue, use ITimer instead of Timer
There's only one Timer field in TaskQueue, which is
core::Timer wakeup_timer_;
I have replaced Timer wakeuptimer with ITimer wakeuptimer in TaskQueue.
core::ITimer wakeup_timer_;
ITimer is an interface here. wakeuptimer is not visible to derived class so I'm planning to make it protected instead of private.
in ControlLoop, pass Timer as ITimer implementation to TaskQueue
So, does this mean that for each derived class with TaskQueue as base class, I will need to pass a reference of core::ITimer type (core::Timer or ctl::TestTimer) to the derived class's constructor?
For example, for ControlLoop we will have a constructor
// Specifically, timer is core::Timer
ControlLoop(ITimer& timer)
Or in TestTaskQueue
// test_timer is our custom implementation of ITimer in test_task_queue.cpp
// Let that be ctl::TestTimer
TestTaskQueue(ITimer& test_timer)
Thank you!
School was a bit crazy in the previous months. If this issue is still open, I still have some questions and would like to discuss possible solution.
@4d696e6820427569 Good to hear from you! Yep, the issue is still open, you're welcome.
Is core::TaskQueue ever instantiated? Or we only instantiate its derived classes (ControlLoop, TestTaskQueue, etc) ?
IIRC, it's never instantiated, it's a base class only.
in TaskQueue, use ITimer instead of Timer
There's only one Timer field in TaskQueue, which is
core::Timer wakeup_timer_;
I have replaced Timer wakeuptimer with ITimer wakeuptimer in TaskQueue.
core::ITimer wakeup_timer_;
Yep, I meant that timer.
ITimer is an interface here. wakeuptimer is not visible to derived class so I'm planning to make it protected instead of private.
in ControlLoop, pass Timer as ITimer implementation to TaskQueue
So, does this mean that for each derived class with TaskQueue as base class, I will need to pass a reference of core::ITimer type (core::Timer or ctl::TestTimer) to the derived class's constructor?
For example, for ControlLoop we will have a constructor
// Specifically, timer is core::Timer
ControlLoop(ITimer& timer)
Or in TestTaskQueue
// test_timer is our custom implementation of ITimer in test_task_queue.cpp
// Let that be ctl::TestTimer
TestTaskQueue(ITimer& test_timer)
Yeah, this should work. Though, probably a bit cleaner solution would be to add pure virtual method to TaskQueue like:
virtual ITimer& timer() = 0;
...and implement it in derived classes. This way, we'll have only one timer field (in derived class) instead of two timer fields (ITimer in base class + Timer in derived class). Not a big deal really, but IMHO looks a bit better.
virtual ITimer& timer() = 0;
This way, we'll have only one timer field (in derived class) instead of two timer fields (ITimer in base class + Timer in derived class)
I think it sounds cleaner as well with only one core::ITimer
field. I'm still a bit confused of how I should do this though.
When we changed wakeup_timer_
from core::Timer
to core::ITimer
we run into a problem.
In TaskQueue
, some functions/methods were implemented using wakeup_timer_
. For instance,
void TaskQueue::run() {
// This is called when a derived class's instantiated as a runnable
// thread. There are functions/methods in TaskQueue similar to this as well.
//....
wakeup_timer_.wait_deadline();
//...
}
void TaskQueue::stop_and_wait() {
//...
wakeup_timer_.try_set_deadline(0);
//...
}
In derived classes, TestTaskQueue
for instance, this won't compile unless we explicitly state the exact type of wakeup_timer_
. Because core::ITimer
is an interface (or abstract class). Let's say in ControlLoop
we call
virtual core::ITimer& timer() = 0;
in wake_up_timer_
place. How does ControlLoop
know the exact type of core::ITimer
here?
I'm thinking of making all functions that use wakeup_timer_
in TaskQueue
virtual and have the derived classes implement them.
I also assume we're not enabling RTTI here.
Let me know your thoughts. I will continue digging around to see what works. Thank you for your time and guidance!
Let me demonstrate what I mean:
class TaskQueue {
//....
protected:
virtual core::ITimer& timer() = 0;
//....
// no wakeup_timer_ field anymore
};
void TaskQueue::run() {
//....
// using timer() instead of wakeup_timer_
// timer() is actually implemented in derived class
// so it depends on derived class what specific type of timer we use here
timer().wait_deadline();
//...
}
void TaskQueue::stop_and_wait() {
//...
// same here
timer().try_set_deadline(0);
//...
}
class ControlLoop : public TaskQueue {
//....
protected:
// implement method defined in base class
virtual core::ITimer& timer();
//....
private:
// wakeup_timer_ field is now here
core::Timer wakeup_timer_;
};
core::ITimer& ControlLoop::timer() {
return wakeup_timer_;
}
.....
// in tests:
class TestTaskQueue : public TaskQueue {
//....
protected:
// implement method defined in base class
virtual core::ITimer& timer();
//....
private:
// here we use mock timer
// MockTimer also implements core::ITimer
MockTimer mock_timer_;
};
core::ITimer& TestTaskQueue::timer() {
return mock_timer_;
}
This pattern is called template method (no connection with C++ templates though).
Some CI tests from roc_pipeline
are failing (not at the same test) due to calling pure virtual function. From the debug trace it seems to happen in TaskQueue::run()
. Maybe it's calling pure virtual function in a class's destructor/constructor somewhere leading to undefined behavior.
I'm playing around with this solution.
class TaskQueue {
//....
protected:
// No longer pure virtual.
virtual core::ITimer& timer();
core::Timer wakeup_timer_;
};
void TaskQueue::run() {
//....
timer().wait_deadline();
//...
}
void TaskQueue::stop_and_wait() {
//...
timer().try_set_deadline(0);
//...
}
core::ITimer TaskQueue::timer() {
return wakeup_timer_;
}
class ControlLoop : public TaskQueue {
// No wakeup_timer_ and timer() since it's inherited directly
// from TaskQueue.
};
.....
// in tests:
class TestTaskQueue : public TaskQueue {
//...
public:
// Making this public since I want to call it
// in CppUTest.
virtual core::ITimer& timer();
//....
private:
// 2 timers here which is confusing but at least it's
// not failing due to calling pure virtual function.
MockTimer mock_timer_;
};
core::ITimer& TestTaskQueue::timer() {
return mock_timer_;
}
It's getting timeout for some reasons. I suspect I did not update the mock timer correctly. On MacOS machine it's also failing
TEST roc_pipeline
src/modules/roc_core/target_darwin/roc_core/semaphore.cpp:75: error: roc_panic()
ERROR: roc_core: semaphore: semaphore_wait(): (os/kern) invalid name
Backtrace:
#1: 0x10f32e32d roc::core::print_backtrace()+0xbd
#2: 0x10f32f0f1 roc::core::crash(char const*)+0x71
#3: 0x10f33589c roc::core::panic(char const*, char const*, int, char const*, ...)+0x40c
#4: 0x10f32ccab roc::core::Semaphore::wait()+0xfb
#5: 0x10f32bb7e roc::core::Timer::wait_deadline()+0x8be
#6: 0x10f212fd7 roc::ctl::TaskQueue::run()+0x377
#7: 0x10f33273b roc::core::Thread::thread_runner_(void*)+0x15b
#8: 0x7fff7111d109 _pthread_start+0x94
scons: *** [test/roc_pipeline] Error -6
......................
Error: Process completed with exit code 2.
Do you have any advice on debugging / setting break points? I usually just invoke this command and put print statement. I'm using vim as my main code editor as well.
scons -Q test --enable-werror --enable-debug --sanitizers=all --enable-tests --build3rdparty=openfec
Also, can we close this PR and let me create a new PR instead? This old PR is quite behind from remotes/origin/develop. Thank you so much!
Hi!
Do you have any advice on debugging / setting break points? I usually just invoke this command and put print statement.
See here on how to invoke test binaries manually: https://github.com/roc-streaming/roc-toolkit/blob/develop/docs/sphinx/building/developer_cookbook.rst#running-tests-manually
You can invoke it under debugger (gdb, cgdb, gdb integration for vim, etc).
Also, can we close this PR and let me create a new PR instead? This old PR is quite behind from remotes/origin/develop.
Sure, you can close it, or you can just rebase your branch on fresh develop and force-push it. Actually there is no difference - github will detect force-push and update everything correctly.
ERROR: roc_core: semaphore: semaphore_wait(): (os/kern) invalid name
I suggest to google the part of this error (os/kern) invalid name
and try to investigate when semaphore_wait()
can report it. Probably we're invoking it after semaphore destruction or before its initialization.
Some CI tests from roc_pipeline are failing (not at the same test) due to calling pure virtual function. From the debug trace it seems to happen in TaskQueue::run(). Maybe it's calling pure virtual function in a class's destructor/constructor somewhere leading to undefined behavior.
Yep, you should ensure that timer() (and other virtual methods) are not called from ctor/dtor. You can achieve this by ensuring that we always call stop_and_wait() before destroying the derived class which implements timer() (e.g. the derived class dtor is calling stop_and_wait() before proceeding to the base class dtor) and also that we never access timer() after stop_and_wait() call.
Closing this for now, feel free to reopen if you'll have a chance to continue work.
Initial PR for #435 I mainly want to check for the CI build status.
I built and ran the tests on a MacOS machine and a arch-based linux virtual machine.
Notes: ctl::TaskQueue thread relies on core::Timer wakeuptimer core::Timer has a core::Semaphore field. On MacOS machine, core::Semaphore depends on core::Timer::timestamp().
Edit: Failed CI. Investigating.