ApolloAuto / apollo

An open autonomous driving platform
Apache License 2.0
25.05k stars 9.68k forks source link

bounded_queue Enqueue bug #13156

Open Tao-Fu opened 3 years ago

Tao-Fu commented 3 years ago

Describe the bug cyber/base/bounded_queue.h has an bug in its Enqueue method.

template <typename T>
bool BoundedQueue<T>::Enqueue(T&& element) {
  uint64_t new_tail = 0;
  uint64_t old_commit = 0;
  uint64_t old_tail = tail_.load(std::memory_order_acquire);
  do {
    new_tail = old_tail + 1;
    if (GetIndex(new_tail) == GetIndex(head_.load(std::memory_order_acquire))) {
      return false;
    }
  } while (!tail_.compare_exchange_weak(old_tail, new_tail,
                                        std::memory_order_acq_rel,
                                        std::memory_order_relaxed));
  pool_[GetIndex(old_tail)] = element;
  do {
    old_commit = old_tail;
  } while (cyber_unlikely(!commit_.compare_exchange_weak(
      old_commit, new_tail, std::memory_order_acq_rel,
      std::memory_order_relaxed)));
  wait_strategy_->NotifyOne();
  return true;
}

Line pool_[GetIndex(old_tail)] = element; should be pool_[GetIndex(old_tail)] = std::move(element);

To Reproduce Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior A clear and concise description of what you expected to happen.

Screenshots If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

Smartphone (please complete the following information):

Additional context Add any other context about the problem here.

daohu527 commented 3 years ago

Yes, you‘re right !

element in Tbool BoundedQueue<T>::Enqueue(T&& element) is a lvalue, then we should use std::move to make it a rvalue.

Tao-Fu commented 3 years ago

@daohu527 glad that you made the effort to fix this. Actually, the similar issue is also found here

template <typename T>
bool BoundedQueue<T>::WaitEnqueue(T&& element) {
  while (!break_all_wait_) {
    if (Enqueue(element)) {
      return true;
    }
    if (wait_strategy_->EmptyWait()) {
      continue;
    }
    // wait timeout
    break;
  }

  return false;
}

The enqueue will be called multiple times when using certain wait_strategy, the copy constructor of element will be called multiple times in that function.

daohu527 commented 3 years ago

@Tao-Fu Thanks again, it has been fixed !

Tao-Fu commented 3 years ago

@daohu527 you introduced another bug in the fix in the WaitEnqueue function. By using if (Enqueue(std::move(element))) with yield strategy, the second call to this statement will have a meanless element

daohu527 commented 3 years ago

std::move(element) will destroy until do move constructor/assignment operator, link so, can you describe your testcase?

  do {
    new_tail = old_tail + 1;
    if (GetIndex(new_tail) == GetIndex(head_.load(std::memory_order_acquire))) {
      return false;                        ----  \\ if false there is no assignment so element is still exist
    }
  } while (!tail_.compare_exchange_weak(old_tail, new_tail,
                                        std::memory_order_acq_rel,
                                        std::memory_order_relaxed));
  pool_[GetIndex(old_tail)] = std::move(element);        ----  \\ do assignment so element is destroyed

in the while

template <typename T>
bool BoundedQueue<T>::WaitEnqueue(T&& element) {
  while (!break_all_wait_) {
    if (Enqueue(std::move(element))) {  // if false then continue and element is alive, if true then detoryed
      return true;
    }
    if (wait_strategy_->EmptyWait()) {
      continue;
    }
    // wait timeout
    break;
  }
daohu527 commented 3 years ago

I write a disscuss on stackoverflow, and it better not use std::move in a loop.

So there is not much meaning to keep this interface ?

bool BoundedQueue<T>::WaitEnqueue(T&& element) {

Tao-Fu commented 3 years ago

Sorry for the late response. That's what I mean. Putting std::move in loop is a potential bug especially for non-trivial objects like std::vector. The real way to solve this is to copy paste the Enqueue code inside the WaitEnqueue function. 最好先把WaitEnqueue里面的std::move改回去,之前只是效率问题,现在是真的一个bug。

Tao-Fu commented 3 years ago

@daohu527 I also feel that the interface design of this class is problematic. It tries to put too much things into one class. Check this issue, it has a deadlock if not used properly.

daohu527 commented 3 years ago

seems WaitEnqueue is not used for now.

If we can't pass a rvalue then this interface is useless, seems it's better to delete it.

template <typename T>
bool BoundedQueue<T>::WaitEnqueue(T&& element) {
  while (!break_all_wait_) {
    if (Enqueue(std::move(element))) {  // if false then continue and element is alive, if true then detoryed
      return true;
    }
    if (wait_strategy_->EmptyWait()) {
      continue;
    }
    // wait timeout
    break;
  }

we should just use bool BoundedQueue<T>::WaitEnqueue(const T& element).

Tao-Fu commented 3 years ago

The solution looks good to me.

daohu527 commented 3 years ago

Then we should look for the other problem and fix together.

storypku commented 3 years ago

Thanks all for the discussion. I will look into this issue and @daohu527's PR in the next few days.

storypku commented 3 years ago

seems WaitEnqueue is not used for now.

If we can't pass a rvalue then this interface is useless, seems it's better to delete it.

template <typename T>
bool BoundedQueue<T>::WaitEnqueue(T&& element) {
  while (!break_all_wait_) {
    if (Enqueue(std::move(element))) {  // if false then continue and element is alive, if true then detoryed
      return true;
    }
    if (wait_strategy_->EmptyWait()) {
      continue;
    }
    // wait timeout
    break;
  }

we should just use bool BoundedQueue<T>::WaitEnqueue(const T& element).

It seems use std::forward<T>(element) (perfect fowarding) is a better choice than std::move.

daohu527 commented 3 years ago

@storypku thanks for you reply!It is better not to use rvalue reference in the loop, because Enqueue may delete the rvalue, if not it will be a unspecified state.

There is a disscuss on stackoverflow

storypku commented 3 years ago

@storypku thanks for you reply!It is better not to use rvalue reference in the loop, because Enqueue may delete the rvalue, if not it will be a unspecified state.

There is a disscuss on stackoverflow

I have read the discussion and realized what you mean. What I mean is that, inside a function with prototype func(T&& t), using std::forward<T>(t) seems a better choice than std::move(t).

daohu527 commented 3 years ago

Yes, but this function is little special

universal reference

std::forward<T>(t) used for a universal reference. Then we can combine bool BoundedQueue<T>::WaitEnqueue(T&& element) and bool BoundedQueue<T>::WaitEnqueue(const T& element) to just one.

bool BoundedQueue<T>::WaitEnqueue(T&& element) {
  Enqueue(std::forward<T>(element));
}

special case

there is a special case in the template function. link

Sometimes you can see T&& in a function template declaration where T is a template parameter, yet there’s still no type deduction. Consider this push_back function in std::vector:[3]

template <class T, class Allocator = allocator<T> >
class vector {
public:
    ...
    void push_back(T&& x);       // fully specified parameter type ⇒ no type deduction;
    ...                          // && ≡ rvalue reference
};

Here, T is a template parameter, and push_back takes a T&&, yet the parameter is not a universal reference!

So if we already know it's an rvalue, it's suitable to use std::move