boostorg / thread

Boost.org thread module
http://boost.org/libs/thread
199 stars 161 forks source link

Add CircleCI build #213

Closed thughes closed 6 years ago

thughes commented 6 years ago

As discussed in #212.

Some notes:

Still seeing timing issues with some of the lock measurement timing, especially on macOS VMs. I think something like this would fix it if it's measuring the same thing you were originally intending to measure (time to acquire the lock): https://github.com/thughes/thread/commit/d9536b82e0328cf1ce09b68a86d8404ea4c6c5d5. The issue that I've seen is that the boost::this_thread::sleep_for can oversleep (as allowed) on the VMs, so just subtracting the requested sleep time of 250ms in the calculation will not give an accurate result.

viboes commented 6 years ago

You are right the 250ms in this case wrong as the thread is sleeping for 250ms, so at least we should add 50ms.

However your changes will measure the time for the unlock()+join() not the try_to_lock.

thughes commented 6 years ago

Hmm, I don't think so. I'm just doing the calculation in the main thread after the join to avoid concurrent access to t1.

t0 is taken just before the main thread calls unlock and t1 is taken in the second thread immediately after it acquires the lock; it's not waiting for the join.

viboes commented 6 years ago

The test measured the time take by a successful call to

boost::unique_lock<boost::mutex> lk(m, boost::try_to_lock)

Your measure the time for

   m.unlock();
   t.join();
thughes commented 6 years ago

Yes, technically it includes the m.unlock as well as the time for the context switch to the other thread triggered by join.

You are right the 250ms in this case wrong as the thread is sleeping for 250ms, so at least we should add 50ms.

Where does the 50 ms come from?

viboes commented 6 years ago

I use to give 50ms to the cost of an operation. It is often too much, but this is what I've taken as delta.

thughes commented 6 years ago

How about trying to measure the sleep time and subtract that?

time_point t3 = Clock::now();
boost::this_thread::sleep_for(ms(250));
time_point t4 = Clock::now();

It also adds overhead, but it seems like it could be more precise than than a constant 50 ms.

viboes commented 6 years ago

Yes, t4-t3could replace the 250ms. We will need then to check for t4-t3+50ms. This will measure the time for the unlock() plus boost::try_to_lock

thughes commented 6 years ago

Could we could move t4 after the m.unlock() in the main thread to try to account for it directly instead of using 50ms?

I'm just trying to avoid having semi-arbitrary constants that vary by both OS and processor in there combined with making the test case rely on another semi-arbitrary threshold (max_diff) to determine success or failure; it seems destined to have problems (or just have so much additional arbitrary time buffer as to not be useful in actually testing anything).

viboes commented 6 years ago

I understand your position.

The 50ms is something like a QoI. If an operation takes more than that, there is surely a problem. On some platforms (e.g. VM) I believe those tests should be warnings.

I don't think moving the measure will change anything. The operation you want to measure should take some time. This is what these tests are measuring. How long do you think is acceptable for a call to a try_lock once the mutex is unlock? It is clear that this depends on how busy the machine is.

A PR that allows to configure at compile time whether these kind of timing checks are errors or warnings would be a good idea.

An alternative idea is to configure at compile time the time acceptable for an operation.

With these PR, we could be able to change the delta acceptable or move those checks to warnings.

Would you be interested in such a PR?

viboes commented 6 years ago

Someone in the boostorg group will need to enable the build on CircleCI for it to start building against the boostorg/thread repo: https://circleci.com/gh/boostorg/thread

It seems that we have already do that, as the Boost release are built already on CircleCI. What I could do?

viboes commented 6 years ago

Boost.Geometry is already using CircleCI. https://circleci.com/gh/boostorg/geometry/tree/master

What should I do from my side?

viboes commented 6 years ago

CircleCi doc says to create a .circleci/config.yml file, not a circle.yml file

viboes commented 6 years ago

I'll merge it to see if I reaches to setup the account.

thughes commented 6 years ago

"boostorg/geometry" is a separate project from the perspective of CircleCI (each repo in the "boostorg" is a separate project), so it will have to specifically be enabled on this page with someone who is part of the "boostorg" team/org: https://circleci.com/gh/boostorg/thread.

CircleCi doc says to create a .circleci/config.yml file, not a circle.yml file Correct, though circle.yml is also allowed as you can see in my repo: https://circleci.com/gh/thughes/thread/tree/feature%2Fadd_circleci

viboes commented 6 years ago

Ok. I'll try it.

https://circleci.com/gh/boostorg/workflows/thread

viboes commented 6 years ago

It works. The duration is very good.

Now it is time to fix the timing issues.

Thanks a lot, Vicente

viboes commented 6 years ago

Do you have a PR where you apply only once this change? I will be interested in. I'll be out for 1 week.

viboes commented 6 years ago

BTW, my changes are not enough. I have yet some issues on some tests that I'll look at later on. I've been forced to re-run 1 job to get a green CI :(

https://circleci.com/workflow-run/ab368099-02d9-4030-a82c-6cc0e3988204

thughes commented 6 years ago

Here's an example of the change to subtract the measured sleep time (instead of exactly 250ms) in one of the tests: https://github.com/thughes/thread/commit/8cac50b6b7891b82b95d6efeb2a62a618debf616

thughes commented 6 years ago

@viboes ping ^

viboes commented 6 years ago

I see what you are doing, but I want to measure the time of the tested operation, not the whole test.

thughes commented 6 years ago

You could move t4 to after the m.unlock to get closer.

I guess the problem is that since on CircleCI the sleep time is highly variable (50 - 100 ms over requested time) by always subtracting exactly 250 ms, the calculation is even more wrong since that extra sleep time swamps the actual locking time.

thughes commented 6 years ago

Actually, I'm a bit confused as to how the two calculations are actually different. In the case where boost::this_thread::sleep_for(ms(250)); actually sleeps for exactly 250 ms, the two calculations are exactly the same: t1 - t0 - 250. There is some overhead from calling Clock::now() twice; perhaps that is non-negligible?

Note that I only moved the location of t1 because in the original code is actually measuring the time for lock + unlock since it measures when the unique_lock is no longer in scope. For the purposes of accurately accounting for thread sleep time, you could ignore this part of the change.

viboes commented 6 years ago

Oh, I mixed where t1,t0 are stored and where they are used.

Your algo is good.

Please, could you provide a PR?