exercism / cpp

Exercism exercises in C++.
https://exercism.org/tracks/cpp
MIT License
256 stars 207 forks source link

Tracking Issue for Deprecating Boost #233

Closed arcuru closed 5 years ago

arcuru commented 5 years ago

I believe we should move to make Boost a dependency for only a couple non-core exercises, while leaving the core of the track Boost-free. That would let us avoid having to help configure boost for every single platform and corner case, which just isn't going to happen.

For what we really needed Boost for, the unit test suite, we will switch to a header-only library so we can just distribute it along with the exercises, https://github.com/catchorg/Catch2 seems to be the most popular. Most platforms should then be able to compile problems without even bothering with CMake if necessary, since it'll just be a handful of self-contained files + the stdlib, and many IDEs have support for Catch2 specifically.

We will use version 2.7.0 of Catch, as that is the latest release as of today, located in a test/ subdirectory in each exercise folder.

I initially setup #159 to remove everything at once, but it ended up having to touch too much of the repo (I didn't realize how often we used boost outside of the unit test suite itself) and it's now so old as to be far too difficult to merge.

Also, I'm older and a bit smarter, so I realize we can just do it in pieces so long as we let the students know in the documentation.

What I'm proposing now is that we take the following actions in roughly this order.

Most of the grunt work is in updating the test suites. I will, at a minimum, put together an example moving the Hello World exercise over to Catch to demonstrate what to do for anyone who wants to help, and figure out what to do with an old script I had that helped me do a lot of the conversion work. There isn't a need to do the problems 1 at a time, but be somewhat reasonable as I do want 2 people to actually look at each of the changes.

If you'd like to grab some work for yourself, go ahead and claim it here or just open a PR.

arcuru commented 5 years ago

Alright, I've setup the initial set of changes. If you'd like to help in the short term, please take a look at #239, #240, and #244 and let me know if you have any issues or thoughts.

Once I get a little feedback I can merge all 3 of those and we'll be on our way :)

If you might be interested in helping to update exercises, please reply here and I can write up a short tutorial on how to do it relatively easily, (it's just unclear right now if anybody is willing to help, so I want to be sure it'd actually be used before I spend the time).

arcuru commented 5 years ago

I'll leave these open for feedback for another week. If nothing comes up I'll merge in those initial 3 PRs on April 7th/8th, which will switch over all of the core exercises to Catch.

The only real downside I've encountered is that the Catch builds take longer since it needs to be compiled every time, to the point that it adds ~7 seconds on each compile. Some header only test suites are faster (doctest, for instance) but Catch is by far the most used. I think it's a fair tradeoff to make for an easier setup.

elyashiv commented 5 years ago

I didn't actually tried, and I'm not sure how portable it will be, but there is an option for pre-compiled headers. Did you look into this?

arcuru commented 5 years ago

Precompiled headers aren't really that portable, and I'm not quite sure how setting them up in CMake would work.

There does seem to be a cmake module that will do some of that for us though, which might be worth looking into at some point - https://github.com/sakra/cotire

Of note, Visual Studio compiled it in about half the time as clang/gcc.

arcuru commented 5 years ago

Alright, the kickoff PRs have been ready for ~2 weeks with no objections, so I have gone ahead and merged them. I'll make a tracking issue for onboarding the rest of the exercises.

Since it seems I'll be doing these myself, I will likely do them in batches and leave each batch up for a week or two for comments, then do the merge if I hear no objections. Will probably take a month+ unless I feel like devoting a ton of time.

With the exception of the already opened PRs, I won't be looking at new problems nor will I support merging new problems that use Boost at all.

KevinWMatthews commented 5 years ago

Hi @patricksjackson , I've been keeping abreast of the Catch PR's and things look good from afar. I'm trying things out in detail now :) Unfortunately I'll be traveling for work for most of next week, but I'll still try to keep an eye on things.

arcuru commented 5 years ago

@KevinWMatthews No problem. I'll leave things up for a week or two just in case you or others are around and want to comment, but unless something comes up in that time or I think something is questionable I'll just push things through myself.

KevinWMatthews commented 5 years ago

@patricksjackson Sounds good!

Unexpected side effect: the entire catch.hpp file is displayed on the Exercism website for each solution. This actually slows down the webpage significantly while the browser is downloading and rendering ~15,000 lines of the test suite. There are several seconds in which the webpage as a whole is unresponsive.

Do you experience the same thing? My laptop is on the older side, but it's still concerning.

To repeat you'll need to browse to your Exercism C++ track page, then click "Update exercise to latest version". A refresh/full refresh is symptomatic for me.

This is unfortunate - the pages for the old exercises are pretty snappy. Is there a way that we can prevent Exercism from displaying the Catch header file?

arcuru commented 5 years ago

You should have been paged with a Github notification a few minutes ago for exercism/exercism#4797, I saw that problem as well and am working on it :)

Short answer is I don't know how to fix it, and the C track has the exact same problem.

KevinWMatthews commented 5 years ago

Not necessarily our issue, but another gotcha: if a student has submitted a Boost-based *_test.cpp file as part of their solution, they will not be able to download the new Catch-based test file. Apparently Exercism won't overwrite part of your solution.

I did this on Triangle; triangle_test.cpp was part of my solution. I must've run:

$ exercism submit triangle.h triangle.cpp triangle_test.cpp

This can be resolved be resubmitting the exercise without the test file. In my case,

$ exercism submit triangle.h triangle.cpp

Then exercism download grabs the Catch test file as expected.

Let me know if there is somewhere that I can/should make this visible to mentors.

arcuru commented 5 years ago

Hmm, that's unexpected. Does it update everything other than the test file, like CMake, so you're left in a broken state?

That's an interesting issue, though I'd consider it a corner case. It might be worth bringing up with the exercism cli team, since that would block updating the test suite for anybody hitting that. If we document fixing that it should be in a central place since it would impact everyone.

KevinWMatthews commented 5 years ago

Yes, that's correct - only the test file was untouched and I was not able to compile.

I'll search through the cli repo; perhaps they've come across this already.

KevinWMatthews commented 5 years ago

Couldn't find much about it so I've opened exercism/cli 841.

arcuru commented 5 years ago

Ok, I've looked into the issues arising from the increased compile times. Catch2 knows it has slow compiles, and has information in their docs about how to alleviate it. The main thing to note is that the majority of the extra time comes from compiling with CATCH_CONFIG_MAIN, which is only done once for each test binary, and just including the catch.hpp header and adding tests is not too expensive.

This impacts in 2 different places, the UX and the CI server.

First the user experience. There's no way around the long compile time on the initial compile, however we only recompile the test compilation unit with the Catch main config if the test file changes or if the exercise header changes. Once the user has setup a stable header, then the recompiles are quick. For example, on my system changing hello_world.h or hello_world_test.cpp makes the compile take 7 seconds, but only touching hello_world.cpp takes <0.5s.

I could make this better by adding a third compilation unit into the test/ folder for every exercise, then it would be linked in and never need to be recompiled even when editing the header. That would make header-only exercises a lot less painful and would help with one of the potential CI solutions, which I explain below.

Second, the switch is making us be bad citizens on the CI server. Travis CI provides some resources for free for OSS projects, but Exercism shares resources, so there will start to be an issue if we're unnecessarily expensive. Because there's no way (yet) around compiling the header for every individual test, Catch adds a fairly significant amount of time to the builds. I pulled together a branch on my repo to test the Travis compile times, and saw a total time of over 30 minutes, though that is spread over 4 different builds. That is an increase from the ~5-10 minutes of resources we used a month or so ago. About 21 total minutes is building and running the tests.

There are two ways to use less resources on Travis, test fewer things and/or make a unity build to only compile Catch Main once.

Testing fewer things I'm addressing in #266. Using a unity build instead of compiling each exercise is riskier, since it wouldn't exercise the CMake configuration of each exercise (I'd likely set it up to still at least run the CMake configuration for everything), but we could always set it up so we could manually run the more expensive builds of everything and use the unity build by default. I might do this in the future if this continues to be an issue after we already moved to partial builds.

KevinWMatthews commented 5 years ago

Couldn't find much about it so I've opened exercism/cli 841.

Close this issue - it is expected cli behavior. If a student submits modifications to the test suite, they want to preserve the student's work.

KevinWMatthews commented 5 years ago

Good point about using lots of resources on Travis. If someone complains and we need an immediate fix, we could use one compiler instead of four until we figure out how to reduce compilation time.

The UX is a little worse than one might expect. The docs/TDD recommend to write enough code to get a single test to pass, then uncomment tests one at a time. This workflow causes the long recompile multiple times - once for each test. Not terrible, but more than once.

KevinWMatthews commented 5 years ago

Can we pass CMake options to a specific compiler in travis.yml? If so, we could create a "unity build" option and enable for three out of the four compilers (the slow ones). clang++7 ran the full build in ~5 minutes, which isn't so bad.

Thoughts on a unity build implementation (assuming we decide to use one):

It looks like Catch2 compiles a generic tests-main object file:

$ g++ tests-main.cpp -c

and then compiles the individual test file against this:

$ g++ tests-main.o tests-factorial.cpp -o tests

Thinking in terms of Make (not CMake): I wonder if we could create a tests-main.o target at the root (maintainer) level and also in each exercise. If we could get the root level to compile tests-main.o and put the .o file somewhere that the exercises could find it, each exercise should compile quickly; the object file won't be recreated.

If we also put a target to compile tests-main.o in each exercise, students would still be able to compile and run. The .o file won't already exist but there is a rule to create it.

Theoretically. Can't remember if Make allow targets to be defined twice.

A similar approach may apply to CMake. I know that we can define a custom command that generates an output file, but I'm pretty sure that CMake doesn't allow targets to be defined twice. They also warn about defining the same output file in multiple locations - issues with parallelism.

We might be able to get around this with conditionals. Create a "unity build" CMake option so that the custom tests-main.o command is defined either in the root or in the exercises. Hmmm, this could still be tricky; I need to sit down and try it.

arcuru commented 5 years ago

On user UX - I tried doing the binary-search-tree exercise this weekend, and because that's a header file it needed to compile everything on every change. That was painful enough I very quickly just edited CMake to compile Catch separately. After I finish the merge this weekend I'll be going and adding a separate test-main object. (Adding a test/test-main.cpp file with the main config in it.)

Doing the unity build with 3 of the 4 compilers is a very good idea, I didn't think of that. That would exercise all the compilers against all of our code, which is what we want all the different compilers for, and exercise the individual test setups using the modern clang. I like that idea, it resolves the issue I had with adding a unity build.

My thinking for how to create a Unity build was to just make a test-main.cpp file separate from the exercises for Catch main, and glob together a build with that and all exercises/*/*.cpp/*.h files. That would use a different CMake target than everything else, so it wouldn't exercise the CMake files from the individual tests.

If you could figure out how to compile a root test-main object and insert it into each exercise setup, that would be the best path and I will personally petition to change your title to "CMake Wizard".

We can choose whether to use a Unity build or not by adding an extra environment flag in the travis matrix and checking that value in the script in #266, and just running the tests from the script. So different settings for different compilers isn't a problem.

arcuru commented 5 years ago

Also I feel I should just make it clear, the unity build stuff is by no means urgent. I'll fix the student UX this weekend, and merge in #266 to help with Travis resource usage. Sorting out the unity build or lack thereof doesn't need to happen along with finalizing the Catch migration, it can wait.

KevinWMatthews commented 5 years ago

Makes sense - we'll straighten out the students' experience first, then clean up the maintainer's experience as time permits.

Just wait until we find a real CMake wizard... ;) I'll see if we can't get CMake to do our bidding, but I might try to read through the new documentation first, etc.

arcuru commented 5 years ago

Yea, getting feedback on the docs changes would be more beneficial right now.

Actually I just tried the merged approach, and successfully compiled everything with a single catch main object. Obviously this is hacky, (I don't even specify compile flags for the lib?) but it's a proof of the concept at least.

Timing-wise with 2 threads to mimic Travis - it took 30s vs ~3 minutes.

  1. Add a test/test-main.cpp file and remove CATCH_CONFIG_MAIN from the test files.
  2. Add a _test directory in the root with test-main.cpp and catch.hpp inside.
  3. Add the following snippets to CMake files:

Root CMake

option(EXERCISM_COMMON_CATCH "Link against a common Catch2 main lib." On)
# Configure to run all the tests?
if(${EXERCISM_COMMON_CATCH})
    add_library(catchlib OBJECT _test/test-main.cpp)
endif()

Each Exercise:

# Configure to run all the tests?
if(${EXERCISM_COMMON_CATCH})
else()
    add_library(catchlib OBJECT test/test-main.cpp)
endif()

# Build executable from sources and headers
add_executable(${exercise} ${file}_test.cpp ${exercise_cpp} ${file}.h $<TARGET_OBJECTS:catchlib>)
KevinWMatthews commented 5 years ago

Awesome! Using an object library looks like a very good approach.

Might be cleaner to use if(NOT) instead of if() else():

if(NOT ${EXERCISM_COMMON_CATCH})
    add_library(catchlib OBJECT test/test-main.cpp)
endif()

in each exercise, but that's preference.

We should be able to add flags/options to the catchlib object library without too much issue.

arcuru commented 5 years ago

Alright, this is all done so I'll be closing. I've opened #272 to track the work discussed here about speeding up the CI builds.