justinmeiners / efficient-programming-with-components

Course notes for Alexander Stepanov's teachings on design and usage of C++ STL.
https://www.jmeiners.com/efficient-programming-with-components/
74 stars 6 forks source link

Chapter 10 suggestions #46

Closed aharrison24 closed 2 years ago

aharrison24 commented 2 years ago

I'm about half way through chapter 10. There are some suggestions I'd like to make up to this point, because I hit a brick wall in my understanding. I've not read the second half, so apologies if my questions are answered later on. If they are then that probably suggests that a reordering might be appropriate.

Section reordering

I think it would help the narrative flow to reorder the sections a bit. I'm thinking:

They won't swap totally cleanly -- there would need to be a bit of smoothing, but it should be fairly minimal. As things are, I found things hard to follow because it wasn't clear to me where the description was headed. It starts off talking about balanced tournament trees (the 'correct' approach), then moves on to 'unoptimal divide and conquer' and then to the even-worse unbalanced tree. That seems backwards to me.

Divide and conquer section

I was initially a bit confused by the reasoning about the number of comparisons required for the divide and conquer approach. The algorithm is first introduced as if the full list of players is recursively subdivided (i.e top-down). But the numbered list counting up the comparisons is in a bottom-up order, where we start with pairs of players in the first 'round'. Essentially the recursion has already bottomed-out and we're making our way back up the call stack. I think it would be useful to make that distinction a bit more explicit if possible. Say something like "The list of players is recursively subdivided until there are only two players in each list. It's trivial to find the best and second-best players for this base case... etc...". Obviously it would need to be Alexified, but I hope you understand what I'm getting at.

Element 'power'

Originally there was a sentence saying "We can define the power of each element to be the number of games they have played.". The use of 'element' here was jarring, as the surrounding text is talking about 'players'. In my PR I suggested replacing the word 'element' with 'player'. But I still find this sentence a bit problematic. This notion of 'power' seems to come out of nowhere, and it's not obvious at this point in the text why it's even necessary. I was thinking "Why can't we just talk about the number of games they have played? Why do we need a special term?". Also, is this the number of games they have played so far, or the number of games they played before they were knocked out of the tournament? A bit more detail would be welcome.

Commutativity footnote

The commutativity footnote gives a good example with multiplication, then goes on to describe a proof of something from Dirichlet. The sentence starting "This fact about integers can be proven..." was confusing, as I wasn't sure what fact it was referring to. It wasn't clear to me what that proof is trying to show, or what value it adds to the footnote.

The algorithm that wasn't quite an algorithm

The sentence "The algorithm wasn't quite an algorithm and it was clearly not optimal." is frustratingly imprecise. I'm not really sure what it means. Were the steps described in the paper not complete, not correct, or what? At least can we avoid the overloaded use of the word 'algorithm'?

Singleton bits

The 'Binary counting and reduction' section is where I lose the thread. This part in particular:

We can create a counter and in every bit of this counter we're going to keep a singleton. In each bit we keep the person who had n victories.

Firstly I'm not sure what is meant by 'singleton'. Obviously it has a very specific meaning in programming, but I suspect that's not it. I assume it means 'a single integer value' here?

We have the notion of a 'counter' which to me means 'an integer that counts up'. But somehow the 'bits' of this integer are actually going to contain an integer ID representing a particular player? How is that even possible?

Looking at the ASCII diagrams that follow I see that the 'counter' is actually more like a list of integers. But the only example contents are 0, x and y, so it's a bit hard to generalise and understand how it is used. Presumably x and y are actually integer IDs?

The later paragraphs about laziness and numerical analysis don't really help to clarify, as they are quite abstract.

I'd really like to see a clear explanation at the start of the section about what we're aiming to do, before we dive in to the details. It would be good to clarify what is meant by the terms 'singleton' and 'counter', and it would be helpful to have a more concrete example that uses actual player IDs rather than variable placeholders (x and y).

I realise this probably goes way outside of the way Alex described things in the lectures, but I really struggled with the way things are explained currently.

aharrison24 commented 2 years ago

Sorry, I wrote that late at night. I hope it doesn't sound too grumpy :flushed:

justinmeiners commented 2 years ago

Recall that "1st and 2nd place player in a tennis tournament" is just an concrete example to understand "smallest and second smallest in a list". At some point we need to transition from talking about players, to talking about generic types with a total ordering, the simplest computer example being integers.

The example code in the next section uses integers.

I think it would help the narrative flow to reorder the sections a bit

This is tricky. I agree that the interruption of "Unoptimal divide and conquer approach" (I also just learned unoptimal is not a word) is not helpful. I don't want to put it at the beginning, because I like the approach of first proving how many steps should there should be, then looking for a solution.

I would prefer to put it either at the end, or even in a footnote, as this is a dense section. Do you have any opinion about that?

Divide and conquer section

I agree some explanation of what you mentioned before starting the analysis would be helpful.

Element 'power'

He introduces power so we have a term for when we implement the generic algorithm, which of course has nothing to do with players. I agree with the confusion about element and player, as we don't move to the generic counter idea until the next section.

Commutativity footnote

This proof is one of Alex's favorite, and depicts his style of math, but is admittedly not originally from this lecture. I agree the wording there is poor and needs fixing, but do you think the proof is out of place?

The algorithm that wasn't quite an algorithm

I tend to keep the stories at the beginning of lessons worded "as is". I can add a quote from Knuth: "it is not formulated precisely enough to qualify as an algorithm" in a footnote.

We have the notion of a 'counter' which to me means 'an integer that counts up'. But somehow the 'bits' of this integer are actually going to contain an integer ID representing a particular player? How is that even possible?

Yes, the "bits" are going to be anything you want, provided it has an ordering relation. It's called a counter because it moves elements between slots, in the same pattern that bits carry-propagate when counting. You will see in a later section he stores linked lists in them, but integers is good for now.

The counter itself is indeed magic, and I don't think focusing too much on "what it means" helps you understand what's going on. The big picture is that we are making a "balanced reduce/fold" which takes a binary operation and applies it to a list in the balanced tree order. If you have that picture in mind, that's probably as best a reader can do at this point. The counter is just a machine for doing that.

aharrison24 commented 2 years ago

At some point we need to transition from talking about players, to talking about generic types with a total ordering...

I was wondering if it would be worth having a footnote somewhere acknowledging that the assumption with the tennis tournament model is that 'superiority is transitive eg if A is superior to B and B to C, then A is superior to C' and 'superiority is deterministic, consistent and persistent'. That might make it easier to switch from the tournament analogy to talking about a 'top-n' algorithm that works on ranges of objects with a total ordering. (I got those definitions from https://www.r-bloggers.com/2020/01/lewis-carrolls-proposed-rules-for-tennis-tournaments-by-ellis2013nz/)

I don't want to put [the divide and conquer approach] at the beginning, because I like the approach of first proving how many steps should there should be, then looking for a solution.

OK that makes sense. I think the thing that bothered me was that after I'd seen the justification for the upper bound on the number of steps, that signalled very strongly what the shape of the 'correct' algorithm ought to be.

Coming in to the chapter, my own initial intuition was that a divide-and-conquer approach might work. After I'd read the crucial insight that only the winner is able to beat the second place player it became clear that you'd be better off running the normal tournament to find the winner, and then running a smaller tournament between only those players who'd played the winner. So I was puzzled when there was a deep dive in to the divide and conquer approach once I'd already been convinced that there was a better way.

I would prefer to [the divide and conquer section] either at the end, or even in a footnote, as this is a dense section. Do you have any opinion about that?

I think it should go in the main text. I think it's important to discuss. The density didn't bother me either. The issue was only around the narrative flow. I want to be lead through the arguments in an order that minimises the mental leaps I have to make at each stage. Instead I felt like I was bouncing around between ideas without a clear idea of what the end goal was.

...but do you think the [commutativity] proof is out of place?

Not necessarily. I just think it needs a bit more explanation to justify its inclusion. If it's one of Alex's favorite proofs then that alone is probably justification. It's a bit odd that it comes in a chapter where there specifically is not a requirement for commutativity. The only other place in the text that mentions commutativity is Chapter 5. Perhaps you could find a way to put it in there?

Alternatively you might consider having a short 'prerequisites' section before chapter 1, where you explain some fundamental knowledge that is assumed elsewhere? I realise that's a much larger structural change and probably not justified, but I'd thought I'd mention it as a possibility.

Either way if you keep the Dirichlet proof then it probably needs a bit more explanation because I'm still not sure what the proof is trying to prove.

I can add a quote from Knuth: "it is not formulated precisely enough to qualify as an algorithm" in a footnote.

Yes, that sounds perfect.

Yes, the "bits" are going to be anything you want, provided it has an ordering relation. It's called a counter because it moves elements between slots, in the same pattern that bits carry-propagate when counting. You will see in a later section he stores linked lists in them, but integers is good for now.

Ok... so is it conceptually similar to an integer counter with an auxilliary list data structure (containing arbitrary types) that has the same number of elements as there are bits in the counter? So the state of each bit signifies whether the corresponding list element is 'occupied'?

The counter itself is indeed magic

I think that's my biggest issue with the chapter. The tone seems to be "we're doing something magic and you can't possibly understand it without seeing it". That means I've got no intuition to act as a structure off which I hang new insights as I work through the text.

At the moment it's not even clear to me whether or not we are actually trying to follow the algorithm of performing a 'tournament' to find the 'winner' element and then a smaller tournament between only the elements that were knocked out by the 'winner'. Or was that only used to derive an upper bound on the number of comparisons that are required, and the algorithm itself has been rejected because of its excessive storage requirements?

Perhaps the counter algorithm is better because it's somehow able to run both 'tournaments' concurrently? If so, it would be useful to know that up-front. Essentially, why are we bothering to do this when we've already discussed a fairly adequate-seeming algorithm near the beginning of the chapter?

Actually I've just skipped ahead a bit and I see reference to the fact that the counter's storage requirements are log(n). That seems to be a crucial thing to know, and justifies why we're going to the effort of doing something 'magic'.

There is an earlier section that says:

Whenever guys are ready to be paired we want to pair them. So if we only store only the winner at each level, we never need to store log(n) things. We can define the power of each element to be the number of games they have played.

When I read that I remember feeling that the were a lot of different ideas packed in to a very small space, and I didn't really understand what they were getting at. I just had to skip over them and hope things would become clear later.

Also, I assume that the word 'never' is a typo and should actually be 'only'?

aharrison24 commented 2 years ago

I've just gone and had a look at the Dirichlet Commutativity proof in "From Mathematics to Generic Programming" and I understand the point of it now. What I was missing with the footnote was that we are trying to prove the commutativity of multiplication for integers by considering that multiplication can be used to compute the area of a rectangle and that the result is invariant to whether you do base x height or height x base.

The key part I was missing was that it was specifically trying to prove something about multiplication of numerical quantities, rather than about commutativity in general.

My suggestion would be that you write it something like this:


Informally, f does the same thing, regardless of the order of the inputs. For example, multiplication of integers is commutative:

a * b = b * a

In Chapter 9.1 of “From Mathematics to Generic Programming”, Alex mentions a neat geometric argument from Dirichlet, showing that multiplication of non-negative real numbers is commutative.

It begins with the observation that rectangles are defined by their height and width. If you turn a rectangle by 90 degrees then the height and width are reversed, but the area is unchanged. Or as Dirichlet put it "Whether you arrange soldiers in rows or columns, you still have the same number of soldiers":

* * * * *       * * *
* * * * *   =   * * *
* * * * *       * * *
                * * *
                * * *

An example of an operation which is not commutative is string concatenation:

"Hello, " + "World!" != "World!" + "Hello, "
aharrison24 commented 2 years ago

Alex doesn't actually say anything about 'non negative real numbers'. I added that because the argument doesn't make sense for rectangles with negative side lengths.

aharrison24 commented 2 years ago

In writing up my thoughts about Chapter 10, I've actually solved a number of my confusions. For example with the Dirichlet commutativity footnote the information was actually all there in the text, but I missed the point the first time I read it.

I once went to a course run by Jon Jagger and Kevlin Henney about 'deliberate practice' in Software Development. One concept they talked about was "How many calories you have to burn to understand a piece of code". All code is understandable if you spend long enough staring at it, but it's certainly beneficial to reduce the amount of effort it takes if you can -- future maintainers will be grateful you did!

I feel a similar argument can be applied to Chapter 10. All the information is there, but it currently takes more energy than necessary to extract it.

justinmeiners commented 2 years ago

I will get back to this soon, thanks.