Maximally Powerful, Minimally Useful #12

runarorama opened this issue 9 years ago

runarorama commented 9 years ago

This issue is for comments on http://blog.higher-order.com/blog/2014/12/21/maximally-powerful/

Leave your comments here and they will automatically show up on the post's page.

runarorama commented 9 years ago

@copumpkin has commented that this relation seems "adjunction-flavoured". Possibly related to the Galois connection between syntax and semantics. This paper looks promising

yoavrubin commented 9 years ago

It goes back even further. Every living cell has a membrane that separates it from its environment. It gave up consuming the environment but gained life. The ability to define what is "yourself" and what is "not-yourself" gives the predictability needed to operate properly. If you generalize even further, you can say that whenever you provide more information (from membranes to regular grammars) you gain analyzability (as there's more information), but lose freedom, namely the freedom to provide any information you want (as you've already provided it).

paulp commented 9 years ago

This ties in to variance. The more specific the input type to a function, the harder that function is to use. When you set a type floor in one direction, you set a ceiling in the other.
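
A minimal Scala sketch of this floor/ceiling effect (the types and names here are just illustrations, not from the post):

```scala
object VarianceSketch extends App {
  sealed trait Animal
  final case class Dog(name: String) extends Animal

  // Function1 is contravariant in its input: Animal => String is a subtype
  // of Dog => String, so the *less* specific input is usable in *more* places.
  val describeAnimal: Animal => String = a => s"some animal: $a"
  val describeDog: Dog => String = d => s"a dog named ${d.name}"

  def greet(f: Dog => String): String = f(Dog("Rex"))

  println(greet(describeDog))     // a dog named Rex
  println(greet(describeAnimal))  // some animal: Dog(Rex) -- accepted via contravariance
  // The reverse substitution is a type error: where an Animal => String is
  // required, a Dog => String will not do. Raising the floor on the input
  // type lowers the ceiling on where the function can be plugged in.
}
```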

stevedekorte commented 9 years ago

CPUs can do any computation (highly expressive) and have replaced many categories of special-purpose computing devices. Nature also chose expressivity: biological systems can, for example, grow working extra limbs, antennae, etc. with a few gene changes. Perhaps sufficiently complex systems are not analyzable, and minimizing the effort of trying variations (expressiveness) is the more powerful approach.

tarzain commented 9 years ago

I wrote a post just last week actually on exceptionally successful startups being maximally powerful but only minimally useful: link

hausdorff commented 9 years ago

If I understand this correctly, I think a general statement and a formalization are already well-studied.

Consider: in each of these examples, we model a class of systems by defining a set of constraints that each system in that class must obey. So there is a set of constraints that all regular languages obey, and the constraints that all context-free languages obey form a strict subset of them. Right?

So, the idea that "expressiveness" and "analyzability" are opposed more or less comes down to this proposition: if you relax the constraints on a set of systems, the set of systems gets larger (i.e., it's "more expressive", since the model describes more systems), but since the constraints are relaxed, some of the propositions we normally find interesting and useful are no longer true of the entire class of systems (i.e., it's "less analyzable"). So, for example, context-free grammars can describe languages that regular grammars can't, but you've lost the useful property of decidable equivalence.

If so, this is pretty uncontroversial.

What's sort of interesting about this piece, though, is that it's hard to know how to quantify what you trade when you relax a constraint. How much expressiveness do you gain, what is the analytic loss, and what is the practical benefit? The first two questions are essentially the domain of the metamathematics community, and it is one of the main motivations behind things like Kolmogorov complexity. It would be interesting to see a serious attempt to talk about actually quantifying the loss of structure that allows you to prove things about a class of systems when you relax a constraint, even in a simple example like CFGs vs regular grammars.
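
To make the regular-vs-context-free example above concrete, here is a minimal Scala sketch (the recognizers and names are illustrative, not from the comment):

```scala
object GrammarSketch extends App {
  // Regular: describable by a finite automaton / regular expression.
  def aStarBStar(s: String): Boolean = s.matches("a*b*")

  // Context-free but not regular: recognizing a^n b^n needs unbounded counting.
  def anbn(s: String): Boolean = {
    val (as, bs) = s.span(_ == 'a')
    bs.forall(_ == 'b') && as.length == bs.length
  }

  println(aStarBStar("aaabb")) // true: any number of a's followed by any number of b's
  println(anbn("aaabb"))       // false: counts of a's and b's must match
  println(anbn("aaabbb"))      // true

  // What relaxing the constraint buys: the second language can't be expressed
  // by any regular grammar. What it costs: equivalence of regular languages is
  // decidable, while equivalence of context-free grammars is not.
}
```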

MikhailEdoshin commented 9 years ago

This reminds me of Ashby's law of requisite variety: "If a system is to be stable, the number of states of its control mechanism must be greater than or equal to the number of states in the system being controlled."

tel commented 9 years ago

@hausdorff @runarorama I think this principle is exactly the consistency/completeness tradeoff in logic. It shows up in all kinds of places, though. I also find flavors of it in looking at things in a category "from the initial side" versus "from the final side".

From the logic point of view, you have a tension between laws and models. A larger number of laws means more reasoning power at the expense of fewer models (possibly 0 in which case you're inconsistent). A larger number of models means greater realizability but fewer laws (possibly 0 in which case you've gained nothing through this exercise).

It's also a game of comparative linguistics—does this abstraction pay its way? Well, without it we have this set of laws and with it we have this new superset. Are things in balance such that the abstraction is realizable and the laws are meaningful? That's your tradeoff.
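
As a rough illustration of the laws-versus-models tension in Scala (a sketch under my own naming, not anything from the post): a lawless structure admits more instances, a lawful one admits fewer but supports more reasoning.

```scala
object LawsVsModels extends App {
  // Magma: one operation, no laws. Lots of models, little to reason with.
  trait Magma[A] {
    def combine(x: A, y: A): A
  }

  // Monoid: the same operation plus associativity and an identity.
  // Fewer models, but the laws buy reasoning power: folds can be regrouped
  // (and so, e.g., parallelised) without changing the result.
  trait Monoid[A] extends Magma[A] {
    def empty: A
    // Laws (documented, not compiler-enforced):
    //   combine(x, combine(y, z)) == combine(combine(x, y), z)
    //   combine(empty, x) == x == combine(x, empty)
  }

  val intSub: Magma[Int] = (x, y) => x - y // a model of Magma, not of Monoid
  val intAdd: Monoid[Int] = new Monoid[Int] {
    def empty = 0
    def combine(x: Int, y: Int) = x + y    // a model of both
  }

  // Associativity means either grouping of the fold agrees:
  val xs = List(1, 2, 3, 4)
  println(xs.foldLeft(intAdd.empty)(intAdd.combine))  // 10
  println(xs.foldRight(intAdd.empty)(intAdd.combine)) // 10
}
```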

Actors are an interesting thing to talk about since they "have no laws". This would make me want to say that they can capture every possible program you could write without using them. This is of course possible in the case of using a single actor, but not the case if two actors are being used non-trivially. I'd suggest that this means that there aren't zero laws for Actors—just perhaps very few weak ones.

You could probably see a lot of this studying Lawvere theories as well, which, I think, will put you neck deep in the language/meaning adjunctions crowd.

runarorama commented 9 years ago

@stevedekorte It's true that CPUs (or more generally, universal Turing machines) replace many special-purpose machines. But even there, a reduced instruction set is easier to reason about than a complex one, paving the way for things like predictive pipelining. It also vastly simplifies compiler design. At another level, you can also view the CPU as a restriction: instead of the "expressivity" of being able to target arbitrary special-purpose machines, you confine yourself to one class of machine.

The same is true in biological systems. I wouldn't say "biological systems" can add limbs or antennae at will; that's unnecessarily vague. It's populations that do this. And they are able to do this because the individual organism is generated by a program (its genome) encoded in strings drawn from a set of four instructions. This restriction gives the higher-level system (the population) a great deal of flexibility. But even there we have a restriction: a population can't mutate completely at random or exactly as necessary. It's confined to exploring its neighboring genotype space along paths that map to stable configurations in phenotype space. But this restriction is a boon at an even higher level, that of the evolution of species and their ability to fill niches and populate new environments.

@tel I think the consistency/completeness tradeoff is an apt one. Thanks for that. But the principle holds even within complete logics, and within consistent logics. Take the actor example. A message-passing concurrent model is "more expressive" than, say, a strict functional model. Both of these models are Turing-equivalent, but the strict functional model is simpler to reason about (and more compositional).
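
A tiny Scala illustration of that reasoning gap (my own sketch, using mutable state as a stand-in for message passing): with pure functions, factoring out a repeated subexpression is behaviour-preserving; with hidden state it is not.

```scala
object ReasoningGap extends App {
  def f(x: Int): Int = x * 2

  // Referential transparency: these two programs are interchangeable.
  val direct   = f(21) + f(21)
  val factored = { val y = f(21); y + y }
  assert(direct == factored) // both 84

  // With hidden state (standing in for message passing), the same
  // refactoring changes behaviour: each call advances the state.
  var counter = 0
  def next(): Int = { counter += 1; counter }

  val statefulDirect = next() + next()             // 1 + 2 == 3
  counter = 0
  val statefulFactored = { val y = next(); y + y } // 1 + 1 == 2

  println((direct, factored, statefulDirect, statefulFactored))
}
```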

tel commented 9 years ago

@runarorama I want to say there's a way to formalize the idea that the message-passing model is a larger theory and thus has more of a burden to have laws, but eschews them in exchange for additional models. This idea of "language extension" feels a bit like what happens when you extend a type theory with specific new type formers.

emil-mi commented 9 years ago

This is related: Gödel's incompleteness theorems

oranda commented 9 years ago

You might be interested in Pirsig's Levels of Quality. See the book Lila. There are many examples of how a weakness at one level may be a strength at the next level up, and vice-versa.

tac-tics commented 9 years ago

An example probably less familiar to programmers but interesting and likely related is the notion of exactness in homological algebra. I'll give a short sketch of the idea that might be accessible if you have a year or two of linear algebra:

A matrix is a way to represent a linear map from R^n to R^m. Every matrix has two associated spaces: the nullspace, which is the subspace of R^n (the inputs) consisting of the vectors that give the zero vector when multiplied by the matrix, and the columnspace, which is the subspace of R^m (the outputs) consisting of the vectors you can actually get as outputs.

Take my word for it when I say that many problems in mathematics boil down to having a sequence of matrices M1, M2, M3, ... with the very special property that if you take any two adjacent matrices and multiply them, the result is the zero matrix. Such a sequence is called a chain complex.

What the "two adjacent matrices give zero" property really means is that the columnspace of the first matrix is contained in the nullspace of the second. That is, if you multiply a vector by the first matrix, the result is a vector that gives you zero when you hit it with the second matrix. In summary: the columnspace is contained in the nullspace.

There's a possibility that the columnspace isn't merely contained in the nullspace, but is actually equal to it. That is, the outputs of multiplication by the first matrix are exactly the inputs that give you zero when you multiply by the next one. If a chain complex has this exactness property, we call it an exact sequence.

Now to get to my point. If you find yourself working with an exact sequence, there's definitely a feeling of this minimum-maximum adjointness going on. The "wider" the columnspace of one matrix, the wider the nullspace of the next one. But because of the rank-nullity theorem, the wider the nullspace of that one, the "narrower" the columnspace it has.
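
Restating that in symbols (a standard formulation, not from the original comment), for adjacent maps M_k and M_{k+1}, with M_{k+1} mapping R^n to R^m:

```latex
\[
  M_{k+1} M_k = 0
  \iff
  \operatorname{im}(M_k) \subseteq \ker(M_{k+1})
  \qquad \text{(chain complex)}
\]
\[
  \operatorname{im}(M_k) = \ker(M_{k+1})
  \qquad \text{(exactness)}
\]
\[
  \dim \ker(M_{k+1}) + \dim \operatorname{im}(M_{k+1}) = n
  \qquad \text{(rank--nullity)}
\]
```

So in an exact sequence, a wider image for M_k forces a wider kernel for M_{k+1}, and rank-nullity then forces a narrower image for M_{k+1}.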

Unlike the examples given in this blog post, the homological example here does a little more than split the world into "harder for the producer, easier for the consumer", because it also seems to comment on the consumer's consumer (who of course acts a lot like the producer, and in the case of callback-based frameworks, is literally the same).

Back to programmers and reality. I noticed this phenomenon myself a while ago, and I've come to like explaining it to people like this: "One man's freedom is another man's responsibility". If you are free to pass crap to an API, the callee is the one responsible for processing it. If you have the freedom to produce crap, on the other hand, your consumer has the responsibility to make sense of it.
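
A small Scala sketch of "one man's freedom is another man's responsibility" (the API shapes and names here are hypothetical):

```scala
import scala.util.Try

object FreedomVsResponsibility extends App {
  // Caller freedom, callee responsibility: the callee must cope with anything.
  def parsePermissive(input: Any): Option[Int] = input match {
    case i: Int    => Some(i)
    case s: String => Try(s.toInt).toOption
    case _         => None // the callee pays for the caller's freedom
  }

  // Caller responsibility, callee freedom: the type forces the caller to
  // produce something well-formed up front, and the callee may assume it.
  final case class Port(value: Int) // assume only ever built via Port.fromInt
  object Port {
    def fromInt(i: Int): Option[Port] =
      if (i >= 0 && i <= 65535) Some(Port(i)) else None
  }
  def bind(port: Port): String = s"listening on ${port.value}" // no checks needed

  println(parsePermissive("8080"))      // Some(8080)
  println(Port.fromInt(8080).map(bind)) // Some(listening on 8080)
}
```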

jonschoning commented 9 years ago

An example (somewhat) from the business world: integrated vs modular.

For example, Apple (iOS) is integrated: they have complete freedom to design their hardware and software stack from the ground up. This gives them full expressiveness.

In contrast, Google (Android) is modular, and they integrate/source components from commodity vendors. It's less expressive, and perhaps a degraded experience compared to Apple, but much more modular.

The interesting bit is that the traditional wisdom in business is "in the end modular approaches to technology always defeat integrated approaches", although how this will play out in Apple's case is hard to predict, because I suppose Apple is kind of special.

However, if we apply the traditional business wisdom to the software domain here, I wonder whether the same holds: that the expressive/integrated approach is more powerful initially (where the modular approach perhaps wouldn't be able to get "off the ground"), but over time it loses ground to the constrained/modular approach as the modular benefits are allowed to scale to their full potential.

http://stratechery.com/2013/clayton-christensen-got-wrong/

runarorama commented 9 years ago

Some people on Twitter pointed out the connection to Hegelian dialectics and Lawvere's work in mathematically formalizing these:

http://ncatlab.org/nlab/show/Hegel%27s+%22Logic%22+as+Modal+Type+Theory
http://ncatlab.org/nlab/show/William+Lawvere

hxa7241 commented 9 years ago

I was also beginning to collect these a while ago, e.g.: a module cannot be both completely described and reusable -- http://www.hxa.name/notes/note-hxa7241-20110918T1016Z.html ; a language cannot be both fully powerful and clear and predictable -- http://www.hxa.name/notes/note-hxa7241-20111113T1241Z.html .

The most fundamental essence of software -- abstraction -- has a somewhat contradictory form, and that seems to be (near) the root of all this. An abstraction unites something fixed with something varying: when you talk about an abstraction you grasp, on the one hand, a single fixed thing, but at the same time you are considering the freedom of the things it encompasses. An abstraction must have looseness, it must leave something out, otherwise nothing could vary; but it must also not be entirely vague, otherwise it would be saying nothing.

It seems intriguing ... but if we believe it, surely we should apply it to itself, and if we do, we are forced to accept that such a very general principle will inevitably be shallow and uninteresting -- instead, the rule tells us to take on the restriction of being more specific, and there find more interesting, substantial patterns/knowledge.

algermissen commented 8 years ago

Hi Rúnar,

reading through your post, I thought you might find interesting the conceptual framework that Roy Fielding lays out in the first half of his REST dissertation (https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm).

In the context of your post, I thought you might add the 'axis' of constraining the areas of variation, because that is, in my opinion, a means of reducing expressiveness without affecting usability. (REST derives much of its decoupling from absolutely minimising the areas where design activity can take place - this reminds me a lot of how mathematical patterns in functional programming limit variation.)

Jan

PS: Being a functional programming newbie, I am extremely thankful for the effort you put into explaining things.