ethereum / yellowpaper

The "Yellow Paper": Ethereum's formal specification
Creative Commons Attribution Share Alike 4.0 International
1.64k stars 508 forks source link

Separate axiomatic and theorematic statements, clarify formal proofs, utilize more commonplace language and reduce formalisms, integrate appendices into the main body and, appendicize more trivia (mathematical and otherwise.) #457

Open chronaeon opened 6 years ago

chronaeon commented 6 years ago

A formal proof or derivation is a finite sequence of sentences, each of which is an axiom, an assumption, or follows from the preceding sentences in the sequence by a rule of inference. The last sentence in the sequence is a theorem of a formal system. -- Cambridge Dictionary of Philosophy, 2nd ed.

The word "formal" or one of its derivations is used 55 times in the Yellowpaper, but it's frequently unclear why or how equations, when invoked correspond with formal proofs. A formal proof is made up of a list of axiomatic statements which, when taken together, provide the foundation for some theorematic statement(s) that derive from the truth of those axioms. If axiomatic base-statements are proven to be false, then the theorems resulting from them cannot be true. If these axiomatic statements do not exist or are unclear, then no theorem exists, ergo no formal proof exists.

The Yellowpaper as it stands does not clearly document its own formal in a way that makes them clear for the reader. On the contrary, it expands them arbitrarily throughout. This would not be such a problem if the symbolic mathematics invoked were clear and unambiguous in spite of being difficult. But the ad-hoc algebraic alphabet makes it more challenging than it needs to be simply because definitions, notations, and axioms are not laid out in clear order for the reader, giving them one-by-one before being taken for granted as evidence of proofs.

There are many ways to address this. The first and most obvious way would be to clearly distinguish axiomatic statements and theorematic statements, and clearly distinguish formal proofs from mere formal assertions. Formal assertions that serve as tautologies may not need to be stated at all, or if they do, they can be footnoted or drawn up in appendices. There should be for the readers a clear set of formal proofs (whichever ones are necessary to prove the ground-breaking concept(s) on which the Ethereum World Computer is based,) and everything else (notwithstanding arrays, stack functions, or other computer operations that need to be represented symbolically) can and should be written in plain English.

For particularly complicated proofs, visual aids can be used (see e.g. this screenshot from Donald Knuth's The Art of Computer Programming,) but are not necessary.

WhisperingChaos commented 6 years ago

Thanks for your supportive comment on #399.

I like you hope someone will review your and other critiques of the Yellowpaper and then decide to refactor it to improve its clarity and rigor.

jamesray1 commented 6 years ago

Hi @chronaeon, thanks for the feedback! Feel free to make PRs, e.g. using content from your almond paper. I have no idea how much resources the Ethereum Foundation has to dedicate to a revision of the Yellow Paper as you propose, so I can't comment on whether it will be done or who will do it, other than a volunteer.

pirapira commented 6 years ago

Does the Yellow Paper prove anything?

chronaeon commented 6 years ago

@pirapira What's the point of using a formal notation if it doesn't?

pirapira commented 6 years ago

@chronaeon to define something.

chronaeon commented 6 years ago

Why not just use words, then?

On Feb 2, 2018 2:40 PM, "Yoichi Hirai" notifications@github.com wrote:

@chronaeon https://github.com/chronaeon to define something.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ethereum/yellowpaper/issues/457#issuecomment-362685681, or mute the thread https://github.com/notifications/unsubscribe-auth/AUAJtz9LssRslSWs7uHxvhfl3q6RCndpks5tQ2SrgaJpZM4RpC7H .

jamesray1 commented 6 years ago

I guess that maths is a way of expressing meaning more concisely (and thus if you re-read it, it is easier to do so). You don't use maths only when you are proving something.

jamesray1 commented 6 years ago

While some symbols may initially be unfamiliar to some readers, you can have a glossary for that. And maths is less prone to ambiguity and misinterpretation.

chronaeon commented 6 years ago

@jamesray1 Exactly, it is as you have said: "Maths is less prone to ambiguity and misinterpretation." It is one thing to say that σ is the World State. It's another thing entirely to be able to prove that by reasoning about it.

jamesray1 commented 6 years ago

Remember that maths has evolved by convention, symbols in themselves don't mean anything, their meaning is relative to how they are conventionally used. However, σ usually means the standard deviation, but in the context of Ethereum it could be used as something else, like the world state. You don't really prove that σ is the world state, you just say: "let σ be the world state", and hope that others will adopt this so that it becomes a convention.

pirapira commented 6 years ago

The act of associating symbols to objects is not called proving, but defining.

chronaeon commented 6 years ago

Right you are @pirapira ; but if they're properly defined you will be able to use them in infinitely many proofs. If they're improperly defined you won't be able to use them in any proofs, which defeats the whole purpose of obtaining the mathematical specificity that James mentioned earlier.

For example:

Axiom: lowercase Greek Sigma (σ) is equivalent to Ethereum's Global State Axiom: uppercase Greek Upsilon (𝚼) is equivalent to Ethereum's State Transition Function Theorem: σₜ₊₁ ≡ 𝚼(σₜ, T) is equivalent to a Valid State Transition

This is the logic that confronts us in section 2. The only problem is though it seems like a solid theorem, it's not, it's actually just another axiom (it's not relevant that the words "theorem" and "axiom" are never used, because they are implied and assumed in the algebra) which leaves us with:

Axiom: lowercase Greek Sigma (σ) is equivalent to Ethereum's Global State Axiom: uppercase Greek Upsilon (𝚼) is equivalent to Ethereum's State Transition Function Theorem: Axiom: σₜ₊₁ ≡ 𝚼(σₜ, T) is equivalent to a Valid State Transition

...and we begin to see the crest of the problem: The equations and formal alphabet in the Yellowpaper, taken together, are a list of unsorted, unproven, yet nevertheless asserted, axiomatic statements. Don't you agree that that's a problem, when the whole point of a formal specification is to stringently demand exactness of language and meaning from every page? I'm only suggesting that the reader be taken through the steps of something resembling a proof instead of having to suspend disbelief and assume that every equation "just works" in the purely general terms they are given. If generalities are called for in a paper like this, I agree with @jamesray1 that natural language is more than adequate for the job.

ldct commented 6 years ago

I don't think the yellow paper should have any theorems at all. It is a specification for client implementers, compiler writers and researchers. The point of a specification is not really to prove interesting theorems but to unambiguously specify behaviour, and formalism/math exists to avoid ambiguity.

Here are some examples of formal specifications of VM-like things that have partially the same type of aim as what I think the yellow paper has (unambiguous reference for implementers and researchers)

the first three linked to above do not contain proofs, their main value is in definition. (For the last link there are some proofs, but the main value in the paper is in the definitions, not the proofs. It also leads to the question: what theorems do you think would be worth proving in the yellow paper?)

jamesray1 commented 6 years ago

@chronaeon initially I thought it would be handy to have words instead of some symbols, but there can be a glossary of symbols for such readers who may be unfamiliar with them, which I started (there is a glossary of math symbols at the end of the paper, with just one symbol at the moment, \bigvee). Additionally I could make a PR to add a link to Wikipedia's list of mathematical symbols, but eventually for better readability there should be a glossary with hyperlinks. #551. Once you are familiar with the symbols, it's easier to read and re-read.

You can have formal definitions without formal proofs.

I think if there are going to be proofs, then they would be for abstract things, such as a theorem for Turing completeness, and axioms/sub-theorems for properties of Turing completeness. But this isn't a high priority in my opinion, as Ethereum has already been implemented in different Turing complete programming languages with several clients.

Note that in section 2, those definitions are soon going to link to later definitions with #535.

fulldecent commented 6 years ago

Simply as a matter of being concise, about 80% of the word uses of "formally" can be removed with no loss of clarity in the text.

The other 20% would probably be more correct with "defined as".

jamesray1 commented 6 years ago

I don't think it matters much either way, I'm happy to leave as is, and I don't think there's any major advantage to removing it or replacing it with "defined as".