Consider renaming CLMUL to XMUL

jhauser-us commented 5 years ago

I request that instruction CLMUL, standing for "carry-less multiply", be renamed to XMUL, for "XOR multiply", meaning a multiplication where the partial products are summed by bitwise XORs instead of the usual additions. My reason is simply that I find the name "carry-less multiply" to be awkward, and I'm probably not alone. The name "carry-less multiply" appears not to be entrenched except in connection to x86 processors.

I attempted a search, and as far as I can tell, the "CLMUL" name has been adopted as part of a standard ISA only for the x86 (with SIMD instruction PCLMULQDQ). The B extension draft notes that the equivalent SPARC instruction is XMULX, officially documented as "XOR multiply". Most Web references to "carry-less multiply", "carry-less multiplication", and "carry-less product" seem to point back one way or another to Intel's CLMUL instruction. Also, there appears as yet to be no __builtin_clmul in GCC. The path should therefore be clear for us to choose the name XMUL if others agree with me it would be preferable.

cliffordwolf commented 5 years ago

My reason is simply that I find the name "carry-less multiply" to be awkward

And I find it awkward to name it anything but CLMUL.

jnk0le commented 5 years ago

The literature usually refers to it as "carry-less arithmetic", "carry-less multiply" rather than "XOR arithmetic", "XOR multiply", so CLMUL name is more justified.

jhauser-us commented 5 years ago

The literature usually refers to it as "carry-less arithmetic" [...]

Okay. Where is this literature? In a quick search, I wasn't able to find any independent sources, only those either written by Intel or specifically discussing uses for Intel's CLMUL instruction. Can you provide a reference?

For example, Wikipedia has a page for "carry-less product", but it's only supporting reference is an Intel white paper: "Intel Carry-Less Multiplication Instruction and its Usage for Computing the GCM Mode". That paper has 18 references, one of which mentions a "carry-less multiplier" in the title, but then we find it's written by the same Intel authors. For all I know (I don't know), Intel also authored the Wikipedia page.

cliffordwolf commented 5 years ago

Look, everyone I know who is doing work on this calls it carry-less multiplication in their work. I always knew it as carry-less multiplication. The RISC-V crypto workgroup is also calling it carry-less multiplication.

In the end it doesn't matter if Intel has established the terminology. It's the terminology everyone is using now and there is nothing to gain by using a different terminology that nobody is familiar with.

From what you write I feel pretty confident guessing that you have not used a lot of GF(2^n) arithmetic before. But just because you are not familiar with something doesn't mean nobody else is. So please stop wasting everyone's time.

jnk0le commented 5 years ago

I have rechecked some of the papers and yes, they look likely to be inspired by Intel, or at least they are referencing intel instructions.

If "carry-less" keyword is missing then we find just plain "multiplication" name (implicitly in finite field), sometimes a "carry-free arithmetic,", "carry-free multiplication" (like [1] from 2001). I don't think that anybody would like the CFMUL (carry-free) or FFMUL (finite field) names, if we wanted to go full NIH.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.9819&rep=rep1&type=pdf

cliffordwolf commented 5 years ago

I don't thik that anybody would like the CFMUL (carry-free) or FFMUL (finite field) names, if we wanted to go full NIH.

Yes.

FFMUL would be highly misleading imo because finite fields are GF(P^N) for any prime P and any positive integer N, but the carry-less multiply instruction is for GF(2^N) only. (This is not just hypothetical. There are a lot of finite field applications that use a large prime P and N=1, and they just use (bigint) multiply and modulo.)

I don't think there would be any advantage to naming it CFMUL over CLMUL, considering they essentially are saying the same, just using different synonyms for "without", and CLMUL is the name that's already being used by virtually everyone today.

jhauser-us commented 5 years ago

jnk0le wrote:

I have rechecked some of the papers and yes, they look likely to be inspired by Intel, or at least they are referencing intel instructions.

Thanks for taking the time to review.

Clifford:

CLMUL is the name that's already being used by virtually everyone today.

Maybe so, but here's what I don't like about the name:

The "carry-less" name arises as an answer to the question, how does this multiplier differ from an ordinary multiplier? But that's an implementation concern, something only hardware implementors should worry about. For users, the name of a new instruction ought to answer a different question: What operation does this instruction do for me?

Look at all the applications people have for CLMUL, from CRCs to Galois/Counter Mode (GCM). When people describe those applications in the abstract, they never speak of carry-less additions; it's always XORs. "Carry-less" only comes into the picture when people start implementing their applications using the CLMUL instruction.

I'm sure you know, RISC-V already has an instruction for "carry-less add". What's it called? XOR, i.e., bitwise XOR. In the history of computers, has there ever been a CLADD instruction? Why not? People could have thought of bitwise XOR as carry-less addition, but somehow they didn't.

I had basically the same argument earlier concerning an instruction called PAUSE (x86's name) versus YIELD (ARM's name), which I also lost. The name YIELD is easily a better fit for what the instruction does, no doubt why ARM chose it, much as SPARC went with "XOR multiply" instead of following Intel's lead with CLMUL. But our guys have insisted on calling the RISC-V instruction PAUSE, because, you know, x86 is ubiquitous and Must. Follow. The. Crowd.

Fortunately, this issue was summarily closed to avoid endangering others with any subversive ideas. Good work Clifford! Back to the re-education camp for me.

jhauser-us commented 5 years ago

For the CLMUL notes, you might add that the Texas Instruments C6000 series had an XORMPY (XOR multiply) instruction in 2008.

pdonahue-ventana commented 5 years ago

On Thu, May 30, 2019 at 11:02 AM John Hauser notifications@github.com wrote:

I'm sure you know, RISC-V already has an instruction for "carry-less add". What's it called? XOR, i.e., bitwise XOR. In the history of computers, has there ever been a CLADD instruction? Why not? People could have thought of bitwise XOR as carry-less addition, but somehow they didn't.

Alpha defines these instructions: ANDLogical Product BIC Logical Product with Complement BIS Logical Sum (OR) EQV Logical Equivalence (XORNOT) ORNOT Logical Sum with Complement XOR Logical Difference

I had basically the same argument earlier concerning an instruction called PAUSE (x86's name) versus YIELD (ARM's name), which I also lost. The name YIELD is easily a better fit for what the instruction does, no doubt why ARM chose it, much as SPARC went with "XOR multiply" instead of following Intel's lead with CLMUL. But our guys have insisted on calling the RISC-V instruction PAUSE, because, you know, x86 is ubiquitous and Must. Follow. The. Crowd.

Fortunately, this issue was summarily closed to avoid endangering others with any subversive ideas. Good work Clifford! Back to the re-education camp for me.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/riscv/riscv-bitmanip/issues/24?email_source=notifications&email_token=ALVQ7MKSIWLYNVIQXCNFTZLPYAJBZA5CNFSM4HQSP53KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWTBUFY#issuecomment-497424919, or mute the thread https://github.com/notifications/unsubscribe-auth/ALVQ7MLOJJ56AWSPRJZFFYDPYAJBZANCNFSM4HQSP53A .

cliffordwolf commented 5 years ago

When people describe those applications in the abstract, they never speak of carry-less additions; it's always XORs. "Carry-less" only comes into the picture when people start implementing their applications using the CLMUL instruction.

That's simply a gross misrepresentation.

In a mathematical context people just use the terms "product" and "sum", "multiplication" and "addition", without any XOR, or Carry-Less, or whatever prefix added. It's clear what operation is meant from the context that defines the algebraic structure that is being are acting on. What assembly instructions those operations map to is of no concern in such texts. But here are some examples:

If it's integers then you'd simply use MUL and ADD.

If the algebraic structure if GF(P) then multiplication would be MUL followed by MOD and addition would by ADD followed by MOD.

If the algebraic structure is binary polynomials then multiplication would be CLMUL and addition would be XOR.

If the algebraic structure is GF(2^N) then multiplication would be CLMUL, followed by polynomial division that's usually implemented using a Barret reduction, which is two more CLMUL instructions. Likewise addition would be XOR followed by a Barret reduction, i.e. one XOR and two CLMUL.

You could even define something like an additive ring that's using CLMUL in the implementation of its addition operator.

In a non-mathematical "bitmanipulation hacks" context people usually don't talk at all about what the corresponding "addition" operator would be.

In the history of computers, has there ever been a CLADD instruction? Why not? People could have thought of bitwise XOR as carry-less addition, but somehow they didn't.

Seriously!? That's your argument?

Of course people have though of XOR as carry-less addition for a long time! Where do you even get those ideas? The standard phrase describing a XOR gate that you find in many textbooks is "modulo 2 addition", i.e. a half-adder without the carry output.

But the established names for binary gates are AND, OR, XOR, etc. and therefore it would be tremendously stupid to call the instruction anything but XOR when the term most people are familiar with is XOR. Likewise it would be tremendously stupid to call the CLMUL instructions something else when the name people are most familiar with is CLMUL.

(Yes, I know that EOR is also relatively common both for the gate and for the bitwise operator.)

I find your Gish galopping offensive. You are wasting my time. It might just take you a few seconds to come up with those nonsensical statements, but it takes me significantly longer to debunk them. Just for you to completely ignore everything I wrote and follow up with the next half-truth or misrepresentation. I suspect that you are smart enough to be fully aware of all that and are using this as a deliberate strategy. And to that I take offense.

I had basically the same argument earlier concerning an instruction called PAUSE (x86's name) versus YIELD (ARM's name), which I also lost. The name YIELD is easily a better fit for what the instruction does, no doubt why ARM chose it, much as SPARC went with "XOR multiply" instead of following Intel's lead with CLMUL. But our guys have insisted on calling the RISC-V instruction PAUSE, because, you know, x86 is ubiquitous and Must. Follow. The. Crowd.

Yes. I'm sure there is a grand conspiracy.

For all I know (I don't know), Intel also authored the Wikipedia page.

An even if they did, you still failed to produce any argument why that would be bad. And why it would be bad to use a well-established name just because you think Intel had a hand in creating the name.

If you google for CLMUL you'll find that wikipedia page, blog posts, scientific papers, and all kinds of other informative materials about what can be done with this instruction. That's the best argument for using this name, not against it.

Fortunately, this issue was summarily closed to avoid endangering others with any subversive ideas. Good work Clifford! Back to the re-education camp for me.

Sure, you and your subversive plot of sticking it to Intel by not using "their" names. The only thing standing between you and your victory over the faceless corporate oppressors is my unfair censorship of your revolutionary ideas.

But very well. Let's waste more peoples time. I'll re-open the issue and point the mailing list to this.

allenjbaum commented 5 years ago

I do have to agree that spending a lot of time arguing about a mnemonic - not the semantics, but the mnemonic, is a waste of time.

I really don't care much about the pedigree; I care more about readability and understandability. Frankly, I don't like either choice because I don't think either would be understandable to someone who comes across it for the first time. (e.g.The "CL" could be mistaken for compressed, Low order, the "X" could be mistaken as eXtended.)

Just to be clear - I don't have a better choice either.

But if pedigree makes it more understandable, (e.g. Translation Lookaside Buffer -huh? - vs. Translation Cache. But, everybody knows what a TLB is, so that acronym is what is pretty universally used) then pedigree should be an important consideration of a mnemonic.

That's different than what would be important for the semantics of the instruction. But arguing about this is a total waste of time, and won't advance the adoption of RISC-V or the B-extension in the slightest - and that's where your efforts should be focused.

jhauser-us commented 5 years ago

Allen Baum wrote:

I do have to agree that spending a lot of time arguing about a mnemonic - not the semantics, but the mnemonic, is a waste of time.

The group is well within its rights to decide on that basis, if that's what the majority wants. I have no vote in the matter.

I maintain, however, there was no need to immediately close this issue with hostility in order to reach such a conclusion. That's happened more than once, and I ask that the temptation be resisted in the future. Contrary feedback should not be viewed as a threat.

The title of this issue is to "consider renaming" an instruction. I believe I asked politely. If the group considers the suggestion but rejects it, the matter can be appropriately closed. That's how it should work.

jhauser-us commented 5 years ago

Allen Baum wrote:

Frankly, I don't like either choice because I don't think either would be understandable to someone who comes across it for the first time. (e.g.The "CL" could be mistaken for compressed, Low order, the "X" could be mistaken as eXtended.)

XORMUL perhaps?

allenjbaum commented 5 years ago

I certainly like it better than XMULX. Here we're trading off readability from (possibly, but not me) what others who have encountered this before would find familiar. I don't have strong feelings either way.

On Fri, May 31, 2019 at 12:10 PM John Hauser notifications@github.com wrote:

Allen Baum wrote:

Frankly, I don't like either choice because I don't think either would be understandable to someone who comes across it for the first time. (e.g.The "CL" could be mistaken for compressed, Low order, the "X" could be mistaken as eXtended.)

XORMUL perhaps?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/riscv/riscv-bitmanip/issues/24?email_source=notifications&email_token=AHPXVJVRNCWZWRF3GB7WU53PYFZ3DA5CNFSM4HQSP53KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWWEB3A#issuecomment-497828076, or mute the thread https://github.com/notifications/unsubscribe-auth/AHPXVJT6NADRBAMZ27G5MR3PYFZ3DANCNFSM4HQSP53A .

riscv / riscv-bitmanip

Consider renaming CLMUL to XMUL #24