dotnet / csharplang

The official repo for the design of the C# programming language
Proposal: New operator %% for positive-result Modulus operations #1408

Closed aaronfranke closed 3 years ago

aaronfranke commented 6 years ago

Note: The title is slightly incorrect because it's hard to describe the behavior completely in a short title. The idea is for a %% b to fall in the range [0, b) when b is positive and (b, 0] when b is negative, so a %% -3 would fall in (-3, 0]. As usual, the pattern holds that as a increases, so does the output. However, by far the most common case is for b to be positive, and in that case the title is correct.

Currently, C# has the operator % for the Remainder operation. This is different from the canonical Modulus when it comes to negative numbers. For example, -5 % 8 is -5 with the Remainder operation but it is 3 with the Modulus operation (proposed syntax: -5 %% 8 returns 3).

Currently, I implement Modulus on top of Remainder in my program like this:

public static float Mod(float a, float b)
{
    float c = a % b;
    if ((c < 0 && b > 0) || (c > 0 && b < 0)) {
        c += b;
    }
    return c;
}
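The same adjustment carries over to integer operands. The following is only a sketch: `ModExamples` and the `int` overload of `Mod` are hypothetical names for illustration, not part of the BCL or of the proposal itself.

```csharp
using System;

public static class ModExamples
{
    // Hypothetical integer counterpart of the float helper above.
    // The result takes the sign of the divisor b (or is zero).
    public static int Mod(int a, int b)
    {
        int c = a % b; // C# remainder: the sign follows the dividend a
        if ((c < 0 && b > 0) || (c > 0 && b < 0))
        {
            c += b; // move the result into the divisor's sign range
        }
        return c;
    }
}
```

With this sketch, `ModExamples.Mod(-5, 8)` returns 3, while `-5 % 8` evaluates to -5.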

A new operator, %%, would serve the following purposes:

Having only the wrong operator in the language means that the people will tend to choose it over some cryptic library function for their implementation, and will have bugs in their code. Having both operators will make people aware of the problem. So the advantage in bringing the alternate operator to the language is to promote better coding without sacrificing the C legacy.

A few use cases from the replies below:

TheUnlocked commented 6 years ago

I like the idea of a true modulo operator. I'm not sure %% is the right operator for this task (the doubled remainder symbol could imply that it's a logical operator, like with & and && or | and ||), but some operator should be made to do this.

Joe4evr commented 6 years ago

The title of this proposal doesn't make any sense. The entire point of the modulo operation is to get the remainder that is produced from long-dividing two numbers, so there is no such thing as "true modulo" that isn't obtaining that remainder.

This is different from the Modulus when it comes to negative numbers.

According to Wikipedia, there is no rigid mathematical specification of how modulo should work with negative numbers.

When either a or n is negative, the naive definition breaks down and programming languages differ in how these values are defined.

aaronfranke commented 6 years ago

Let me be specific then: I want a Modulus operator which returns between 0 (incl) and divisor (excl).

Also, Microsoft's own definition of the % is "Remainder" not "Modulus" in their C# docs.

Joe4evr commented 6 years ago

I want a Modulus operator which returns between 0 (incl) and divisor (excl).

Then call Math.Abs() on the result. If you look at the table on the Wikipedia article, the result in C# is defined as having the same sign as the Dividend (left-hand argument).

mikedn commented 6 years ago

I want a Modulus operator which returns between 0 (incl) and divisor (excl).

And you want this operation for what exactly? What is the mathematical meaning of this operation? What code is a %% b supposed to generate?

aaronfranke commented 6 years ago

@Joe4evr That does not do what I want, at all. The output value should loop around the bounds. -1 % 5 is 4 with a true Modulus, -1 with Remainder, and 1 with Remainder + Math.Abs().
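The three results mentioned here can be checked side by side. This is a minimal sketch; `RemainderVsModulo` and `Compare` are hypothetical names, and the wrap-around formula assumes a positive divisor.

```csharp
using System;

public static class RemainderVsModulo
{
    // Returns (remainder, Math.Abs of the remainder, wrapped modulus) for a and b.
    // Assumes b > 0; the wrapped form is the usual "add b if negative" work-around.
    public static (int Rem, int Abs, int Mod) Compare(int a, int b)
    {
        int rem = a % b;              // sign follows the dividend
        int abs = Math.Abs(rem);      // mirrors the value, does not wrap it
        int mod = rem < 0 ? rem + b : rem; // wraps into [0, b)
        return (rem, abs, mod);
    }
}
```

For `Compare(-1, 5)` the tuple is (-1, 1, 4): only the last value continues the 0, 1, 2, 3, 4 cycle below zero.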

As I've already stated the work-around is to add the second value if the result is negative. But this is not trivial (well, not as trivial as an operator), and there are compiler optimizations only possible with a true Modulus operation.

aaronfranke commented 6 years ago

Example use case: Let's say you divide a coordinate system into pieces of 100. If you are at position 5, you are at subposition 5 of piece 0. If you are at 230, you are at subposition 30 of piece 2. If you are at position -20, you are at subposition 80 of piece -1.

The reason you're not at subposition -20 or 20 is because the subpositions should always be positive AND increasing when position is increasing. If we were at subposition -20 of piece 0 then piece 0 would be two times as large as any other piece.

I am sure there are many, many more uses for Modulus, but this is what I have in mind.
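The coordinate example can be sketched as follows. `Pieces`, `PieceSize`, and `Locate` are hypothetical names chosen for illustration; the piece size of 100 comes from the example above.

```csharp
using System;

public static class Pieces
{
    public const int PieceSize = 100;

    // Splits a position into (piece, subposition) so that the
    // subposition is always in [0, PieceSize).
    public static (int Piece, int Sub) Locate(int position)
    {
        int piece = position / PieceSize; // truncates toward zero
        int sub = position % PieceSize;   // may be negative
        if (sub < 0)
        {
            sub += PieceSize; // wrap the subposition into range
            piece -= 1;       // and step back to the previous piece
        }
        return (piece, sub);
    }
}
```

Under these assumptions `Locate(5)` gives (0, 5), `Locate(230)` gives (2, 30), and `Locate(-20)` gives (-1, 80).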

mikedn commented 6 years ago

I am sure there are many, many more uses for Modulus, but this is what I have in mind.

Are there really so many uses to justify adding a new operator? What's wrong with adding a Mod method to System.Math?

aaronfranke commented 6 years ago

Of course that's up for debate, but I would argue that a new operator is a good idea.

I would also support adding a Mod method to System.Math.

aaronfranke commented 6 years ago

I changed the title to better reflect the issue, but technically it's wrong.

I couldn't fit all this into a short title, but hopefully it gets the point across.

mikedn commented 6 years ago

A programmer may write a % b but someone reviewing the code may not be sure if they wanted a Modulus and never tested negative numbers, or if they did indeed want a Remainder. A method such as Math.Rem would be unambiguous. So too would be Math.Mod and %% (hard to write two % by accident).

It probably makes more sense to do what Java does - add FloorDiv and FloorRem to System.Math, especially if you consider your own example with the coordinate system.

jnm2 commented 6 years ago

@aaronfranke I'm afraid the terminology you're using doesn't seem to be ideal. If people search the term ‘modulus,’ they'll discover that it's a synonym for ‘absolute value.’ Since we're talking about modular arithmetic, the term I've always heard is ‘modulo.’ It seems like the less ambiguous term would be better. (FloorRem is also quite a communicative name.)

Also, specifically, you're looking for the Euclidean variant of the modulo operation. (Or possibly the floored division variant? Can't tell.) C# already has the truncated division variant.
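To make the distinction between the variants concrete, here is a rough sketch of all three built on top of C#'s `%`. The class and method names are hypothetical; none of them exist in `System.Math`.

```csharp
using System;

public static class ModuloVariants
{
    // Truncated division variant: what C#'s % already does.
    // The sign of the result follows the dividend a.
    public static int Truncated(int a, int b) => a % b;

    // Floored division variant: the sign of the result follows the divisor b.
    public static int Floored(int a, int b)
    {
        int r = a % b;
        return (r != 0 && (r < 0) != (b < 0)) ? r + b : r;
    }

    // Euclidean variant: the result is always in [0, |b|).
    public static int Euclidean(int a, int b)
    {
        int r = a % b;
        return r < 0 ? r + Math.Abs(b) : r; // Math.Abs throws for int.MinValue
    }
}
```

The variants agree when both operands are positive and differ otherwise: for a = 7, b = -3 they give 1, -2, and 1 respectively, and for a = -7, b = 3 they give -1, 2, and 2. The `Mod` helper in the opening post behaves like the floored variant.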

Joe4evr commented 6 years ago

It turns out Eric Lippert blogged about this exact thing back in 2011.

aaronfranke commented 6 years ago

@jnm2 I want what Eric Lippert calls "Canonical Modulus". I don't care if Modulus has conflicting definitions in the mathematical world (nobody in programming calls absolute value "modulus"). I just want a math operator.

CyrusNajmabadi commented 6 years ago

I don't care if Modulus has conflicting definitions in the mathematical world. I just want a math operator.

Can you not just provide that operator in a library? Why does the language need first class support for it?

aaronfranke commented 6 years ago

I already have a method for implementing it, but I think it should be in the language because it's very useful and I am not the only one who would like having this functionality.

theunrepentantgeek commented 6 years ago

I think it should be in the language because it's very useful and I am not the only one who would like having this functionality.

Anecdotally, I'm sure you believe this to be true. My own experience tells me the opposite.

Without hard data from a significant percentage of the C# userbase, neither of us can be sure.

aaronfranke commented 6 years ago

@TheUnlocked

I'm not sure %% is the right operator for this task (the doubled remainder symbol could imply that it's a logical operator, like with & and && or | and ||)

There are already other doubled operators which are not boolean logic, such as ++.

aaronfranke commented 6 years ago

@theunrepentantgeek Here is some data from some users: https://stackoverflow.com/questions/10065080/mod-explanation People are confused when they expect positive-result (really sign-of-divisor) answers and get negative answers in C#, and compare to languages like Python where positive-only (really sign-of-divisor) is the standard.

Judging by "Viewed 68k times", and the amount of upvotes (remember that not every viewer of the question would upvote it), I would estimate that this is a fairly common problem. Providing an operator for canonical Modulus in C# would help these users out.

CyrusNajmabadi commented 6 years ago

Providing an operator for canonical Modulus in C# would help these users out.

Having the functionality be available would certainly be useful. That does not mean having the operator is necessary.

For example, there are times i need to do an unsigned right shift. But do i need it to be an operator (like how java has it)? Nope. I just switch to unsigned math, do the right shift, then switch back. Unsigned shifts are still useful. But i can live without it being an operator.
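The "switch to unsigned math, shift, switch back" approach described here might look like the following sketch. `ShiftHelpers` and `UnsignedShiftRight` are hypothetical names used only for illustration.

```csharp
using System;

public static class ShiftHelpers
{
    // Logical (zero-filling) right shift of a signed int,
    // done by round-tripping through uint.
    public static int UnsignedShiftRight(int value, int count)
        => unchecked((int)((uint)value >> count));
}
```

For example, `-8 >> 1` is -4 because the sign bit is copied in, whereas `ShiftHelpers.UnsignedShiftRight(-8, 1)` is 2147483644 because a zero is shifted in instead.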

The same is true for this sort of mod/remainder for me. I can totally see the usefulness of it. I've even needed it at times. But that doesn't make me think i needed an operator for it. I would be totally fine with this coming in through the BCL, or just some numerics package.

aaronfranke commented 6 years ago

I would still like an operator, but, I agree with you @CyrusNajmabadi

Richiban commented 6 years ago

I remember in my early days of programming being really confused as to why my code wasn't working... and it's because everyone calls % the "mod operator" (incorrectly so). My extension method I'd written for safe array access (i.e. indexes would wrap around)...

public static class ArrayExtensions
{
    public static T WrapIndex<T>(this T[] source, int index)
        => source[index % source.Length];
}

...works just fine for indexes greater than the size of the array but unfortunately doesn't work for negative indexes. To implement this function we need to detect whether the result came out negative and adjust it accordingly:

public static class ArrayExtensions
{
    public static T WrapIndex<T>(this T[] source, int index)
    {
        var wrapIndex = index % source.Length;

        if (wrapIndex < 0)
            wrapIndex += source.Length;

        return source[wrapIndex];
    }
}

So in my opinion there's definitely a need for this functionality. Whether it needs its own operator or not is up for debate. I'd say yes, but that's because 99% of the time I want mod not rem.


If anyone would like a visual representation of the difference between remainder and mod I've made a little diagram:

x        -10 -9 -8 -7 -6 -5 -4 -3 -2 -1  0  1  2  3  4  5  6  7  8  9 10
                          |              |              |
x % 5      0 -4 -3 -2 -1  0 -4 -3 -2 -1  0  1  2  3  4  0  1  2  3  4  0
                          |              |              |
x mod 5    0  1  2  3  4  0  1  2  3  4  0  1  2  3  4  0  1  2  3  4  0

You can see that for positive numbers the two sequences are the same, but they differ for negative numbers. As we pass through zero (approaching from the positive side) the mod numbers continue the same pattern through zero whereas the rem pattern turns negative and reflects through zero.

aaronfranke commented 6 years ago

Here is another example of when Modulus is more useful than Remainder. Let's say you're writing a program that finds out which day of the week it is. You can accomplish this by taking the amount of days and performing a Modulus by 7 on the days. Then it will always return between 0 and 6, 7 total values, for each day of the week.

Unfortunately, this doesn't work with % if you try to perform math on days of the week before your initial reference day at zero. 5 days before the time when you started counting would be -5, but -5 % 7 would return -5. There is no negative-fifth day of the week, that's just ridiculous. In reality, that would be index 2 of the -1th week, so it should return 2. -5 %% 7 would fix this problem.
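The day-of-week case can be sketched with today's operators as follows. `Weekdays` and `DayIndex` are hypothetical names, and day 0 is whatever reference day the count started from.

```csharp
using System;

public static class Weekdays
{
    // Maps a day count relative to a reference day (which may be negative)
    // to an index in [0, 6].
    public static int DayIndex(int daysSinceReference)
    {
        int r = daysSinceReference % 7; // in [-6, 6]
        return r < 0 ? r + 7 : r;       // wrap negative results into [0, 6]
    }
}
```

Here `Weekdays.DayIndex(-5)` returns 2, whereas `-5 % 7` evaluates to -5.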

mikedn commented 6 years ago

Here is another example of when Modulus is more useful than Remainder

Examples can be found but the problem is that in reality dealing with negative numbers is less common so % works just fine in many cases. And the bigger problem with these examples is that they can be used to also show that the current / division is also not suitable. Your coordinate system example shows just that - -20 / 100 == 0 but in that example you'd need it to be -1. The day of week example can also be turned into a "how many weeks" example and then division would say -5 and 5 are both in the "current week" but you'd probably want -5 to be "last week" or something like that.

It's a bit baffling why everyone keeps talking about this modulus thing, I guess it's because many programmers have heard about modular arithmetic but fewer know about flooring and truncation.

Also, many seem to miss the fact that the current remainder is defined in relation to division: a % b == a - b * (a / b). Meanwhile a %% b would be a - b * (a // b), where // is floored division.

So what's next? Propose adding // as well? Adding a single new operator is not something that will happen easily, but two? That's not going to happen. Math methods on the other hand, sure why not. They're easy to implement and the names are pretty good at communicating the meaning so there's little chance of confusion (assuming, of course, that developers know what floor is).
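The relationship between the two divisions and their remainders can be sketched as a pair of library methods. `FloorMath`, `FloorDiv`, and `FloorRem` are hypothetical names modeled on the suggestion in this thread; they are not existing `System.Math` members.

```csharp
using System;

public static class FloorMath
{
    // Division that rounds toward negative infinity instead of toward zero.
    public static int FloorDiv(int a, int b)
    {
        int q = a / b; // truncated quotient
        // Truncation rounded up if there is a remainder and the signs differ.
        if (a % b != 0 && (a < 0) != (b < 0))
        {
            q--;
        }
        return q;
    }

    // Remainder defined against floored division, in the same way that
    // a % b is defined against truncated division: a - b * (a / b).
    public static int FloorRem(int a, int b) => a - b * FloorDiv(a, b);
}
```

With this sketch, `FloorDiv(-20, 100)` is -1 and `FloorRem(-20, 100)` is 80, matching the coordinate example, while `-20 / 100` is 0 and `-20 % 100` is -20.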

aaronfranke commented 6 years ago

I'm not asking for floored division in this proposal (some languages like Python do have an operator for this: // in which -8//7 is -2), although that would be nice to have too.

It's more common to need the modulus. You may want to know what day of the week or what sub-position etc but maybe not the sur-position. %% is more important than // IMO but both are nice.

Operators are nice, but new methods in System.Math are also useful. I would be satisfied with that.

CyrusNajmabadi commented 6 years ago

You may want to know what day of the week or what sub-position etc but maybe not the sur-position.

that is not a compelling use case to need an operator for me :)

vladd commented 6 years ago

Yet another place which would profit from having this functionality available: https://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,1421

return m_comparer.GetHashCode(item) & Lower31BitMask;

Here the mask is needed only in order to workaround the remainder behavior on negative numbers.


I understand that the ship has already sailed, but from my experience remainder is almost exclusively used as modulus, so silently replacing remainder with modulus would go unnoticed by the majority (modulo some workarounds for negative divisors, which would just not be necessary any more).

mikedn commented 6 years ago

Here the mask is needed only in order to workaround the remainder behavior on negative numbers.

And what problem would removing the & solve?

CyrusNajmabadi commented 6 years ago

Yet another place which would profit from having this functionality available:

Why would you need an operator for this? Why would a library method not be sufficient?

vladd commented 6 years ago

@mikedn Removing & would a little bit speed up an operation which is likely to be called very frequently, especially in LINQ. (No actual measurements done.)

CyrusNajmabadi commented 6 years ago

@mikedn Removing & would a little bit speed up an operation which is likely to be called very frequently, especially in LINQ. (No actual measurements done.)

I'm confused. The & is to make things non-negative so that % can be called later. But if this were a perf problem and you wanted to not have the & and you wanted an operator that did the "keep things positive" remainder approach, why would you need a special operator for this? Why would you not just update m_buckets[hashCode % m_buckets.Length] to be m_buckets[Math.Whatever(hashCode, m_buckets.Length)] and be ok?

If there is a perf gain to be got, then why not just demonstrate it, and use that to encourage adoption of a user math library function for everyone to have access to? Why do you have to change the language to get this?
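The two ways of getting a non-negative bucket index can be compared in a small sketch. `Buckets`, `BucketByMask`, and `BucketByMod` are hypothetical names; `BucketByMod` stands in for the `Math.Whatever` placeholder, and no claim is made here about which is faster.

```csharp
using System;

public static class Buckets
{
    private const int Lower31BitMask = 0x7FFFFFFF;

    // Clear the sign bit first, then take the ordinary remainder.
    public static int BucketByMask(int hashCode, int bucketCount)
        => (hashCode & Lower31BitMask) % bucketCount;

    // Take the remainder first, then wrap negative results into range.
    public static int BucketByMod(int hashCode, int bucketCount)
    {
        int r = hashCode % bucketCount;
        return r < 0 ? r + bucketCount : r;
    }
}
```

Both return an index in [0, bucketCount), but they can pick different buckets for negative hash codes: for a hash code of -1 and 10 buckets, the mask version yields 7 and the wrapping version yields 9.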

vladd commented 6 years ago

@CyrusNajmabadi Well, are we talking about the reality of current C# or theoretical preference?

At the current language state, of course the same goal can be achieved by using a library function. But from theoretical point of view: assume we are designing a language from scratch and have no C legacy constraints; in this case I'd argue that the % should be modulus, not remainder, because in the majority of cases developers want the former and not the latter.

Now back to C#: if we agree on the better usefulness of modulus, then it looks pretty illogical to keep in the core language the less useful one of two, and put the more useful one into a library. So from my point of view it would be profitable to include modulus into the language, thus endorsing the usage of the more useful one.

vladd commented 6 years ago

@CyrusNajmabadi The intention of the example was only to demonstrate that the modulus is more useful than the remainder.

CyrusNajmabadi commented 6 years ago

@CyrusNajmabadi The intention of the example was only to demonstrate that modulus is more useful than remainder.

When needing to map arbitrary int32's to an unsigned index in an array... sure. :)

Now back to C#: if we agree on the better usefulness of modulus, then it looks pretty illogical to keep in the core language the less useful one of two, and put the more useful one into a library. So from my point of view it would be profitable to include modulus into the language, thus endorsing the usage of the more useful one.

Not at all. The litmus to getting in the language is that by doing so you provide a substantively better experience than just going through a library. I would argue C# didn't need % to begin with. But we're a C-lineage language, so it makes sense to at least keep the operators in line with that lineage.

If i can just go and use a library for this, then that's vastly cheaper and easier to provide than needing to go change the language. Furthermore, it benefits all .net languages, not just C#.

CyrusNajmabadi commented 6 years ago

But from theoretical point of view: assume we are designing a language from scratch and have no C legacy constraints;

I don't believe such a discussion is useful. We're not designing a language from scratch, and C# def follows the C-lineage. If someone wants to opine about good choices for future languages, that's fine. I just would not desire that discussion happen here :)

vladd commented 6 years ago

@CyrusNajmabadi

The litmus to getting in the language is that by doing so you provide a substantively better experience than just going through a library. I would argue C# didn't need % to begin with. But we're a C-lineage language, so it makes sense to at least keep the operators in line with that lineage.

I, too, don't believe that discussing impossible scenarios (like getting rid of % completely) would be useful. So we are stuck with having % in the language.

Now, what would be the advantage of having an alternate, almost duplicate operator in the language? Very simple: endorsing the right usage.

Consider the typical use cases for using %. Counting bucket number in hashtable is using %, but indeed needs modulus. Check for odd numbers (n % 2 == 1) needs modulus. [These two examples are stolen from Eric Lippert's article.] Subtracting angles in degrees: diff = (a - b) % 360 is wrong for negative difference. Getting the next index in the ring buffer works fine with both operators, whereas getting the previous index requires exactly modulus.
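A few of these cases can be sketched with the current `%` operator plus a manual correction. All names below (`UseCases`, `IsOddNaive`, `IsOdd`, `AngleDiff`, `PreviousIndex`) are hypothetical and chosen only for illustration.

```csharp
using System;

public static class UseCases
{
    // The check from the comment: misses negative odd numbers,
    // because -3 % 2 is -1, not 1.
    public static bool IsOddNaive(int n) => n % 2 == 1;

    // A remainder-safe variant that works for negative n as well.
    public static bool IsOdd(int n) => n % 2 != 0;

    // Difference of two angles in degrees, wrapped into [0, 360).
    public static int AngleDiff(int a, int b)
    {
        int d = (a - b) % 360;
        return d < 0 ? d + 360 : d;
    }

    // Previous slot in a ring buffer; adding length keeps the
    // dividend non-negative for 0 <= i < length.
    public static int PreviousIndex(int i, int length) => (i - 1 + length) % length;
}
```

For instance, `IsOddNaive(-3)` is false while `IsOdd(-3)` is true, `AngleDiff(10, 350)` is 20, and `PreviousIndex(0, 5)` is 4.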

Having only the wrong operator in the language means that the people will tend to choose it over some cryptic library function for their implementation, and will have bugs in their code. Having both operators will make people aware of the problem.

So the advantage in bringing the alternate operator to the language is to promote better coding without sacrificing the C legacy.

theunrepentantgeek commented 6 years ago

people will tend to choose it over some cryptic library function for their implementation, and will have bugs in their code

And here's the crux of it.

The one data point that I have - my own experience - is that I don't recall ever running into this problem. And I'm speaking as someone who has been programming in C-lineage languages since 1987 (at age 15).

I think there are significant downsides to adding a new operator that is almost-exactly-but-not-quite-the-same as an existing one. That's exactly the sort of thing that leads to confusion and bugs.

aaronfranke commented 6 years ago

@theunrepentantgeek By that logic, what's the point of:

These operators exist because they are useful in different situations. Sometimes & is more useful than &&, sometimes ++ is more useful than += 1, sometimes %% is more useful than %.

mikedn commented 6 years ago

Removing & would a little bit speed up an operation which is likely to be called very frequently, especially in LINQ. (No actual measurements done.)

Except that it won't for many reasons.

vladd commented 6 years ago

@mikedn Ok, this brings the issue to a different level.

I'm not an x86 assembly guru, so I don't know if modulus is directly available in x86 instruction set. However, C# is not bound to x86 architecture any more. If the appropriate processor instruction is indeed not available on x86 and must be emulated, the change won't be useful for BCL platform, but nevertheless would be useful for implementations on platforms which do support efficient modulus.

Absence of an IL instruction is not a big problem: if rem deserves its dedicated IL opcode, then so does mod.

In the same way, the number of cycles on current Intel CPUs is not something which should drive the development of the language. Even so, a gain of 1% in a tight loop is not something which should be outright neglected.


I understand your point: adding the new operator won't bring immediate noticeable performance gain on popular platforms. That's true. But my point is that this operator would help writing more expressive and correct code. [And can bring performance gain if the platform supports efficient modulus, too.]

aaronfranke commented 6 years ago

Another example use case: When computing time, there are no negative times. You can't have negative hours of the day (-3 o'clock would be the same as 9 o'clock because %% 12). You can't have negative days in a month (January -1st doesn't make sense, it should be December of previous year). % works for positive times but not for negative ones.

mikedn commented 6 years ago

I'm not an x86 assembly guru, so I don't know if modulus is directly available in x86 instruction set.

As already stated in my previous post, there are no such instructions on any CPUs (well, at least on the ones that matter - x86/x64/arm32/arm64). x86/x64 CPUs have idiv that also returns the remainder. arm64 and some arm32 CPUs have an sdiv instruction that only returns the quotient. The JIT replaces a % b with a - b * (a / b). Some arm32 CPUs don't even have an sdiv instruction; in such cases the JIT calls helper functions for both / and %.

Absence of an IL instruction is not a big problem: if rem deserves its dedicated IL opcode, then so does mod.

Are you saying that mod should be added to IL? That's simply not going to happen.

But my point is that this operator would help writing more expressive and correct code. [And can bring performance gain if the platform supports efficient modulus, too.]

Well, I was simply replying to your claim that removing & would speed up things. It won't, because %% would generate extra instructions compared to %. If you're lucky, it ends up making no difference. But it's more likely that it would actually get slower.

mikedn commented 6 years ago

Another example use case: When computing time, there are no negative times

So if there are no negative times, how is this a use case?

drdbkarron commented 6 years ago

Calling it a modulus is perhaps incorrect; I just found I needed a wrap or rotation operator to operate in n dimensions. It should rotate both + and -; the modulus fails on negative rotations because negative modulus is undefined. Rotation should never go negative, and it should enable fractional rotations and fractional rotations in fractional dimensions (which will cause instability at some angles).

sergiokoo commented 4 years ago

I've been coding for 20+ years and I never needed the version of the modulo operator which would return me the negative remainder. Grids, tables, array access, circular buffers, date wrappers - there are lots of use cases where you need a positive-only remainder. I would be very surprised to see any practical use case where a negative remainder is needed. Thus, I would vote for the %% operator or at least a corresponding method in Math.

daarong commented 4 years ago

I landed here because I was among those that believed % was modulus, not remainder... started googling since it was "buggy".

Oops. In fact I'm rather shocked that I got away with using it as modulus for so long, I'm suddenly concerned about where I may have left bugs. I have been programming in C# since its inception... it's embarrassing that I've not noticed what % really is until today when it went wrong.

I am definitely behind %% because it's shorthand.

Since we're talking use cases, mine: on setup I was auto selecting an item. The index to select may or may not be 'valid', and rarely it may be 'invalid' aka -1. In this case I just wanted it bound to a valid value, so I mistakenly used % operator on the index with list length as divisor.

kshetline commented 4 years ago

What I don't understand is why the way % works with negative numbers, as it does in C# and many, many other languages, is so common. When negative numbers get involved, in ANYTHING I've ever done with time and date calculations, graphics, astronomy, etc. I have NEVER found the default behavior of % in C# (or C, or Java, or JavaScript) at all useful. It's always a nuisance that needs a work-around for me.

I'd love to hear of a real-life use case where this common % behavior with negative numbers is actually beneficial rather than merely tolerated, because I've never run into it.

My suspicion is that the annoying but more common sign handling for % grew out of something that was easier to implement in a CPU instruction set sometime long ago, and has been propagated ever since in the name of compatibility.

I'd throw my support behind this %% proposal too. (I just learned that Python gets % right, which inspired me to do a search on this subject.)

CyrusNajmabadi commented 4 years ago

I'd love to hear of a real-life use case where this common % behavior with negative numbers is actually beneficial

Porting between languages. I don't have to think about how the behavior will change when moving between existing code in many languages and C#; they just behave the same way.

aaronfranke commented 4 years ago

@kshetline It all started with C, and most other languages copied this same behavior, including C++, C#, Java, JavaScript, GDScript, etc. It was a mistake made in the 1970s that will haunt us for centuries. A few languages have % behaving as the proposed %%, including Python and Ruby. https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages "Dividend" is the C# behavior, while "Divisor" is the proposed %% behavior (the title is slightly misleading).

kshetline commented 4 years ago

@CyrusNajmabadi, I already said "has been propagated ever since in the name of compatibility", so I recognized the portability argument. My question was clearly about what else besides compatibility does the C# implementation of % have going for it? Why did that behavior become the thing people would now want to be compatible with?

Without a good example of that kind of utility, I'm inclined to go along with what @aaronfranke just said, that it's a "mistake made in the 1970s that will haunt us for centuries".