[Proposal]: Implicit parameters #6300

Open radrow opened 2 years ago

radrow commented 2 years ago

Implicit Parameters

Summary

This proposal introduces implicit parameters for C#. It is heavily inspired by the Scala feature of the same name, as well as by similar solutions in other languages. Despite that inspiration, the aim is to keep the new functionality as simple as possible, in order to avoid the numerous flaws caused by Scala's overly complex design. The motivation is to increase the clarity and comfort of writing and refactoring code that extensively passes environmental values through the stack.

Implicit parameters are syntactic sugar for function applications. The idea is to pass selected arguments to methods without necessarily mentioning them across the code, especially where it would be repetitive and unavoidable. Thus, instead of writing

Data FetchData(CancellationToken token) {
    var request  = MakeDataRequest(token);
    var service  = GetService("main", token);
    var response = service.Send(request, token);

    if(response.IsNotOk) {
        service = GetService("fallback", token);
        return service.Send(request, token).Data();
    }

    return response.Data();
}

MakeDataRequest(CancellationToken token);
GetService(string name, CancellationToken token);

One could simplify it to something like:

Data FetchData(implicit CancellationToken token) {
    var request  = MakeDataRequest();
    var service  = GetService("main");
    var response = service.Send(request);

    if(response.IsNotOk) {
        service = GetService("fallback");
        return service.Send(request).Data();
    }

    return response.Data();
}

MakeDataRequest(implicit CancellationToken token);
GetService(string name, implicit CancellationToken token);

Note that the cancellation token (token) is provided implicitly to every call that declares it as its implicit argument. While it still needs to be declared in the function signature, the application is handled automatically as long as there is a matching implicit value in the context.

One way to look at this feature is as a counterpart to the default parameters that are already part of C#. In both concepts, some arguments are supplied by the compiler instead of the programmer. The difference is where those arguments come from: to find the source of an implicit parameter you look at the calling function's signature, whereas for a default parameter you look at the called function's signature.
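To make the comparison concrete, here is a minimal sketch in today's C#. The `implicit` keyword does not exist, so the implicit case is written in a hypothetical desugared form, assuming the compiler would simply forward the caller's parameter. `Describe`, `Step` and `Process` are made-up names used only for illustration.

```csharp
using System;
using System.Threading;

static class DefaultVsImplicit
{
    // Default parameter: when the argument is omitted, its value comes from
    // the CALLEE's signature (here: 3).
    public static string Describe(string message, int retries = 3) =>
        $"{message} (retries: {retries})";

    // Hypothetical source under the proposal:
    //     string Step(implicit CancellationToken token)
    // Assumed desugared form: an ordinary parameter.
    public static string Step(CancellationToken token) =>
        token.IsCancellationRequested ? "cancelled" : "running";

    // Hypothetical source under the proposal:
    //     string Process(implicit CancellationToken token) => Step();
    // Assumed desugared form: the omitted argument is filled in from the
    // CALLER's own parameter list.
    public static string Process(CancellationToken token) => Step(token);
}
```

Calling `DefaultVsImplicit.Describe("x")` picks up `3` from `Describe` itself, while the token that reaches `Step` is whatever was handed to `Process`.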

Motivation

Since it is just a "smart" syntactic sugar, this feature does not provide any new control flow or semantics that were not achievable before. What it offers is a more convenient way to write code in a certain style, with less boilerplate.

The promoted paradigm is to handle environment and state by passing them through the stack instead of keeping them in global variables. There are numerous benefits to designing applications this way, most notably ease of parallelization, test isolation, environment mocking, and broader control over method dependencies and side effects. This can play a crucial role in big systems handling numerous tasks in parallel, where context separation is an important security and sanity factor.

Simple CancellationToken examples like the previous one are likely to be common. The following example is more elaborate, showing a realistic implementation of a gRPC server converting image files:

void CheckCancellation(implicit ServerCallContext ctx) =>
    ctx.CancellationToken.ThrowIfCancellationRequested();

bool FileCached(string fileName, implicit ServerCallContext ctx) =>
    ctx.RequestHeaders.Get("use_cache") && Cache.Exists(fileName);

async Task ConvertJpgToPng(
  int fileSize,
  string fileName,
  implicit IAsyncStreamReader<Req> inStream,
  implicit IServerStreamWriter<Res> outStream,
  implicit ServerCallContext ctx)
{
    bool cached = FileCached(fileName); // !

    Jpg jpg = null;
    if(cached)
    {
        await RequestNoFile(); // !
        jpg = Cache.Get(fileName); // !
    }
    else
    {
        jpg = await RequestFile(); // !
    }
    CheckCancellation(); // !

    Png png = jpg.ToPng();

    await outStream.WriteAsync(new Res(){Png = png}); // !
}

async Task RequestNoFile(implicit IServerStreamWriter<Res> outStream) =>
    await outStream.WriteAsync(new Res(){SendFile = false}); // !

async Task<Jpg> RequestFile(
  implicit IAsyncStreamReader<Req> inStream,
  implicit IServerStreamWriter<Res> outStream,
  implicit ServerCallContext ctx) {
    await outStream.WriteAsync(new Res(){SendFile = true }); // !
    CheckCancellation(); // !
    Req msg = await inStream.ReadAsync(); // !
    CheckCancellation(); // !
    return msg.Jpg;
}

The code is a lot lighter and arguably cleaner than what it would look like if it passed around ctx, inStream and outStream explicitly every time. The code focuses on the main logic without bringing up the contextual dependencies, which are mentioned only in method headers. To show the impact, I marked all the places where the implicit application happens with a // ! comment.

Implicit parameters ease refactoring in some cases. Let us imagine that it turns out that RequestNoFile needs to check for cancellation, and therefore requires ServerCallContext to get access to the token:

async Task RequestNoFile(implicit IServerStreamWriter<Res> outStream, implicit ServerCallContext _) {
    await outStream.WriteAsync(new Res(){SendFile = false});
    CheckCancellation();
}

Because in the presented snippet RequestNoFile is called only from scopes with ServerCallContext provided, no other changes in the code are required. In contrast, without implicit parameters, every single call to RequestNoFile would have to be updated. Of course, if the calling context does not have that variable, it needs to get it anyway -- but if it does so implicitly as well, this benefit propagates further. This nicely reduces the complexity of adding new dependencies to routines.

Detailed design

General syntax

Since the implicit parameters appear similar to optional parameters, it feels natural to declare them in a similar manner:

void f(int x, implicit int y, implicit int z) {}

Regarding placement, it makes sense to mingle both kinds of special parameters together. A parameter could also be implicit and optional at the same time:

void f(int x, implicit int y, implicit int z = 3, int w = 4) {}

Supplying implicit arguments from non-implicit methods

In order to avoid the mess known from Scala 2, there should always be a clear way of finding the values provided as implicit parameters. Therefore, I propose letting them be taken only:

Hence this:

int f(int x, implicit int y);

int g() {
    return f(3, y: 42);
}

If supplying them manually starts getting annoying, then a possible workaround would be to lift the context with another method. So this:

void f1(implicit int x);
void f2(implicit int x);
void f3(implicit int x);

void g() {
    int arg = 123;
    f1(x: arg);
    f2(x: arg);
    f3(x: arg);
}

turns into this:

void f1(implicit int x);
void f2(implicit int x);
void f3(implicit int x);

void g() {
    int arg = 123;
    gf(arg: arg);
}

void gf(implicit int arg) {
    f1();
    f2();
    f3();
}

Overloading

Resolution rules for overloading should be no different than those for optional parameters. When a method is picked based on the context, and there is no suitable implicit parameter in scope, it should result in an error.

Nested functions

In most common cases there should be no reason to prevent local functions from using implicit parameters of enclosing methods. There are, however, two exceptions where it would not work:

A workaround for the former is to declare the static function with the same implicit parameter. That also gives a reason to have a casual shadowing regarding the latter case.

Resolution of multiple implicit parameters

The design must consider ambiguities that emerge from the use of multiple implicit parameters. Since they are not explicitly identified by the programmer, there must be a clear and deterministic way of telling which variables are supplied and in what order. A common way of tackling this is to require every implicit parameter to have a distinct type and to do the resolution based on that. It is rare that one would need multiple implicit parameters of the same type, and in that case a wrapper class or a collection (even a tuple) can be used.

There is a special case when inheritance is taken into account, as it can lead to ambiguities:

void f(implicit Animal a) {}

void g(implicit Dog d, implicit Cat c) {
    f();  // Which animal should be supplied?
}

This should result in an error, ideally pointing to all variables that participate in the dilemma. However, as long as the resolution is deterministic, there should be no issue with that. A workaround in such situations is explicit application:

void f(implicit Animal a) {}

void g(implicit Dog d, implicit Cat c) {
    f(a: d);  // That's clear
}

If that feels doubtful, it could be a configurable warning that an implicit parameter is affected by subtyping.
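The type-based rule described above can be modelled as a small lookup. This is only a sketch under stated assumptions: the real check would happen at compile time inside the compiler, whereas this version uses runtime reflection, and `ImplicitResolver`, `Resolve` and the dictionary of in-scope implicit values are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Animal { }
class Dog : Animal { }
class Cat : Animal { }

static class ImplicitResolver
{
    // Returns the name of the single in-scope implicit value whose type is
    // assignable to `required`. Throws if there is no candidate or if more
    // than one candidate matches (the Dog/Cat vs. Animal situation).
    public static string Resolve(Type required, IReadOnlyDictionary<string, Type> inScope)
    {
        var candidates = inScope
            .Where(entry => required.IsAssignableFrom(entry.Value))
            .Select(entry => entry.Key)
            .ToList();

        if (candidates.Count == 0)
            throw new InvalidOperationException(
                $"No implicit value of type {required.Name} in scope.");

        if (candidates.Count > 1)
            throw new InvalidOperationException(
                $"Ambiguous implicit value of type {required.Name}: {string.Join(", ", candidates)}.");

        return candidates[0];
    }
}
```

With `d: Dog` and `c: Cat` in scope, asking for `Dog` yields `d`, while asking for `Animal` fails and names both candidates, which is the kind of diagnostic suggested above.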

Backwards compatibility

Since I propose reusing an existing keyword, all valid identifiers shall remain valid. The only added syntax is an optional kind of parameter, which does not interfere with any current constructs, so no conflicts would arise from that either. There are also no new semantics associated with not using this feature. Thus, full backward compatibility.

Since there is a general convention to keep contextual parameters last anyway, the transition of common libraries to implicit parameters should be quite painless. That is because implicit parameters can still be passed positionally, so the following snippets should behave identically:

// Version 1.0 before implicit parameters
async void Send(Message message, CancellationToken token);

// LegacyEnterpriseSoftwareIncorporated's business logic
async void SendFromString(string s, CancellationToken token)
{
    Send(new Message(s), token);
}

and

// Version 1.1 after implicit parameters
async void Send(Message message, implicit CancellationToken token);

// LegacyEnterpriseSoftwareIncorporated's business logic
async void SendFromString(string s, CancellationToken token)
{
    Send(new Message(s), token);
}

...and of course

// Version 1.1 after implicit parameters
async void Send(Message message, implicit CancellationToken token);

// ModernStartupHardwareFreelance's business logic
async void SendFromString(string s, implicit CancellationToken token)
{
    Send(new Message(s));
}

Performance

These parameters turn into normal ones in an early phase of compilation, so there is no runtime overhead at all. Compilation time would obviously be affected, but by how much depends on the resolution algorithm. If it is kept simple (which I believe should be an achievable goal), the impact should not be very noticeable. Moreover, there is no overhead if the feature is not used.

Editor support

Since the feature would be desugared quite early, it should be easy to retrieve which arguments are applied implicitly. Thus, if some users find it confusing, I believe it would not be very hard to write a VS (Code) extension that shows the details of each implicit application, similar to the inline hints that add parameter names to method calls.

Drawbacks

Well, "implicit". This word is sometimes enough to bring doubts and protests. As much as I personally like moving stuff behind the scenes, I definitely see reasons to be careful. All that implicit magic is a double-edged sword -- on one hand it helps keep the code tidy, but on the other it can lead to nasty surprises and overall degraded readability.

One of the most common accusations against Scala is the so-called "implicit hell", which is caused by sometimes overused combination of extension classes (known there as, of course, "implicit" classes), implicit parameters and implicit conversions. I am not a very experienced Scala programmer, but I do remember finding Akka (a Scala library that uses implicits extensively) quite hard to learn because of that.

As mentioned before, there is an article by the Scala team itself that points out flaws in the Scala 2 design. I encourage the curious reader to read it as a lesson in how not to do it.

Also, there is a discussion under an unsuccessful proposal for adding this to Rust. The languages and their priorities are fairly different, but the critics there clearly have a point.

Alternatives

Resolution by name

Implicit parameters could be resolved by name instead of by type. This would allow implicit parameters to share a type and would solve all issues with inheritance, since types would not play any role here. However, it reduces flexibility, since a parameter would be tied to the same name throughout the whole flow of the code. This may make refactoring slightly harder. A counterargument is that each implicit parameter should generally describe the same thing everywhere, so keeping the same name feels natural anyway and looks like a good pattern that might be worth enforcing.

Local implicit variables

To ease resolution and reduce the amount of code, some local variables could be declared as implicit as well. To avoid Scala 2 mess, it is important to allow this solely for method-local parameters and nothing more.

void f(implicit int x);

void g() {
    implicit int x = 123;
    f();
}

Unresolved questions

Design meetings

dmchurch commented 3 months ago

@MadsTorgersen I think you're still the champion for this - do you or @radrow have any thoughts on my formulation above?

radrow commented 3 months ago

That's elaborate! You brought some very good points here.

Touching on the operators, while the modulo arithmetic example sounds a bit superfluous (I think I would rather have a struct for ints in the "Modulo Land"), I completely agree on the == operator. At the moment, in many cases it seems almost an anti-pattern to ever use == on strings instead of the formal String.Equals(String, String, StringComparison), for instance. Moreover, here we already have some sort of implicit parametrization in the form of Thread.CurrentThread.CurrentCulture.

One difference I see between steering == on strings and the base of modulo arithmetic for ints is that the latter belong to different algebraic structures. They are sort of different animals. By that, I enjoy restricting the freedom of conversions between modulo-ints of different bases.

int hour = 7;

implicit (int hoursPerDay = localPlanet.HoursPerDay)
{
    hour += 21;
}

implicit (int hoursPerDay = localPlanet.Moons()[0].HoursPerDay)
{
    // I don't quite like this. `hour` used to "live" in localPlanet,
    // but now it was dragged "to the moon"
    hour += 30;
}

However, strings are just UTF-16 arrays regardless of which strategy is taken for comparison. Because of that, implicit parameterization applies "correctly" as a solution to your second example, as it is the operation that is tweaked, not the domain. At least, this is how I view it.

void SignUp(String name)
{
    implicit(StringComparison comparisonType = CurrentCultureIgnoreCase)
    {
        if(name == "Radek")
        {
            throw new WeDontLikeRadekException();
        }
    }

    implicit(StringComparison comparisonType = Ordinal)
    {
        foreach (var other in database.names)
        {
            if(name == other) return;
        }
        database.insert(name);
    }
}
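For comparison, here is roughly how the same logic reads in today's C#, where the comparison strategy has to be spelled out at every call. This is a sketch: `SignUpToday` and the in-memory `Names` list standing in for `database.names` are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

class WeDontLikeRadekException : Exception { }

static class SignUpToday
{
    // Stand-in for the `database.names` collection (an assumption).
    public static readonly List<string> Names = new List<string>();

    public static void SignUp(string name)
    {
        // Culture-aware, case-insensitive check, named explicitly at the call.
        if (string.Equals(name, "Radek", StringComparison.CurrentCultureIgnoreCase))
        {
            throw new WeDontLikeRadekException();
        }

        // Exact ordinal match when looking for an existing entry.
        foreach (var other in Names)
        {
            if (string.Equals(name, other, StringComparison.Ordinal)) return;
        }
        Names.Add(name);
    }
}
```

The `StringComparison` argument here plays the role that the implicit `comparisonType` value plays in the proposed syntax; the `==` operator on strings has no such parameter and always compares ordinally.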

Regarding your mention of checked (which I see for the very first time), it looks like precisely an instance of the implicit parametrization I am proposing here. The only difference I can immediately see (apart from it being a hardcoded case) is that you can't declare a method to inherit it from the caller's context. The closest (but still distant) thing you can do is to set it globally in the compiler. Additionally, I don't think operators should be considered special in any way here: why would x + x be allowed to benefit from implicit contexts, while a custom .factorial() method would not? To me, the solution I propose is paradoxically more explicit than that, because it enforces declaration of the implicit context at the method's definition, whereas in the case of checked it is only mentioned in the language specification (if you have the right C# version, of course). Thank you for bringing this up, I didn't know about it.
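The point that a method cannot inherit checked from its caller can be seen in existing C#. The sketch below assumes default compiler settings (overflow checking off unless requested); `CheckedScopeDemo` and its method names are made up for the example.

```csharp
using System;

static class CheckedScopeDemo
{
    // Written outside any checked block, so with default compiler settings
    // this addition silently wraps around on overflow.
    public static int AddOne(int x) => x + 1;

    // The addition itself is written inside the checked block, so it throws
    // OverflowException when x is int.MaxValue.
    public static int AddOneInline(int x)
    {
        checked
        {
            return x + 1;
        }
    }

    // Only the call is inside the checked block. The checked context is
    // lexical, so it does not flow into the body of AddOne.
    public static int AddOneViaCall(int x)
    {
        checked
        {
            return AddOne(x);
        }
    }
}
```

With `int.MaxValue` as input, `AddOneInline` throws while `AddOneViaCall` returns `int.MinValue`, because the callee was compiled in its own context. An implicit parameter, by contrast, would be declared by the callee and supplied from the caller's scope.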


I think the syntax you've shown is very clear. Since we have one-line using statements which span their scope to the end of the outer scope, I would also consider doing that with implicit too:

implicit int hoursPerDay = localPlanet.HoursPerDay;
return currentHour + hoursToWind;
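For reference, this is how the existing one-line using declaration (C# 8 and later) scopes its variable, which is the behaviour a one-line implicit declaration would presumably mirror. It is a small sketch; the `Scope` class and the log it writes to are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable that records when it is entered and left.
sealed class Scope : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public Scope(string name, List<string> log)
    {
        _name = name;
        _log = log;
        _log.Add($"enter {_name}");
    }

    public void Dispose() => _log.Add($"exit {_name}");
}

static class UsingDeclarationDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        {
            // One-line using declaration: no parentheses, no nested block.
            using var scope = new Scope("block", log);
            log.Add("work");
        } // `scope` is disposed here, at the end of the enclosing block.
        log.Add("after");
        return log;
    }
}
```

`Run()` records "enter block", "work", "exit block", "after", in that order: the declaration stays in effect until the end of the enclosing block and no further.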

Additionally, it may make sense to allow combining using with implicit, just like you can do await using:

using implicit (var context = new Context()) { ... }

// or maybe?

using (implicit var context = new Context()) { ... }

I like your proposal for the resolution. The description seems a bit convoluted as a wall of text, but it feels very intuitive and simple after understanding. I appreciate that you considered ambiguity and mentioned treatment of unused arguments. I do not see any immediate flaws in what you presented.

Regarding forwarding implicitness of implicit parameters, I think it might not be necessary if you use the single-line declarations I mentioned above. At least for the start it would not hurt to write

public Hour Add(int hoursToAdd, implicit int hoursPerDay, implicit bool isChecked = false, implicit bool notifyWatchers = false)
{
    implicit var hoursPerDay = hoursPerDay;
    implicit var isChecked = isChecked;
    implicit var notifyWatchers = notifyWatchers;

    ...
}

or maybe even

public Hour Add(int hoursToAdd, implicit int hoursPerDay, implicit bool isChecked = false, implicit bool notifyWatchers = false)
{
    implicit hoursPerDay;
    implicit isChecked;
    implicit notifyWatchers;

    ...
}

Although, the where clause might be a better place for that.

I have no strong opinions on the ref and in parameters, but I see no reasons against them. I find your example with success quite neat; it somewhat resembles the use of typical state-transforming monads. I don't think there is much danger associated with it: in the end, you can always get reference semantics by manually boxing any data.

dmchurch commented 3 months ago

Thanks for your feedback! Yes, I agree about this being equally useful for methods and for operators; while most of my post talks about operators, I consider the use cases very close to identical. Also, I agree that the first example is a bit contrived - normally I'd like to store modulo bases with the number they're attached to and prevent accidental conversion between bases, too. (That said, I can still see use cases for the implicit-base pattern, like if you're using an "alarm clock" asset made by someone else in your game and it can only store an int for the hour field.)

You're right that, given the one-line using syntax, we probably ought to allow a one-line implicit syntax; I'm not sure how I feel about using that syntax myself (I like having the visual indication of having the extra indentation), but I agree it's better not to violate expectations, and for all I know, once the feature lands I might start writing one-line implicit statements all over the place 😅

Especially since, yes, it makes perfect sense to combine implicit with using or await using! The use cases abound. I'd be inclined to put the implicit keyword outside the parentheses with the using keyword, preferably after: using implicit or await using implicit. I don't have any opinion on whether the reversed syntax implicit using should be a syntax error or just an alternate (perhaps discouraged) syntax.

As for your last example about forwarding implicit state, my gut reaction was to say "I already specified that" in the "Syntax 2a" example, but that was before I realized that (a) that's for the multi-line form of the statement, and (b) from the pattern established by using, the parentheses should not be used for the single-line form. I'd suggest the following, inspired by the fact that a single var keyword is allowed to substitute for multiple types in its var (a, b, c) = (false, 1, "two"); deconstruction form:

public Hour Add(int hoursToAdd, implicit int hoursPerDay, implicit bool isChecked = false, implicit bool notifyWatchers = false)
{
    implicit var hoursPerDay, isChecked, notifyWatchers;

    ...
}

And, while thinking about it, I'd personally be inclined to allow a programmer to use deconstruction form to declare implicit arguments of multiple types, in both the one-line and multi-line syntaxes:

implicit var (hoursPerDay, isChecked, notifyWatchers) = (24, true, false);

// OR

implicit (var (hoursPerDay, isChecked, notifyWatchers) = (24, true, false))
{
    // ...
}

I'd only allow the var form of deconstruction declaration, though, since otherwise the following:

implicit (int hoursPerDay

is ambiguous. Is that the start of a multi-line declaration of implicit ints, or is it the start of a single-line implicit declaration for a deconstruction, where the first deconstructed element is an int? By requiring use of the var deconstruction declaration rather than the explicitly-typed deconstruction declaration, the syntax becomes unambiguous.

Of course, this syntax wouldn't be compatible with the using or await using keywords... unless the C# team decided to use this opportunity to allow deconstruction syntax with using 😄