hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler

[SUGGESTION] Make receiving and returning functions from a function easier and safer. #1285

Open feature-engineer opened 3 days ago

feature-engineer commented 3 days ago

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code? Yes - It will eliminate the security vulnerabilities that arise from defining the arguments incorrectly via the cpp1 syntax - the same things that are eliminated by using in, out, inout, copy, forward and move in regular cpp2 function definitions instead of using the cpp1 syntax.

Will your feature suggestion automate or eliminate X% of current C++ guidance literature? It will make it unnecessary to learn about std::function (or at least unnecessary to learn about the old syntax for function signature, if keeping std::function around is desirable for some reason)

Describe alternatives you've considered. I've recently written a functional (i.e. a function that receives functions and possibly data, and returns a composition of those functions and data).

The current syntax, as far as I'm aware, is quite confusing: both the arguments and the return value have to be spelled as a single type, e.g. std::function<return_val(arg1_type, arg2_type)>. This is inconsistent with the cpp2 syntax and not intuitive, because it requires knowing both std::function and the (hopefully soon to be deprecated) cpp1 function signature syntax.

There are three alternatives I thought of. I'll give an example for a functional that receives two functions: the first receives an int and returns an int, and the second receives an int and returns a string. The functional returns a function that receives an int, forwards it to the first function, and applies the second function to the result:

  1. Using function type the same way we define a function type in cpp2:
    f: (g: (int)->int, h: (int)->std::string)->(int)->std::string =  
    :(v)->(int)->std::string = h$(g$(v));
  2. Using parentheses to make it a bit easier to parse:
    f: (g: ((int)->int), h: ((int)->std::string))->((int)->std::string) = 
    :(v)->((int)->std::string) = h$(g$(v));

    or

    f: (g: <(int)->int>, h: <(int)->std::string>)-><(int)->std::string> = 
    :(v)-><(int)->std::string> = h$(g$(v));
  3. Using std::function, but with the cpp2 syntax:
    f: (g: std::function<(int)->int>, h: std::function<(int)->std::string>)->std::function<(int)->std::string> = 
    return :(v)->std::function<(int)->std::string> = h$(g$(v));

The current syntax is quite confusing for people who aren't familiar with cpp1, and would unnecessarily require them to learn the old syntax along with the new:

f: (g: std::function<int(int)>, h: std::function<std::string(int)>)->std::function<std::string(int)> =  
return :(v)->std::function<std::string(int)> = h$(g$(v));
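For comparison, here is a minimal Cpp1 (C++17) sketch of the same functional, written with std::function. The name compose is illustrative and does not appear in the issue; the sketch only shows what the cpp2 examples above are meant to express.

```cpp
#include <functional>
#include <string>
#include <utility>

// Cpp1 counterpart of the functional `f` from the issue: returns "h after g".
// The name `compose` is an assumption made for this sketch.
std::function<std::string(int)> compose(std::function<int(int)> g,
                                        std::function<std::string(int)> h) {
    // Capture both callables by value so the result stays valid after compose returns.
    return [g = std::move(g), h = std::move(h)](int v) { return h(g(v)); };
}
```

Calling compose with a doubling lambda and a lambda that wraps std::to_string yields a callable that maps 21 to the string "42".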

This is even more necessary if you consider the benefits of having in, out, inout, copy, forward and move available in the signatures of the functions that we receive and return (omitted in the examples above for brevity, since in is the default) - which is one of the main motivators for cpp2 in the first place.

DyXel commented 1 day ago

Ok, so I was playing around a bit and figured we are quite close to something like what is proposed here in Cpp2 already, take a look at this example:

namespace cpp2::impl {
template<typename T, typename Signature>
concept function_like = std::is_convertible_v<T, std::function<Signature>>;
}

signature: type == (_:int) -> int;

f: (g: _ is cpp2::impl::function_like<signature>, h: int) -> int = g(h);

main: () = std::cout << f(:(x: int)->int = x*2, 2) << '\n';

Here we use the C++ concept cpp2::impl::function_like to match functions with the given signature without depending on a specific type of closure (it can be a lambda, std::function, etc.). We can then match that in our higher-order function. For a real implementation I would expect something more elaborate that doesn't depend on std::function.

From that, some syntactic sugar could be added so that this compiles (it would lower to the concept above): f: (g: _ is (_:int) -> int, h: int) -> int = g(h);

I see two issues with this:

@filipsajdak do you think it would be possible for is to match something akin to the concept above? e.g.: f is (_:int) -> int. It might be a useful feature in general.

filipsajdak commented 1 day ago

I think so. Give me a second to check it.

filipsajdak commented 1 day ago

OK, I have checked that — a prototype solution: https://godbolt.org/z/Yj34afb3M (with no bad implicit casts, e.g., double to int).

That requires two changes:

  1. parsing of is - currently x is (something) treats something as a value, but in this case it needs to parse the whole signature of the callable,
  2. after the signature is parsed, it needs to be rewritten to the concept function_like<Callable, ReturnType, Args...>. That means from:
    f: (g: _ is (_:int) -> int, h: int) -> int;

    To:

    auto f(auto g, int h) -> int
    requires function_like<CPP2_TYPEOF(g), int, int>
    ;

Side note. According to the standard, the above could be rewritten to:

auto f(function_like<int, int> auto g, int h) -> int;

Unfortunately, some compilers are not good at parsing these.

DyXel commented 1 day ago

Nice! Looks like the way to go to me. Thanks for the investigation Filip!

filipsajdak commented 1 day ago

My solution does not handle generic callables (e.g., generic lambdas or generic functions). I will also consider solving these cases.

filipsajdak commented 1 day ago

I made a change that also accepts generic callable: https://godbolt.org/z/v7vraxPfh

Unfortunately, when at least one argument is generic, we lose control of the implicit cast of other arguments.

filipsajdak commented 16 hours ago

I corrected the prototype: https://godbolt.org/z/nrcGqvoKG

Also, if we have all signatures parsed in cpp2 we can add additional checks for defined types - to avoid implicit casts.

E.g.:

fun([](auto a, brace_initializable_to<int> auto b) { // this blocks implicit cast of second argument
    return "<" + std::to_string(a) + ", " + std::to_string(b) + ">";
});