erlang / otp

Erlang/OTP
http://erlang.org
Apache License 2.0
11.15k stars 2.92k forks source link

Add is_external_fun/1 and is_local_fun/1 guards #8554

Open starbelly opened 3 weeks ago

starbelly commented 3 weeks ago

Is your feature request related to a problem? Please describe.

No problems.

Describe the solution you'd like

The code exists in beam utilities to discern whether a fun is an external fun ref or a local fun. It would be quite nice to have these as guards. Said guards would mostly be convenience since we can get this information from erlang:fun_info/1 today. Though, there may be performance benefits to having a simple guard vs what erlang:fun_info/1 returns.

Describe alternatives you've considered

Simply use erlang:fun_info/1 in the body of a function required to check whether a function is external or local.

Additional context

In some situations one must validate that function is not a local function in order to ensure this is not slung around between nodes, as this can be a problem in a mixed cluster with different versions of OTP. In my case, we need to ensure callback functions in configuration are only ever external funs for this reason.

Additionally, I'd be happy to do the PR, but I figured an issue regarding the feature was appropriate first.

okeuday commented 3 weeks ago

Creating only the guard function is_function_external/1 would avoid using "local" for use that doesn't relate to Distributed Erlang nodes.

I still have hope that Erlang/OTP might fix the fact that node/0 should not be a guard function, due to it not being a referentially transparent function (creating an error in the draft language specification). If that language bug gets fixed, that would likely lead to guards like is_local_pid/1, is_local_port/1, is_local_reference/1, is_remote_pid/1, is_remote_port/1, is_remote_reference/1 (as described at https://github.com/erlang/otp/issues/5568).

starbelly commented 3 weeks ago

Additionally note, while I put is_external_fun/1 in the issue name, this may be too broad, my original thought was is_ext_fun_ref/1 it could also be is_external_fun_ref/1. Also, If we just had guard to to indicate whether a term was a ext fun ref or not, the need for is_local_fun/1 becomes less useful (at least I have no use case for this).

jhogberg commented 3 weeks ago

I agree that this would be useful. I'd like to change the names however (and not just here, though the ship might have sailed already) as "local fun" and "external fun"/"remote fun" might make people think about distribution, when it's more about how the fun relates to its module: the former being a fun tied to a specific module and the latter a fun tied to an MFA.

michalmuskala commented 3 weeks ago

Yes, I agree this would be very helpful to have, and I also agree that we could likely do better with the naming.

I also thing this should be coupled with having more places in the system use such funs instead of MFA tuples - with the recent changes they are smaller in memory, and also integrate far better with various analysis tools - typecheckers, dialyzer, static analysis, IDE tooling (go to definition navigation, etc), linting... Using actual closures though in some places is not desirable since they prevent code upgrades and cross-node evaluation to work transparently (unlike MFA tuples), so being able to easily verify and specify what type of function is desirable would be useful in those cases.

RaimoNiskanen commented 3 weeks ago

What about is_export_function/1,2 (arity 2 for the fun's arity, compare to is_function/1,2) since it relates to an module's export table and not the Erlang Distribution protocol.

There are, as said, important cases where you need the restriction that a fun() has to be an export entry fun, but rarely the opposite. Therefore I think that the module local variant isn't needed, since it in those rare cases be accomplished by combining is_function/1,2 and is_export_function/1,2.

okeuday commented 3 weeks ago

@RaimoNiskanen is_function_exported/1,2 seems more natural to me, but I also always try to use common prefixes to group ideas (when I am allowed to). Either way, it sounds like a good approach.

jhogberg commented 3 weeks ago

I don't think it's common here. A "local" fun is not tied to a particular instance of a node, and works just fine everywhere as long as the required code is available. "Local" refers to the fun's relation to a specific module, which is relevant in non-distributed scenarios (code upgrades, funs persisted to disk or similar).

michalmuskala commented 3 weeks ago

I'd float some ideas is_free_function or some sort of super-verbose is_safely_shareable_function; or opposite is_module_bound_function or just is_bound_function

RaimoNiskanen commented 3 weeks ago

Floating on...: is_function(F, export), is_function(F, export, Arity). Or promote fun_info/2 into a guard so you can write fun_info(F, type) =:= external.

starbelly commented 2 weeks ago

I like the idea of promoting fun_info/2 into a guard, as it covers cases in the future that might not be thought of today. I do rather like is_bound_function/1 more or something along those lines as it is succinct and can be documented simply (i.e., takes a function reference, returns true if it is bound to a module, otherwise false).

starbelly commented 2 weeks ago

Coming back to this, I think is_bound_function/1,2 doesn't tell the whole truth here, I believe we can technically say funs evaluated via erl_eval are bound to that module, as such is_export_function/1,2 or is_exported_function/1,2 makes more sense. Maybe there's more ideas...

michalmuskala commented 2 weeks ago

I was thinking about this a bit more. I think using the term "exported" will be problematic given the semantics of the already-existing erlang:function_exported/3. In particular, it verifies the function actually exists (and the module is loaded). In this case (I presume) we want a guard that succeeds even if the module is not loaded and doesn't actually check the function exists - only that it's the "external" format of the fun syntax.

starbelly commented 2 weeks ago

@michalmuskala I agree with that and your presumption at least as far as what I had in mind was definitely not to check if the code was loaded.

okeuday commented 2 weeks ago

I think using the term "exported" is valid because it is describing a future expectation that the function must be exported to be used, otherwise it will be considered undefined (i.e., an undefined function error occurs). The existence of the erlang:function_exported/3 does make the situation a little unusual if a guard function like is_function_exported/1,2 exists too, but the documentation can describe how the is_function_... guard function is only checking the function type and not whether the function's module is loaded.

starbelly commented 2 weeks ago

@okeuday I definitely don't think you are wrong here, especially coupled with what @RaimoNiskanen stated, that this is related to the export table on a module. I think part of it for me is what's easier on the eyes and would what result in less ambiguity, the head of clause should be enough to disambiguate.

What about a happy medium between promoting fun_info/2 to a guard and is_exported_fun/1 (or some variation of), such is is_fun_type/2 ? I'd rather have a single arity guard, but that would be nicer than having to do a abs equal comparison in the case fun_info/2 was promoted to a guard.

okeuday commented 2 weeks ago

@starbelly People haven't liked fun_info/1 in the past. The documentation says it is only meant for debugging and the function returns more than you normally need, in proplist format due to its age while most people would likely prefer a map now. Either way, the return value of fun_info/1 isn't friendly for guard expressions. The fun_info/1,2 type value as local or external isn't common and isn't represented in a type (it shouldn't be used more than it is meant to be used, for debugging). That makes fun_info/2 a bad choice as a guard, partially due to its association to fun_info/1.

That is why I saw this as a new guard function to associate with the existing is_function/1,2 guard functions, except providing true for a subset of the types that are true for is_function/1,2. That would make it a variation on the is_function/1,2 guard functions, so to communicate that variation I think it is best to use a suffix describing the variation. The simplest name using common terminology seems to be is_function_exported/1,2 because -export([...]). is always present in a module (is_function_export/1,2 is possibly not treating export as a verb and it is easier to understand export in the past tense, in the is_function_... use).

Using suffixes to describe variation is a little odd for people that want to write function names as English sentences because using suffixes for the variation should be equivalent to a subject-object-verb word order. Using subject-object-verb word order is most common among natural languages though, so it should be how most people naturally think.

starbelly commented 1 week ago

@okeuday I don't disagree with what you've stated, but I'm still wondering if there is a better name that doesn't "run together" with what exists already. Needs a little more thought, I do believe, this would be something that sticks for a long time after all 😄

starbelly commented 1 week ago

Another idea is_named_function/1

okeuday commented 1 week ago

@starbelly A named function could be a local function, so the "named" word doesn't help to distinguish between exported functions and local functions.

starbelly commented 1 week ago

@starbelly A named function could be a local function, so the "named" word doesn't help to distinguish between exported functions and local functions.

Yes, I thought about this after I shared the idea, and ambiguity therein. So, I think we need to be clear about the "export" part, at least I can not think of something that is quite ambiguous, sans what @michalmuskala suggested around is_safe_and_shareable_will_not_blow_up_in_your_face_as_long_as_the_module_and_function_is_there_regardless_of_the_version/1

I'd like to hear from OTP team at this point 😄

okeuday commented 1 week ago

@starbelly Using is_function_atomic or is_function_updatable would avoid using exported, but the function names seem a bit less intuitive.

starbelly commented 1 week ago

I think export / exported is fine, and it makes sense as that's precisely what we're talking about. I think ideally we'd have a name that conveys something more, but thus far I don't think anyone has come up with something that is non-ambiguous and sends a signal to the user.