backdrop / backdrop-issues

Issue tracker for Backdrop core.
144 stars 39 forks source link

[DX] module_invoke_all() can have surprising results thanks to array_merge_recursive() #6317

Open klonos opened 9 months ago

klonos commented 9 months ago

This came up in https://github.com/backdrop/backdrop-issues/issues/6121#issuecomment-1836998285 and it turns out to be a known issue: https://www.drupal.org/project/drupal/issues/2865599

Background

array_merge_recursive() is a strange function. See e.g. #791860: array_merge_recursive() is never what we want in Drupal: add a drupal_array_merge_recursive() function instead. For Drupal 8, we already replaced many (or all?) instances of array_merge_recursive() with NestedArray::mergeDeep(). This did include ModuleHandler::invokeAll(). See #1356170: Remove all uses of array_merge_recursive, or document why they are being used instead of NestedArray::mergeDeep().

These changes are too big to backport them to Drupal 7 all at once. Instead, individual parts of this change should be discussed in separate distinct issues for Drupal 7. Each will have its own consequences and tradeoffs, and each should have its own patches and tests.

This one deals with module_invoke_all(), which currently uses array_merge_recursive().

Problem

Consider a site with modules 'aa' and 'bb'. Both of them implement hook_permission(), like so:

function aa_permission() {
  return array(
    'open the door' =>  array(
      'title' => 'Open the door to the garage.',
    ),
  );
}

function bb_permission() {
  return array(
    'open the door' =>  array(
      'title' => 'Open the door to the barn.',
    ),
  );
}

Expected behavior:

The result of module_invoke_all('permission'); for these two modules should be:

array (
  'open the door' => array (
    // Some kind of priority rule to decide which one to pick.
    'title' => 'Open the door to the barn.',
  ),
)

Actual behavior:

The result of module_invoke_all('permission'); for these two modules will be:

array (
  'open the door' => array (
    'title' => array (
      0 => 'Open the door to the garage.',
      1 => 'Open the door to the barn.',
    ),
  ),
)

One example of this bug is documented here, #2852809: Facetapi module now implement hook_i18n_string_info Bugs of this type can manifest as "function xyz requires a string as first argument, array given", and most users will have no idea why this is happening.

One could argue that developers are not supposed to use the same key twice. But there needs to be a robust behavior and possibly error reporting for this case.

Additional thoughts

module_invoke_all(), like ModuleHandler::invokeAll(), is implemented as a one-size-fits-all function. In many cases, such as hook_menu() or hook_permission(), the recursive merging of return values is not needed, or rather detrimental than useful. And some calls to module_invoke_all() do not need a return value at all.

The recursive merging is needed for cases like hook_token_info(), and maybe some others.

One can ask why all of these cases need to use the same function.

Possible solutions

(At the time of writing this, I am not advocating one specific solution, I am just putting them as options on the table)

  • drupal_array_merge_deep() In Drupal 8 ModuleHandler::invokeAll(), the array_merge_recursive() was replaced with NestedArray::mergeDeep(). The equivalent thing in Drupal 7 would be drupal_array_merge_deep().

    Possible problems: This changes the behavior of module_invoke_all(), which might make it a BC break. This needs some discussion or investigation.

  • Error reporting Another option could be to not change the behavior at all, but detect if there is a clash and report it to the watchdog. E.g. compare the result of the old module_invoke_all() with a new version that uses drupal_array_merge_deep(), and log a warning if the results are different. I do not think this is a good idea at all.

  • Ignore the problem This is the easiest "solution". Just say it's the developer's fault who used the same key twice.

  • Separate dedicated functions Alternative versions of module_invoke_all() could be provided that treat the return value differently. This "solution" would require that calling code actually uses these alternative functions. It would be a long time until this would fix any problem. If we were to follow this idea, it should rather be in a separate issue, with D8 first. I am still mentioning it here for the sake of completeness.

klonos commented 9 months ago

OK, I have taken a stub at this: https://github.com/backdrop/backdrop/pull/4590

What this PR does:

  1. It introduces a new module_invoke_all_merge_deep() function. Initially, I have duplicated the code of module_invoke_all() in it, but the only difference was that it was calling backdrop_array_merge_deep() instead of array_merge_recursive(). So I then converted it into a wrapper function for module_invoke_all() instead.
  2. Then in module_invoke_all(), I needed a way to detect whether the actual module_invoke_all() was called directly, or via module_invoke_all_merge_deep(). Since there is no in-built way in PHP to get the name of the caller function, I initially resorted to using debug_backtrace(), but after reading various sources on the internet, it turns out that using that function is not a good idea (performance issues - see here and here for example). Since module_invoke_all() accepts a "static" parameter followed by an unlimited number of dynamic parameters, we can't add a "flag" parameter to it unless it is the first parameter (and then unset() it, the same way we are currently unset()ing the $hook parameter). So I got that workaround in place, and it seems to work nicely.
  3. The above allows other code to decide whether it is appropriate for it to invoke hooks either directly via module_invoke_all() (backwards compatibility retained), or via its alternative module_invoke_all_merge_deep().
  4. It also introduces a helper _module_invoke_all_detect_duplicates() function, which simply logs a notice in the event log, to notify about any occurrence of duplicate results returned by the same hook implementation in various modules. It also states that the last module/hook (currently alphabetically) "wins" (but I have plans for that once we have #6241, so I've added a @todo there for that).
  5. It has a few dummy hook_requirements() implementations in 3 different modules. I am planning to move those into a test, but for now they live in the actual modules in order to provide something testable in the PR sandbox.
  6. Also has plenty of inline comments and docblock additions/edits to explain what we are doing and why.

To test in the PR sandbox:

Thoughts please? (especially @argiepiano and @bugfolder since you have both indicated that you are either affected by this problem in one of your contrib modules, or have otherwise encountered it elsewhere in the past)

argiepiano commented 9 months ago

@klonos, deep rabbit hole indeed! Thanks for all the work and thinking. While I haven't looked at this in depth yet, reading the description of what you are doing, the whole thing seems rather complex (and potentially prone to unexpected regressions). I'm wondering if there are easier, more localized ways to fix the problem that originated this work. For example:

So, in short this feels like using a "sledgehammer to crack a nut".

bugfolder commented 9 months ago

create an alter hook for hook_requirements_alter()

I favor this as the solution for dueling hook_requirements() implementations.

For the specific case that arose in #6121, of multiple modules needing to complain if cURL wasn't present, it's less than ideal if we're asking those modules to explain why they need cURL in their 'description'; they'll all have to "play nicely" by appending their error message to any existing 'description'. But that can be made to work; and using an alter hook is (relatively) simple and understandable.

However, the ability to detect duplicate hook_requirements() entries looks like, in and of itself, a very useful thing, and could save developers some head-scratching.