civicrm / cv

CiviCRM CLI Utility

cv with drupal7 multisite #54

Open pcurrier opened 4 years ago

pcurrier commented 4 years ago

I've been playing around with cv in a drupal7 multisite setup, and I've run into two related problems:

1) The only bootstrap levels that work are classloader/settings; booting drupal fails (at the HTTP redirect in includes/install.inc:install_goto()). It works if I set $_SERVER['HTTP_HOST'] = 'mysite.org' in Bootstrap.php and CmsBootstrap.php, before they simulate the web environment. (Is there a way to detect this kind of redirect failure? Right now it fails silently, which is not ideal.)

2) cv's search for civicrm.settings.php does not find anything. I'd expect cv to find the settings file if I first cd into [cmsroot]/sites/mysite (or one of its children), similar to how drush works. But findDrupalDirs() tries to guess folder names under sites/ based on $options['httpHost'] (pulled from $_SERVER['HTTP_HOST']), which is empty. So it ends up checking just [cmsroot]/sites/ and [cmsroot]/sites/default.
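To illustrate the lookup described in item 2, here is a minimal sketch of Drupal 7-style candidate generation. The function name `candidateSiteDirs()` is hypothetical (it is not cv's actual `findDrupalDirs()`), and the sketch leaves out URL path segments and file-existence checks.

```php
<?php
/**
 * Hypothetical helper: list the directories under sites/ that a Drupal 7-style
 * lookup would try for a given host, most specific first.
 */
function candidateSiteDirs(string $httpHost): array {
  // A port is moved to the front: "example.org:8080" becomes ['8080', 'example', 'org'].
  $server = explode('.', implode('.', array_reverse(explode(':', rtrim($httpHost, '.')))));
  $dirs = [];
  // Try the full name first, then progressively shorter suffixes.
  for ($j = count($server); $j > 0; $j--) {
    $dirs[] = 'sites/' . implode('.', array_slice($server, -$j));
  }
  // Fall back to the default site.
  $dirs[] = 'sites/default';
  return $dirs;
}

// sites/foo.example.org, sites/example.org, sites/org, sites/default
print_r(candidateSiteDirs('foo.example.org'));

// With an empty host, only sites/ and sites/default remain.
print_r(candidateSiteDirs(''));
```

The second call reproduces the symptom above: with no host available on the command line, the only candidates left are sites/ and sites/default.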

So it seems like cv needs a "--hostname" option that would be used to set $_SERVER['HTTP_HOST']. This would be analogous to drush's --uri option, and would address both problems.

I also wonder if the settings file search procedure should be changed to make findCivicrmSettingsPhp() also check all ancestor directories (starting with cwd) for the settings file. This will help with drupal's sites.php aliasing, where the site directory name and host name might be different.
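The ancestor-directory search could look roughly like the sketch below. It assumes POSIX-style paths, and `findInAncestors()` is a hypothetical name, not an existing cv function.

```php
<?php
/**
 * Hypothetical helper: walk from $startDir up to the filesystem root and
 * return the first path where $fileName exists, or NULL if none is found.
 */
function findInAncestors(string $startDir, string $fileName): ?string {
  $dir = rtrim($startDir, '/');
  while (TRUE) {
    $candidate = $dir . '/' . $fileName;
    if (file_exists($candidate)) {
      return $candidate;
    }
    $parent = dirname($dir);
    if ($parent === $dir) {
      // dirname() no longer changes the path, so we have reached the top.
      return NULL;
    }
    $dir = $parent;
  }
}

// Demo in a throwaway directory tree: the settings file lives in sites/mysite,
// and the search starts from sites/mysite/files.
$root = sys_get_temp_dir() . '/cv-demo-' . uniqid();
mkdir("$root/sites/mysite/files", 0777, TRUE);
touch("$root/sites/mysite/civicrm.settings.php");

$found = findInAncestors("$root/sites/mysite/files", 'civicrm.settings.php');
echo $found, "\n";
```

Because the search depends only on the working directory, it would not care whether the site directory name matches the host name.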

Does something along these lines seem reasonable? You can see a preliminary version of it here: https://github.com/pcurrier/cv/commits/master

totten commented 4 years ago

Yeah, cv does need better support for multisite!

For a bit of background - there are two boot protocols in cv: the Civi-first protocol (Bootstrap.php) and the CMS-first protocol (CmsBootstrap.php).

What I like about CIVICRM_BOOT is the adaptability...

So... I've been angling to recommend CIVICRM_BOOT as the main knob for fine-tuning the bootstrap. Naturally, that hasn't happened... cv still defaults to Bootstrap.php because I've been scared of breaking things and prefer an opt-in. I think cv is gonna have both protocols for the foreseeable future.

> I also wonder if the settings file search procedure should be changed to make findCivicrmSettingsPhp() also check all ancestor directories (starting with cwd) for the settings file. This will help with drupal's sites.php aliasing, where the site directory name and host name might be different.

Heh, I hadn't read this issue and was playing with your PR and tried e0639c849090d46c64270623ad72aad2aae88c13. Great minds think alike? Or is that a different spin on a similar idea? It allows:

$ cd /var/www/sites/example.com/files
$ cv url civicrm/admin
"http://example.com/civicrm/admin"

I have to admit... the sites.php/aliasing stuff scares me. 🙃 Maybe it's just because I've never used them. Or maybe it's a nagging sensation of non-determinism. It might help if you could explain this a bit more. (Or post code... whatever's easiest.)

Tangentially: Do you think it's possible to take a CWD like /var/www/sites/example.com/files and startup Drupal... without starting up Civi (e.g. allowing CmsBootstrap.php to autodetect the subsite)?

pcurrier commented 4 years ago

> Heh, I hadn't read this issue and was playing with your PR and tried e0639c8. Great minds think alike? Or is that a different spin on a similar idea?

Yeah, originally I was testing something similar to this (searching ancestor directories for the settings file). But in the time between opening issue #54 and submitting PR #57, I changed direction and decided to teach findDrupalDirs() about sites.php (this is what you see in 7f013457d40ae4e8f32c1180ff5b9773cbc26a1a). Originally I wanted to avoid any unnecessary duplication of drupal logic in cv... but findDrupalDirs() is already copied directly from the D7 function for finding settings.php, so that didn't seem like much of an objection. But really I think either approach is fine.

> I have to admit... the sites.php/aliasing stuff scares me. 🙃 Maybe it's just because I've never used them. Or maybe it's a nagging sensation of non-determinism. It might help if you could explain this a bit more. (Or post code... whatever's easiest.)

There's really not much to it. sites.php is a mapping of site URLs to directory names. By default, if I have the site "foo.example.org" then drupal will expect to find the files for that site in the "sites/foo.example.org" directory. But if I want to name the directory "sites/myFolder", then in sites.php I would put:

$sites = array (
  'foo.example.org' => 'myFolder',
);

That's it -- it contains the $sites array, nothing else. It just allows site admins more control over how the site is structured. So technically, the current code in findDrupalDirs() is incomplete if it doesn't account for sites.php (in fact, it's likely that whoever wrote findDrupalDirs stripped out the sites.php parts when they pulled the logic from the drupal function).
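As a rough sketch of the alias step only: given a lookup key derived from the URL, the `$sites` array either maps it to a folder or leaves it unchanged. `resolveSiteDir()` is a hypothetical name; the sketch skips the check for whether the aliased directory actually exists.

```php
<?php
/**
 * Hypothetical helper: translate a lookup key into a directory name under
 * sites/, using the $sites array from sites.php when it has an entry.
 */
function resolveSiteDir(array $sites, string $key): string {
  // No alias means the directory is named after the key itself.
  return $sites[$key] ?? $key;
}

$sites = array(
  'foo.example.org' => 'myFolder',
);

echo resolveSiteDir($sites, 'foo.example.org'), "\n"; // myFolder
echo resolveSiteDir($sites, 'bar.example.org'), "\n"; // bar.example.org
```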

> Tangentially: Do you think it's possible to take a CWD like /var/www/sites/example.com/files and startup Drupal... without starting up Civi (e.g. allowing CmsBootstrap.php to autodetect the subsite)?

Yes, I think it should be possible, and in fact I was initially looking into doing something like that -- if CWD is /var/www/html/sites/example.com/files, then cv should be able to find the settings file without any hints from --hostname etc. What stopped me IIRC was that it seemed like there would be some sequence-of-events issues -- i.e. making sure that you've always figured out the CMS info before you look for civicrm.settings.php. Given that the CMS and Civi can currently boot in either order, this looked like it might require more elaborate surgery in cv's boot process, probably better done by someone who's not a cv neophyte like me. So I punted on that approach. I think this is exactly what you're getting at when you mention always booting the CMS first, and other chicken-or-the-egg issues? Perhaps I should revisit this -- do you think it would be simpler than I'm picturing?

On a somewhat related note, I've recently been testing the upgrade:get and upgrade:dl commands (after uncommenting them), and they work as expected, but they do need some minor tweaks for multisite. Basically, the --hostname param (and probably others like --level I think?) need to be passed down to sub-commands (e.g. when upgrade:dl calls vars:show), so that drupal can boot. Would you like to address this as part of this PR, or handle it when the upgrade commands eventually get enabled? I have some test code here: c4c0e54c14909216de7dd1e992a29d2001649b35 (might have merge conflicts with your multisite branch but I can rebase if it would help).

totten commented 4 years ago

> On a somewhat related note, I've recently been testing the upgrade:get and upgrade:dl commands (after uncommenting them), and they work as expected, but they do need some minor tweaks for multisite. Basically, the --hostname param (and probably others like --level I think?) need to be passed down to sub-commands (e.g. when upgrade:dl calls vars:show), so that drupal can boot.

I suspect it would be cleanest to switch those commands from --level=full to --level=cms-only or --level=cms-full and then do any fine-tuning of bootstrap via env-var (CIVICRM_BOOT). The env-vars are propagated to subcommands automatically.
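The propagation is ordinary process-environment inheritance, which the sketch below demonstrates with a child PHP process standing in for a cv subcommand. The value `Auto://.` is only a placeholder (an assumption about the format); check cv's documentation for the values it accepts.

```php
<?php
// Set the variable in the current process. The value is a placeholder.
putenv('CIVICRM_BOOT=Auto://.');

// Launch a child PHP process (a stand-in for a subcommand) and have it
// report what it sees in its own environment.
$cmd = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg("echo getenv('CIVICRM_BOOT');");
$childValue = shell_exec($cmd);

echo $childValue, "\n";
```

The child was never passed the value explicitly, yet it should report the same setting as the parent. That is why tuning the bootstrap through an env-var avoids threading extra options through each subcommand call.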

> ... Given that the CMS and Civi can currently boot in either order, this looked like it might require more elaborate surgery in cv's boot process, ...

Yeah, that is a curveball. It helps me to think of those as two entirely separate protocols and deal with them at different times. So do some work on the Civi-first protocol (Bootstrap.php); do a mental reset; and then separately work on the CMS-first protocol (CmsBootstrap.php).

> But if I want to name the directory "sites/myFolder", then in sites.php I would put:
>
> $sites = array (
>   'foo.example.org' => 'myFolder',
> );

OK, seeing it in that format makes it easier to recognize the ambiguity, e.g. this looks to be valid:

$sites = array (
  'foo.example.org' => 'myFolder',
  'bar.example.org' => 'myFolder',
);

If you're in the folder /var/www/sites/myFolder/files and you inspect the $sites array, then you could guess that foo.example.org or bar.example.org would be legit values of HTTP_HOST (but you wouldn't know which is preferred). That's probably OK -- provided that both values are actually legal HTTP_HOST values.
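A minimal sketch of that reverse lookup, assuming the CMS root is already known. Both function names are hypothetical, and nothing here decides which of the returned hosts is preferred.

```php
<?php
/**
 * Hypothetical helper: given the CMS root and a working directory somewhere
 * below sites/, return the name of the site folder (or NULL).
 */
function siteFolderFromCwd(string $cmsRoot, string $cwd): ?string {
  $prefix = rtrim($cmsRoot, '/') . '/sites/';
  if (strpos($cwd, $prefix) !== 0) {
    return NULL;
  }
  // The first path segment after sites/ is the site folder.
  $parts = explode('/', (string) substr($cwd, strlen($prefix)));
  return $parts[0] !== '' ? $parts[0] : NULL;
}

/**
 * Hypothetical helper: list every key in $sites that points at $folder.
 */
function hostsForFolder(array $sites, string $folder): array {
  return array_keys($sites, $folder, TRUE);
}

$sites = array(
  'foo.example.org' => 'myFolder',
  'bar.example.org' => 'myFolder',
);

$folder = siteFolderFromCwd('/var/www', '/var/www/sites/myFolder/files');
print_r(hostsForFolder($sites, $folder)); // foo.example.org, bar.example.org
```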

The rub is that these rules are complicated. The path ./sites/123.a.b.c (or alias $sites['123.a.b.c']) could correspond to http://a.b.c:123 or http://123.a.b.c or http://123.a.b/c or (I think) http://z.y.a.b.c:123. I see how to infer some URI that will provoke Drupal into loading the intended settings.php, but that doesn't mean we'll get the preferred HTTP_HOST or even a valid one. (It's nice if URLs generated in CLI tasks are valid...)
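To make the ambiguity concrete, the sketch below computes only the most specific lookup key for a URL (port, then host, then path segments, joined by dots). It is a simplification: Drupal derives the path part from the script path and also tries shorter variants, neither of which is modelled here. `siteKeyForUrl()` is a hypothetical name.

```php
<?php
/**
 * Hypothetical helper: build the most specific sites/ key for a URL,
 * in the form [port.]host[.path.segments].
 */
function siteKeyForUrl(string $url): string {
  $p = parse_url($url);
  $key = $p['host'] ?? '';
  if (isset($p['port'])) {
    // The port goes in front of the host name.
    $key = $p['port'] . '.' . $key;
  }
  $path = trim($p['path'] ?? '', '/');
  if ($path !== '') {
    // Path segments are appended, with "/" replaced by ".".
    $key .= '.' . str_replace('/', '.', $path);
  }
  return $key;
}

// Three different URLs collapse to the same key, "123.a.b.c".
foreach (['http://a.b.c:123', 'http://123.a.b.c', 'http://123.a.b/c'] as $url) {
  echo $url, ' => ', siteKeyForUrl($url), "\n";
}
```

Going from URL to key is a many-to-one mapping, so the directory name alone cannot tell us which URL was intended.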

> So I punted on that approach. I think this is exactly what you're getting at when you mention always booting the CMS first, and other chicken-or-the-egg issues? Perhaps I should revisit this -- do you think it would be simpler than I'm picturing?

I think there are two basic options here: