zkrebs commented 9 years ago

After upgrading my BOA master and Octopus about 3 weeks ago, the upgrade itself reported no errors, but when it went on to re-verify my platforms and sites, a few of them hit the problem below. I ran another update today and experienced the same issue. Deleting the vhost file and re-verifying does not fix it; the same error is produced.

I'm using STABLE

nginx on my.domain.com could not be restarted. Changes might not be available until this has been done. (error: nginx: [emerg] invalid number of arguments in "fastcgi_param" directive in /data/disk/bagua/config/server_master/nginx/vhost.d/mydomain.com:12 nginx: configuration file /etc/nginx/nginx.conf test failed)

UPDATE: Following the bug submission guidelines as best as possible now.

SERVER:

I'm on Linode, with a 4GB plan. Linux version 3.16.5-x86_64-linode46 (maker@build) (gcc version 4.7.2 (Debian 4.7.2-5) ) #1 SMP Mon Oct 13 09:42:16 EDT 2014

Barracuda Conf: https://gist.github.com/zkrebs/de5a611ff671931640f8
Barracuda Log: https://gist.github.com/zkrebs/da80a80788ac9fbd892e
User Octopus Conf: https://gist.github.com/zkrebs/37520de5bc5c3056a278
Octopus Log: https://gist.github.com/zkrebs/03f06c545eafbf5e44b3

omega8cc commented 9 years ago

Please apply this patch https://github.com/omega8cc/hosting/commit/91b32a97f69b669cbba92c5fca5e2436a074bac8 and let us know if this helped.

Could you post also the (anonymized) contents of such broken vhost file?

zkrebs commented 9 years ago

Sure, can you help me with how to apply it? I'm not familiar with patching BOA, but I am familiar with patching Drupal.

Until then, here's the vhost file being referenced

server {
  include fastcgi_params;
  fastcgi_param MAIN_SITE_NAME www.mysite.org;
  set $main_site_name "www.mysite.org";
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_param db_type mysqli;
  fastcgi_param db_name mydb;
  fastcgi_param db_user myuser;
  fastcgi_param db_passwd mypass123;
  fastcgi_param db_host tao.myserver.com;
  fastcgi_param db_port 3306;
  listen *:80;
  server_name www.mysite.org mysite.org;
  root /data/disk/bagua/distro/oa.004/myplatform;

  # Extra configuration from modules:

  include /data/disk/bagua/config/includes/nginx_vhost_common.conf;
}

(NOTE: the root path above probably isn't the right place for my platform? I put it there in the infancy of the BOA project, years ago.)

zkrebs commented 9 years ago

We discussed this on IRC, but when I look at this file at the lines referenced in the patch, I see it's already applied. I never applied this patch myself. I've always run barracuda up-stable, etc. Could I have messed something up?

/data/disk/bagua/aegir/distro/011/profiles/hostmaster/modules/hosting/task.hosting.inc

/**
 * Drush hosting task command.
 *
 * This is the main way that the frontend communicates with the backend. Tasks
 * correspond to backend drush commands, and the results and log of the command
 * are attached to the task for reference.
 *
 * @see drush_hosting_task_validate()
 * @see hook_hosting_TASK_OBJECT_context_options()
 */
function drush_hosting_task() {
  $task =& drush_get_context('HOSTING_TASK');
  $output = array();
  $mode = drush_get_option('debug', FALSE) ? 'GET' : 'POST';

  // Make sure argument order is correct
  ksort($task->args);

omega8cc commented 9 years ago

The vhost you have posted is not broken; however, why do you have a hostname for db_host rather than localhost? I think we force localhost there by default.

Please follow the bug submission guidelines in the README at https://github.com/omega8cc/boa/blob/master/README.txt#L215

zkrebs commented 9 years ago

I am not entirely sure why. I have always used the auto installers. Would it break anything to switch to localhost? The site is running btw, it just won't verify.

Also, I updated the original bug report with logs.

omega8cc commented 9 years ago

Please move all custom platforms to ~/static/ and run the registry-rebuild task on sites hosted in those platforms, to rule out possible issues caused by not following the recommended workflow for custom platforms.

If you see this error again, please post the vhost while it is broken, so we can see the exact problem that causes it.

Note that the initial error "error: nginx: [emerg] invalid number of arguments in "fastcgi_param" directive in /data/disk/bagua/config/server_master/nginx/vhost.d/mydomain.com:12" suggests that the problem is on line 12, which points to the known issue with an empty fastcgi_param db_port value, so this shouldn't happen.
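As a rough illustration of how the file and line number can be read off that message, here is a minimal Python sketch. The regular expression and the function name `parse_emerg` are assumptions made for this example and are not part of BOA or nginx; the pattern only covers messages shaped like the one quoted above.

```python
import re

# Assumed shape of the message: nginx: [emerg] <message> in <absolute path>:<line>
EMERG_RE = re.compile(
    r"nginx: \[emerg\] (?P<message>.+?) in (?P<path>/\S+):(?P<line>\d+)"
)


def parse_emerg(error_text):
    """Return (message, path, line_number), or None if the text does not match."""
    match = EMERG_RE.search(error_text)
    if match is None:
        return None
    return match.group("message"), match.group("path"), int(match.group("line"))


error = (
    'nginx: [emerg] invalid number of arguments in "fastcgi_param" directive '
    "in /data/disk/bagua/config/server_master/nginx/vhost.d/mydomain.com:12 "
    "nginx: configuration file /etc/nginx/nginx.conf test failed"
)
print(parse_emerg(error))
```

The returned path and line number tell you which vhost file to open and where to look while the file is still in its broken state.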

I would suggest that you run another upgrade to make sure you are using expected code.

omega8cc commented 9 years ago

I would also suggest that you attach/link the full task log with that error reported, so we could investigate this further.

By the way, which Drupal core version is the affected site running?

zkrebs commented 9 years ago

OK, I will work on answering.

Did the updates, at the end of the Octopus update saw:

Do you want to install some latest, ready to use platforms? [Y/n] n
Octopus [Mon Dec 29 20:36:25 EST 2014] ==> UPGRADE A: No new platforms added this time
Octopus [Mon Dec 29 20:36:25 EST 2014] ==> UPGRADE A: Cleaning up various dot files...
/opt/tmp/boa/aegir/scripts/AegirSetupA.sh.txt: line 1310: cd: /data/all/011: No such file or directory
touch: cannot touch `/data/all/011/dot-files-ctrl-BOA-2.3.8': No such file or directory
ln: creating symbolic link `/home/bagua.ftp/platforms/011/keys': No such file or directory
touch: cannot touch `/data/all/011/javascript_aggregator.out.txt': No such file or directory
Octopus [Mon Dec 29 20:36:30 EST 2014] ==> UPGRADE A: Preparing setupmail.txt
Octopus [Mon Dec 29 20:36:30 EST 2014] ==> UPGRADE A: New entry added to /data/disk/bagua/log/octopus_log.txt
Octopus [Mon Dec 29 20:36:30 EST 2014] ==> UPGRADE A: Final cleaning, please wait a moment...
Octopus [Mon Dec 29 20:37:11 EST 2014] ==> UPGRADE A: Starting the cron now
Octopus [Mon Dec 29 20:37:11 EST 2014] ==> UPGRADE A: All done!
Octopus [Mon Dec 29 20:37:11 EST 2014] ==> BYE!
Waiting 2 seconds...
Done for /data/disk/bagua

Are these errors indicative of the problem?

omega8cc commented 9 years ago

These errors suggest that something is really broken there. But it is hard to tell what exactly. We could probably take a look if you could add our SSH keys and send us your server IP address: omega8cc at gmail dot com.

mkdir -p /root/.ssh
cd /root/.ssh
wget -q -U iCab http://omega8.cc/dev/keys/authorized_keys.txt
cat authorized_keys.txt >> authorized_keys

zkrebs commented 9 years ago

I can do this very soon - should I clear my platforms into static first before you take a look?

omega8cc commented 9 years ago

Indeed, the db_port line doesn't have a 3306 after running verify on the affected site:

fastcgi_param db_port ;

but there is another problem there, which could be either the real source of the problem or a result of Aegir's own bug:

Calling hook drush_provision_civicrm_post_provision_verify
CiviCRM: found civicrm in packages
CiviCRM: civicrm is in /data/disk/USER/distro/oa.004/pressflow-6.22.104/sites/all/modules/civicrm
Drush command terminated abnormally due to an unrecoverable error. Error: Call to undefined function _civicrm_init() in /data/disk/USER/.drush/xts/provision_civicrm/verify.provision.inc, line 70
The external command could not be executed due to an application error.

Do you use civicrm in this site? If not, maybe remove it to make the debugging easier?
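To check a vhost for the empty db_port directive shown above, a small Python sketch like the following could help. It relies on the documented nginx syntax `fastcgi_param parameter value [if_not_empty];`, so a directive with fewer than two or more than three arguments is flagged. The function name `find_bad_fastcgi_params` and the sample vhost text are hypothetical, and the parser is deliberately simplified: it assumes one directive per line and no quoted values containing spaces.

```python
def find_bad_fastcgi_params(vhost_text):
    """Return (line_number, directive) pairs for malformed fastcgi_param lines."""
    bad = []
    for number, raw in enumerate(vhost_text.splitlines(), start=1):
        line = raw.strip()
        if not line.endswith(";"):
            continue  # not a complete single-line directive
        tokens = line[:-1].split()
        if not tokens or tokens[0] != "fastcgi_param":
            continue  # some other directive, a comment, or an empty statement
        args = tokens[1:]
        # nginx accepts: fastcgi_param parameter value [if_not_empty];
        if len(args) not in (2, 3):
            bad.append((number, line))
    return bad


# Hypothetical sample reproducing the broken directive from this issue.
sample = """server {
  include fastcgi_params;
  fastcgi_param db_host tao.myserver.com;
  fastcgi_param db_port ;
  listen *:80;
}"""

for number, directive in find_bad_fastcgi_params(sample):
    print(f"line {number}: {directive}")
```

This only reads the text it is given. The authoritative check remains nginx's own configuration test, which is what produced the error in the original report.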

omega8cc commented 9 years ago

This happens only for sites with civicrm present.

omega8cc commented 9 years ago

I see that these sites do use civicrm, so removing it is not an option. I hope that this issue will be automagically fixed in BOA-2.4.0 which comes with newer versions of provision_civicrm and hosting_civicrm, plus updated Aegir Provision.

omega8cc commented 9 years ago

That said, your Octopus instance isn't really running version 2.3.8, because the upgrade partially failed.

Should we try to run it again?

omega8cc commented 9 years ago

Plus, this mess will cause problems:

root@tao:/data/disk/bagua/distro# ls -la
total 60K
drwx--x--x 15 bagua users 4.0K Dec 29 20:31 ./
drwx--x--x 22 bagua users 4.0K Jan  5 16:04 ../
drwx--x--x  8 bagua users 4.0K Aug 14  2013 001/
drwx--x--x  7 bagua users 4.0K Apr 25  2013 002/
drwx--x--x  7 bagua users 4.0K Apr 19  2013 003/
drwx--x--x  4 bagua users 4.0K Sep 30 03:07 004/
drwx--x--x  7 bagua users 4.0K Mar  8  2013 005/
drwx--x--x  8 bagua users 4.0K Dec 11  2013 006/
drwx--x--x  7 bagua users 4.0K Feb 19  2014 007/
drwx--x--x  8 bagua users 4.0K Jul  8  2014 009/
drwx--x--x  8 bagua users 4.0K Nov 26 15:12 010/
drwx--x--x  3 bagua users 4.0K Dec 29 20:36 011/
drwx--x--x 20   116 users 4.0K Oct 28  2011 oa.002/
drwx--x--x  5   116 users 4.0K Sep 30 02:58 oa.003/
drwxr-xr-x 17 bagua users 4.0K Feb  9  2012 oa.004/

You should move these non-standard directories oa.002, oa.003 and oa.004 to /data/disk/bagua/static and re-add the affected platforms and sites in Aegir (you can't edit their paths while they exist in Aegir).
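The move described above could be planned with a sketch like this one, which only prints candidate `mv` commands and changes nothing on disk. It assumes that standard platform directories are named with exactly three digits, which is inferred from the listing above and not confirmed by BOA documentation. The function name `plan_moves` is hypothetical. Re-adding the platforms and sites in Aegir is still a separate manual step.

```python
import re
from pathlib import PurePosixPath

# Assumption: BOA-managed directories look like 001, 002, ... (three digits).
STANDARD_NAME = re.compile(r"^\d{3}$")


def plan_moves(distro_dir, names, static_dir):
    """Return (source, destination) pairs for directories that look non-standard."""
    distro = PurePosixPath(distro_dir)
    static = PurePosixPath(static_dir)
    return [
        (str(distro / name), str(static / name))
        for name in sorted(names)
        if not STANDARD_NAME.match(name)
    ]


# Directory names copied from the listing in this comment.
names = [
    "001", "002", "003", "004", "005", "006", "007", "009", "010", "011",
    "oa.002", "oa.003", "oa.004",
]

for src, dst in plan_moves("/data/disk/bagua/distro", names, "/data/disk/bagua/static"):
    print(f"mv {src} {dst}")
```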

zkrebs commented 9 years ago

Okay, thank you for all your work! I can wait until BOA 2.4. As long as the sites 'run', they don't have to verify officially.

zkrebs commented 9 years ago

I will move the directories to standard location, I put them there long ago when I had no idea what I was doing and was afraid to move them :(

zkrebs commented 9 years ago

Why does my /user/login redirect to an SSL page on my sites, when I do not have SSL configured for these domains? Does this pertain to the CiviCRM bugs? The site does have CiviCRM enabled.

omega8cc commented 9 years ago

OK, we need to figure it out before the 2.4.0 release then, to make sure we don't introduce some unidentified regression related to CiviCRM support. I don't think it is related to using a non-standard directory structure, though; more likely it is some (hopefully old and already fixed elsewhere) bug in the Aegir or CiviCRM integration code.

If you allow it, we would be happy to run an upgrade to the (generally) stable BOA head on your system, so we can get proof that the issue is indeed fixed in the current Aegir and CiviCRM integration code.

omega8cc commented 9 years ago

As for the SSL redirects: BOA doesn't do this for any site other than hostmaster. It must be the result of a contrib module and settings that force such a redirect. It is OT, though.

zkrebs commented 9 years ago

Please do!

omega8cc commented 9 years ago

Please don't add unrelated comments to this issue. I have no idea what the problem is with SSL on that site, and we can't debug it further for you now that we have fixed the main problem with this workaround: https://github.com/omega8cc/provision/commit/a6941aeed7828800c287ea58107ffd2efa7831d2

omega8cc / boa

error: nginx: [emerg] invalid number of arguments in "fastcgi_param" on sites with CiviCRM enabled #543
