joomla / joomla-cms

Home of the Joomla! Content Management System
https://www.joomla.org
GNU General Public License v2.0
4.73k stars 3.64k forks source link

[4.2] Booting a component in CLI application breaks CLI #38518

Closed nikosdion closed 2 years ago

nikosdion commented 2 years ago

Tagging @roland-d @fancyFranci @wilsonge @HLeithner @laoneo

Steps to reproduce the issue

Expected result

You get a list of CLI commands in both tests.

Actual result

Test 1 works.

Test 2 returns an error: Could not parse the requested URI http:///var/www/test/cli/joomla.php

System information (as much as possible)

I have reproduced it under PHP 7.4, 8.0 and 8.1. It really doesn't matter. Only requirement is Joomla 4.2 AND using a full path to the joomla.php file.

Additional comments

This is a RELEASE BLOCKER issue: it breaks the CLI commands which were working perfectly fine in Joomla 4.0 and 4.1.

Considering that the only way to set up CRON jobs on most hosts is to use the full path to joomla.php there is no way for users to work around that.

Booting the component is supposed to be the only recommended way for developers to access the MVC objects of their components from any application: site, administrator, api and cli. This is a release blocker because it breaks the only recommended way to access your component under the CLI application.

The problem is that MVCFactory tries to register the SiteRouter service even under CLI. However, when we are under CLI we do not have a URL to the site.

The SiteRouter service tries to instantiate the SiteApplication which eventually runs the \Joomla\CMS\Application\WebApplication::loadSystemUris() method with no arguments.

This calls the \Joomla\Application\AbstractWebApplication::detectRequestUri() method which ends up using http:// plus the contents of the $_SERVER['REQUEST_URI'], the latter containing the first argument to the immediate right of the PHP binary in the command line.

If you are running the CLI application with a relative filename (joomla.php) this becomes http://joomla.php which, though wrong and completely bogus, is actually a valid domain name. When you are running the CLI application using a full filesystem path this becomes http:///var/www/test/joomla.php. Note the triple slash. This triple slash makes the whole thing an invalid URI and we get an exception from Joomla\CMS\Uri\Uri.

There are THREE solutions to that:

  1. Bad: make the SiteRouter service late-bound. By that I mean that the Joomla\CMS\Extension\Service\Provider\MVCFactory will NOT set the SiteRouter service to the MVCFactory but instead pass the container so that getSiteRouter can create the service when needed. This gives us developers some leeway to set up the correct URI (it's possible with some Reflection magic).
  2. Ugly: Make the site's normative URL ($live_site) required. This is all sorts of wrong as it makes multi-site all but impossible. Unless, of course, we create a new --site-url CLI application option to let the user decide WHICH multisite they wanna use.
  3. Uglier: Create a new \Joomla\CMS\MVC\Factory\CliMVCFactory which does not need the SiteRouter service. This means that you can't generate SEF URLs from the CLI which is even worse as we can now no longer send emails from the CLI, making it completely useless.

In my opinion the second solution might be less disruptive, especially if an empty live_site in configuration.php is auto-fixed (if the configuration.php file is writeable) the first time a Super User visits the backend of their site.

Alternatively, we could undo whatever change in 4.2 caused that problem but for the life of me I cannot find out what that change is!

HLeithner commented 2 years ago

I think forcing --site-url is the best forward compatible way, because if we get Multi-Site/Multi-Domain in the future we need this anyway because the cli doesn't know which site context should be used.

  1. is more magic and in the end you also have to say which site-url is the current one, not sure if it would be possible to override the --site-url parameter in a system plugin early enough.
  2. even if it's ugly I don't see an better fail safe way (except setting the live_site what I would really like to avoid)
  3. unable to create a sefurl (or better any url because you still don't have the domain) is a no go for me (thinking about generating email where you need the hostname/url)

but for 4.2 we need a less problematic way. finding the pr that makes the troubles should be the first way.

Related pr https://github.com/joomla/joomla-cms/pull/38107 (not the cause because not merged^^)

nikosdion commented 2 years ago

@HLeithner We can't JUST add --live-site in CLI because this breaks all existing CLI commands, including those which do not care about having the site's URL (e.g. taking a backup, setting the site on/off-line etc).

Here is an even better idea, now that I have debugged this six ways to Sunday and have actually found a viable workaround.

We need a TWO-PRONGED approach.

STEP 1: Add the --live-site CLI parameter as I said before. What it does is set the uri.request application parameter AND override the $_SERVER['HTTP_HOST'] superglobal array key. I know, shocking, but it works.

STEP 2: Update \Joomla\Application\AbstractWebApplication::detectRequestUri(). When it calls Uri we can add a try-catch around it. If it throws we don't go all Rocky 4 on it and end with “if he dies, he dies”. No. We catch the RuntimeException and use the FAKE base URI http://example.com as our base URI. This is enough to let the CLI application run AND it allows 3PDs like yours truly to override it again at a later point, in our console command, to create SEF URLs for use in emails or whatever we please.

I can whip up a PR to do that tonight, if you agree with my approach.

brianteeman commented 2 years ago

am i being stupid here but as this worked correctly before 4.2 surely the first thing to do is to find what changed and broke this and just revert it.

nikosdion commented 2 years ago

For reference, my current workaround code:

class CliAntiBorkage
{
    /**
     * Unbork the routing under the CLI application.
     *
     * @see   https://github.com/joomla/joomla-cms/issues/38518
     *
     * @param   string                        $component
     * @param   string                        $configKey
     * @param   CMSApplicationInterface|null  $app
     *
     * @throws \Exception
     * @since  7.2.0
     */
    public static function unborkRouting(string $component, string $configKey, ?CMSApplicationInterface $app = null)
    {
        // Make sure we have an application object
        $app = $app ?? Factory::getApplication();

        // Am I under the CLI application?
        if (!$app->isClient('cli'))
        {
            return;
        }

        // Does the CLI application break ignominiously when I try to get the current URL?
        try
        {
            Uri::getInstance();

            return;
        }
        catch (\Exception $e)
        {
            // Yup, I expected this to happen.
        }

        // Do we have a non-empty live_site URL which doesn't break the application?
        try
        {
            $siteURL = $app->get('live_site', '') ?: null;
            $uri     = Uri::getInstance($siteURL);
        }
        catch (\Exception $e)
        {
            $siteURL = null;
        }

        // If there was no live_site URL let's try to get a better one.
        if (empty($siteURL))
        {
            $siteURL = ComponentHelper::isEnabled($component)
                ? ComponentHelper::getParams($component)->get($configKey, 'https://www.example.com')
                : 'https://www.example.com';
        }

        // Override the superglobal variables to "trick" Joomla into compliance.
        $uri                    = Uri::getInstance($siteURL);
        $_SERVER['HTTP_HOST']   = $uri->toString(['host', 'port']);
        $_SERVER['REQUEST_URI'] = $uri->getPath();
    }

}

I use this in my service providers like so:

        $app = Factory::getApplication();

        if ($app->isClient('cli'))
        {
            if (!class_exists(CliAntiBorkage::class))
            {
                return;
            }

            CliAntiBorkage::unborkRouting('com_admintools', 'siteurl', $app);
        }

This is fully tested now, it fixed the previously experienced borkage. If you are wondering what I do with the ComponentHelper, when you access the Control Panel page of Admin Tools I store the frontend base URL in the siteurl component parameter. I was using that for years to be able to send emails from CLI, e.g. when you run a site scan for changed or suspicious files. Obviously the core solution would not be using the ComponentHelper, it would use the uri.request configuration variable of the global application.

HLeithner commented 2 years ago

@HLeithner We can't JUST add --live-site in CLI because this breaks all existing CLI commands, including those which do not care about having the site's URL (e.g. taking a backup, setting the site on/off-line etc).

As I mentioned too we can't require this in a minor

but for 4.2 we need a less problematic way

Here is an even better idea, now that I have debugged this six ways to Sunday and have actually found a viable workaround.

We need a TWO-PRONGED approach.

STEP 1: Add the --live-site CLI parameter as I said before. What it does is set the uri.request application parameter AND override the $_SERVER['HTTP_HOST'] superglobal array key. I know, shocking, but it works.

STEP 2: Update \Joomla\Application\AbstractWebApplication::detectRequestUri(). When it calls Uri we can add a try-catch around it. If it throws we don't go all Rocky 4 on it and end with “if he dies, he dies”. No. We catch the RuntimeException and use the FAKE base URI http://example.com as our base URI. This is enough to let the CLI application run AND it allows 3PDs like yours truly to override it again at a later point, in our console command, to create SEF URLs for use in emails or whatever we please.

I can whip up a PR to do that tonight, if you agree with my approach.

Not sure why we need to fall back? I mean can't we simply set the HTTP_HOST unconditional to example.com if no live_site and no --live-site parameter is set?

nikosdion commented 2 years ago

@brianteeman I have followed the code and nothing changed in the CMS. This means that something changed in the Joomla Framework. I also studied that code for over 3 hours before filing this issue. The code is NOT wrong. Whatever we were doing before must've been a bug which happily resulted in masking the major issue of not having thought about how we can pass the site's URL to the CLI application.

In short, it appears that we had a small bug canceling out a major bug. The small bug was fixed, the major bug was not. I reckon fixing the remaining major bug is a better approach to re-introducing a smaller bug, not just because it's the right way to develop software but also because we will mathematically end up in the same situation again in the future.

HLeithner commented 2 years ago

am i being stupid here but as this worked correctly before 4.2 surely the first thing to do is to find what changed and broke this and just revert it.

of course that should be the first thing to find why it breaks now, but doesn't solve the problem that you can't emails for example

nikosdion commented 2 years ago

@HLeithner

As I mentioned too we can't require this in a minor

I am with you, mate! I do NOT want to make $live_site a requirement. It's a bad idea to use it for many reasons. That's why I am interested in the better solution :)

Not sure why we need to fall back? I mean can't we simply set the HTTP_HOST unconditional to example.com if no live_site and no --live-site parameter is set?

Basically, yes, that's the idea, see https://github.com/joomla/joomla-cms/issues/38518#issuecomment-1219716101

The CliApplication would run (the evolution of) that method, passing it the value of the --site-url option. If it's empty, it will instead use http://www.example.com. This would be sufficient for WebApplication to not explode on our face.

I hope that this clarifies what I was aiming for. Sorry, I was trying to write something coherent with a VERY excited 4 ½ year old running around the house demanding that she assists mommy cooking dinner.

HLeithner commented 2 years ago

I think (even if setting super global is bad) the only practical way, especially because the framework uses (of course) the super global to get the hostname (should be fixed to but can't think of a way in the moment).

So please go forward and propose a pr.

nikosdion commented 2 years ago

Thank you! Will do right after dinner.

nikosdion commented 2 years ago

I have a pertinent question. The necessary code needs to go into \Joomla\Console\Application::execute which is part of the Joomla Framework package joomla/console. Should I do the PR in the Console repo and after it is merged have one of you maintainers publish a new version and update the CMS dependencies?

HLeithner commented 2 years ago

hmm tbh I didn't looked at the cli architecture in deep but I'm wondering how the framework package could read joomla configuration. Also not sure when the webapplication get's loaded because we only load the console application?!

HLeithner commented 2 years ago

OK seems you don't need the framework console application because we load libraries/src/Application/ConsoleApplication.php as console application

nikosdion commented 2 years ago

OK, thank you. I'll get right to it.

roland-d commented 2 years ago

Alternatively, we could undo whatever change in 4.2 caused that problem but for the life of me I cannot find out what that change is!

I have this issue even in a 4.1.3 site where I wrote a bash script to run the Joomla CLI due to this error. So I don't think it is a 4.2.0 specific issue but older than that.

nikosdion commented 2 years ago

@roland-d I have a very large installed base of CLI users — between Akeeba Backup, Admin Tools and Ticket System it's a few tens of thousands of sites. We didn't have that problem before.

HOWEVER there is some nuance to this statement. Very early in the development of our Joomla 4 versions we had observed that every time we tried to use Joomla's Route::_() static method we'd get a crash. I developed a workaround. This worked fine before 4.2.0.

In 4.2.0 the crash happens when the MVCFactory service is instantiated for a component. This happens far earlier in the execution process, for some plugins as early as its service provider creating the plugin extension object. Therefore the previous workaround which ran right before a command would no longer work. This is what changed between 4.0/4.1 and 4.2.

nikosdion commented 2 years ago

The Pull Request has landed https://github.com/joomla/joomla-cms/pull/38524 Please test. Thank you!

zero-24 commented 2 years ago

Closing here thanks Nicholas

nibra commented 2 years ago

I have a pertinent question. The necessary code needs to go into \Joomla\Console\Application::execute which is part of the Joomla Framework package joomla/console. Should I do the PR in the Console repo and after it is merged have one of you maintainers publish a new version and update the CMS dependencies?

Yes, please!

nikosdion commented 2 years ago

@nibra I'll open an issue on the console repo because there are a few questions about how that would work.

wilsonge commented 2 years ago

So I still hate changing superglobals so raised a tweaked (although architecturally bigger change) PR in #38544

Also worth noting as it hasn't been so far for the record that we have an undocumented configuration.php variable called site_uri https://github.com/joomla-framework/application/blob/2.0.2/src/AbstractWebApplication.php#L962 that if set also 'fixes' this. Yeah I didn't know either until I went digging through allllll the code last night 🤷‍♂️ also I don't know why it's a different name to live_site and not aliased before you ask :)

nikosdion commented 2 years ago

@wilsonge I also did not like changing superglobals and was thinking about something along the lines of what you did. However, there were a few more things to keep in mind:

Meanwhile, the quick'n'dirty solution was merged and it does indeed fix the problem with CLI application in 4.2.1-rc2 — I have real world sites and clients to that effect.

We are one day before 4.2.1 stable. I would very strongly suggest keeping in Joomla what we KNOW AND HAVE TESTED works and merge your PR in 4.3-dev, after having tested the living crap out of it.

If we make one more release breaking CLI I can guarantee you that I am going to stop using the Joomla CLI Application and go back to the previous situation of one script per task. If Joomla removes the deprecated CliApplication I will bloody include something like that in my own code or just reinvent the wheel. The thing is, whenever you are breaking CLI you are breaking the selling point of my software (automation). If I have to choose between loosing clients and not using Joomla I will do the latter. When I do the latter people take notice and follow my example even though (or, rather, especially because) they don't know why I did what I did. So, let's not possibly cock up CLI for a second release in a row.

nikosdion commented 2 years ago

@wilsonge I actually see how your PR would break CLI again. I will comment on that on your PR.

vijaykhollam commented 1 year ago

Looks like it is not resolved yet. I am facing the same error when executing CLI in Joomla 4.3.2.

Error : "Could not parse the requested URI http:///var/www/projectname/cli/test.php" i.e php cli/test.php