joomla / joomla-cms

Home of the Joomla! Content Management System
https://www.joomla.org
GNU General Public License v2.0
4.76k stars 3.64k forks source link

Upload and Update failures #38702

Closed brianteeman closed 1 year ago

brianteeman commented 2 years ago

Seeing a lot of failures from people using the upload and update feature so investigated further. The problem is that they are updating with the FULL package and not the UPDATE package.

Why does this happen?

image

There are TWO different messages here and the wording implies they are different methods

message 1

message 2

Part 1 - Full package

Steps to reproduce the issue

  1. Perform a clean Install Joomla 4.2.0
  2. download the FULL joomla 4.2.2 package
  3. in the joomla update component use "upload and update" with the FULL package from 4.2.2

Actual result

The update is not complete - despite saying it is

For example image image

Part 2 - Update package

Steps to reproduce the issue

  1. Perform a clean Install Joomla 4.2.0
  2. download the UPDATE joomla 4.2.2 package
  3. in the joomla update component use "upload and update" with the UPDATE package from 4.2.2

Actual result

The update is complete

For example image

image

Additional comments

The problem is much worse if you do the same as above but starting from 4.1.5 image image

Proposals

[ ] Update, clarify and simplify the text message [ ] Add a filter to the file upload box to check that it is an update [ ] Any other suggestions

brianteeman commented 2 years ago

(tagging @nikosdion @crystalenka @coolcat-creations )

coolcat-creations commented 2 years ago

Can we add a check in the FULL Packages to perform an update if it's an update? Why shouldn't it be possible to put a full package over an existing one? Maybe sometimes it would be even good to restore all core files?

I agree to change the text and simplify it.

brianteeman commented 2 years ago

Maybe sometimes it would be even good to restore all core files?

yes on the exact same version - not as an update

nikosdion commented 2 years ago

I can add some of the backend stuff.

Why are we restricting to zip format?

Because this is the only thing we can realistically support.

ZIP files are essentially a series of records. First a file descriptor record, then the file data (compressed or not). This means that we can stream their extraction over multiple page loads, the only thing we need to know is the name of the archive, the last file position we were in and whether we are in the middle of extracting a file (and if so, what is its name whether it's compressed). Since we can stream the extraction over multiple page loads we avoid PHP timeouts, server CPU usage limits etc which would make the update fail — something that Joomla 1.6.0 up to 2.5.0 failed to do.

tar.gz and tar.bz2 are compressed TAR archives. The way this works is by creating a TAR archive and then compressing the entire archive file. The only way to extract these archives is to load them entirely in memory, uncompress them in memory and start extracting them from memory, all in one page load. Not only this requires a PHP memory_limit well over 128MiB, it will also time out. We could theoretically overcome this IF AND ONLY IF we limited the installation and usage of Joomla to servers which have BOTH the Gzip AND BZip2 stream wrappers installed which means Joomla runs locally but not on a lot of live servers. There are also other caveats, like the fact that forwarding the file pointer in a compressed stream requires decompression to take place which for large archives like Joomla's update packages could mean that we run into a PHP timeout or a CPU usage limit anyway.

So, basically, we only support ZIP files because we can ONLY guarantee that updates will work on the vast majority of servers with this format. Trying to support anything else would fail far more frequently than it would work, leading to bricked sites. I am not going to contribute code which leads to bricked sites for reasons I am aware of and can't do anything about. I refuse.

Can we add a check in the FULL Packages to perform an update if it's an update?

No, we cannot. There is no “magic indicator” in the ZIP file and we cannot restrict updates based on the file name (it's both frustrating for legit users and provides zero protection against someone renaming the file).

The only possible technical solution would be to change the build.php script to always add a dummy file with a predictable name (e.g. administrator/components/com_admin/_isUpdate.xml) to the update package. If that was the case we could check if this is the very first file listed in the ZIP archive by looking for a hardcoded string in a hardcoded byte offset from the start of the file thanks to the wonderfully simplistic way ZIP files are structured. In this case we could place the “is this really an update?” check at the upload handling code in the Joomla Update model, long before we concern ourselves with extracting the archive.

Caveat: Implementing the hardcoded file check would make it impossible for people to create custom upgrade packages. Whether that's something that's being done and/or we want to support is another matter altogether and above my pay grade 😉

Why shouldn't it be possible to put a full package over an existing one?

Because the installation package contains the installation directory.

The presence of this directory causes every request to Joomla's index.php files to redirect to that directory. This means that the last two steps of the installation (post-update which updates the database and clean-up) which need to go through the com_joomlaupdate backend never get to run when you are using a full installation package.

Further thoughts

This is ultimately a user education issue, not really a code issue (see above on why checking if this is really an update package is currently not realistic and may have unintended consequences if implemented).

The messages DO need to change and I'd actually say that the messages for the component shouldn't have hardcoded URLs to begin with (see COM_JOOMLAUPDATE_NODOWNLOAD_EMPTYSTATE_CONTENT and COM_JOOMLAUPDATE_VIEW_DEFAULT_NO_DOWNLOAD_URL_DESC). What if Joomla restructures its sites and the download section becomes a section of the main .org site — as it should be — instead of yet another subdomain?

Moreover, the downloads site needs two or three changes:

  1. List the update packages in the /latest URL.
  2. Highlight (bold face) the ZIP packages for both installation and update packages.
  3. (maybe, ask @crystalenka) Add indicators about which files are NEW installation packages and which are UPDATE packages.
brianteeman commented 2 years ago

Thanks for the explanation about the zip restriction (I removed that part from the original post)

~For the filename I was thinking of having a hidden field with the name of the update file (must be able to do that programatically as its in the message link) and then validate that the filename of the user supplied zip matches. There might be better ways to validate the filename that I dont know. I really dont care about a user generating their own update packages and expecting to be able to use com_joomlaupdate with them. Would that even happen?~

agree 100% about the changes to the d/l site @HLeithner

nikosdion commented 2 years ago

Maybe sometimes it would be even good to restore all core files?

You can already do that by doing an upload and update with the full update package. It's nearly identical to the full installation package minus the installation folder.

Speaking of which...

It is totally possible to add a filter to the extraction code to skip over the entirety of the installation folder and its contents. This is trivial, we just need to change readFileData to detect the installation directory and its files, move the file pointer forward by $this->fileHeader->compressed bytes and set $this->runState = self::AK_STATE_NOFILE. Easy peasy!

Does @roland-d, @fancyFranci, @HLeithner or @laoneo know of any reason why simply skipping over the installation directory wouldn't work?

For the filename I was thinking of having a hidden field with the name of the update file (must be able to do that programatically as its in the message link) and then validate that the filename of the user supplied zip matches.

I can guarantee that someone will just rename the file and be surprised it doesn't work. Skipping over the installation folder sounds like a far more robust idea, as in: it wouldn't matter which package you upload.

brianteeman commented 2 years ago

skipping over the installation folder will almost certainly not work - i'm fairly confident that this is what happened automatically during my tests and left the site in a not 100% updated state.

but that did give me an idea. can you detect the presence of /instalaation in the zip right at the beginning and then bail out at that point saying you have the wrong file etc?

dgrammatiko commented 2 years ago

but that did give me an idea. can you detect the presence of /instalaation in the zip right at the beginning and then bail out at that point saying you have the wrong file etc?

Yes this is easily done with js, before the zip even reaches the server. I’m doing such things for some years now…

edit: all I’m saying is that you can read the zip when a user selects the file for upload, check if the installation folder exist and either remove it or bail out, both options are doable. The whole thing is just few lines of js on the upload ui…

HLeithner commented 2 years ago

Only removing the installation folder on update wouldn't work, for example also the images folder is not package in update files. Further it's possible that more files/folders don't need to be "patched" but initially shipped, patching the updater doesn't sounds good.

I would check for the correct filename on upload, if people thinks they have to rename it then they should rename it that the update can understand for example:

Joomla_4.2.2-MYCUSTOMBUILD-Stable-Update_Package.zip or something like that

dgrammatiko commented 2 years ago

@HLeithner actually you don’t have to rely on conventions, read the zip file, read the Joomla.xml or anything else that provides info about the installable and act on real data not on clumsy conventions

nikosdion commented 2 years ago

@brianteeman What you describe has two problems. One, we DEFINITELY do not skip over the installation folder; I wrote that code based on Kickstart and I have removed all traces of extraction filtering. Two, if you look at Joomla's build/build.php you will see that the only difference between the upgrade and the full package is the presence of administrator/manifests/packages/pkg_search.xml.

I would recommend ACTUAL tests.

@HLeithner

Only removing the installation folder on update wouldn't work, for example also the images folder is not package in update files.

This is not as important as ending up with a bricked site because the full installation package was extracted but the update finalisation did not work.

Further it's possible that more files/folders don't need to be "patched" but initially shipped, patching the updater doesn't sounds good.

That could be an issue BUT, again, it's not as important as a bricked site.

@brianteeman

can you detect the presence of /instalaation in the zip right at the beginning and then bail out at that point saying you have the wrong file etc?

While you can do that there are two problems:

  1. You already have created the restoration.php file so it needs to be cleared before returning with an error. Doable but not ideal as I explained in a previous PR.
  2. This will ONLY EVER WORK if the DIRECTORY entry for the installation directory is the VERY FIRST FILE RECORD in the ZIP package. Otherwise you have already extracted files and you have a half–upgraded site. Currently there is no guarantee for that.

@dgrammatiko

all I’m saying is that you can read the zip when a user selects the file for upload, check if the installation folder exist and either remove it or bail out, both options are doable.

This is not really possible in PHP without running the risk of running out of memory or a server timeout because of the size of the update package and the number of files it contains. Doing that requires reading the archive until you find the Central Directory record, then read the Central Directory Entry records one by one to determine if installation/index.php exists. You cannot just search for the CD signature, you actually have to read each File Header record and skip over as many files as its compressed size to find the next header file record OR go backwards and start reading from the end of the file until you find the CD record signature (most ZIP software, including PHP's ZipArchive does the latter).

Here's the problem with PHP.

PHP versions before 8.0 (IIRC) were reading the entire ZIP archive in memory to list its records. If you are on a server with a limited amount of PHP memory, e.g. 32MiB which otherwise runs Joomla 4 just fine, you'd end up with a memory outage, preventing you from updating Joomla with Upload & Update altogether.

Writing user land code to go backwards to find the ZIP CD record may take so much time on some slower servers which will lead to a PHP timeout record.

It would be FAR MORE EFFICIENT to have a constant length ZIP comment which is always placed at the end of the archive as per the spec and base our decision on the contents of the constant length comment. Even if the comment is variable length, seeking back from the end of the archive to find the ZIP Comment Record signature takes nanoseconds even on dead slow shared hosting (that's how old versions of Kickstart reported the size of the ZIP archive; I removed that UI and the backend code about a decade ago as it was not necessary in my use case).

I would check for the correct filename on upload

I really do not like this idea. We are relying on information which is under the complete control of the user to determine if we're gonna brick their site or not.

I would much rather use a short JSON document as a ZIP comment with a structure like this:

{
  "software": "joomla",
  "version": "4.3.0-dev",
  "type": "upgrade",
  "timestamp": 12345678,
}

where type would be one of full, upgrade, fullupgrade.

ZIP comments are ALWAYS put at the end of the archive as per the ZIP specification.

This means we can read the last 64KiB and start looking backwards for the EOCD signature (0x06054b50), skip 22 bytes (the length of the EOCD record) and parse the rest as JSON. If the rest is empty we consider this to not be an original Joomla package and refuse to install it. This check can be placed after the upload and before creating the restoration.php file so it also applies to the files retrieved over HTTPS, ostensibly from Joomla's update servers.

In the future this can be further expanded to include digital signatures or similar tamper–evident measures to prevent illegitimate downloads. I had proposed that in 2016, actually.

nikosdion commented 2 years ago

@dgrammatiko

read the zip file, read the Joomla.xml or anything else that provides info about the installable and act on real data not on clumsy conventions

Again, the problem is that we can NOT reliably do that without risking out of memory or timeout errors on shared hosting. If it was that easy we wouldn't need Joomla Update, we'd still be using com_installer to install core updates.

Of course everyone has thought of the “obvious” solution. However, the “obvious” solution has non–obvious problems, unless someone believes that a very large ZIP archive on a very slow shared host takes inconsequential memory and time to read, i.e. they have unrealistic expectations.

Reading the file directly works great if you have no CPU usage, memory or maximum execution time limits, or are on a very fast server, or talking about fairly small (under ~2 to 5MiB, depending on the server) ZIP packages.

Yeah, for updates that small or environments without limitations the obvious solution is the best one. For mass-distributed software like Joomla very likely to be found running on servers with less resources than a modern Internet connected coffee maker (I am not even joking!) we need to play it far more conservatively and wisely. When we didn't — between Joomla 1.6.0 and 2.5.0 — we had people complaining that the update never worked. That's why Joomla Update was added in an emergency release (2.5.1).

brianteeman commented 2 years ago

What you describe has two problems. One, we DEFINITELY do not skip over the installation folder; I

I did NOT say we did

nikosdion commented 2 years ago

@brianteeman Sorry, it read like your tests showed we did. Maybe I misunderstood you the way it was phrased.

brianteeman commented 2 years ago

Caveat: Implementing the hardcoded file check would make it impossible for people to create custom upgrade packages. Whether that's something that's being done and/or we want to support is another matter altogether and above my pay grade 😉

Sureely thats no different to checking the name of the zip file

nikosdion commented 2 years ago

Sureely thats no different to checking the name of the zip file

Find twenty random Joomla users. Ask them to perform two tasks:

  1. Rename a file
  2. Place a file with a specific filename as the first file of a ZIP archive

Tell me which of these tasks they find to be significantly easier than the other and which of these tasks most people fail to accomplish altogether.

brianteeman commented 2 years ago

Exactly - so asking someone who creates there own update file to give it a specific name is the easier solution

nikosdion commented 2 years ago

The idea was “if we don't want to support custom update files then...”.

My problem with relying on file names is that they are not fool–resistant in that they offer no guarantee against using an installation package as an update package by a sufficiently stubborn fool.

So, it depends on which problem you want to solve: a. accidental upload of the wrong package file; or b. the fact that trying to use the installation package will result in an unusable site?

The answer as to what you should do to solve each of these two problems is very different. Currently, everyone in here seems to think that both problems are identical but, in fact, they are not.

dgrammatiko commented 2 years ago

Again, the problem is that we can NOT reliably do that without risking out of memory or timeout errors on shared hosting. If it was that easy we wouldn't need Joomla Update, we'd still be using com_installer to install core updates.

Of course everyone has thought of the “obvious” solution. However, the “obvious” solution has non–obvious problems, unless someone believes that a very large ZIP archive on a very slow shared host takes inconsequential memory and time to read, i.e. they have unrealistic expectations.

Reading the file directly works great if you have no CPU usage, memory or maximum execution time limits, or are on a very fast server, or talking about fairly small (under ~2 to 5MiB, depending on the server) ZIP packages.

Yeah, for updates that small or environments without limitations the obvious solution is the best one. For mass-distributed software like Joomla very likely to be found running on servers with less resources than a modern Internet connected coffee maker (I am not even joking!) we need to play it far more conservatively and wisely. When we didn't — between Joomla 1.6.0 and 2.5.0 — we had people complaining that the update never worked. That's why Joomla Update was added in an emergency release (2.5.1).

@nikosdion did you actually READ what I wrote? When someone on the UI drops the file (uploads it) to the server the CLIENT SIDE ie the browser has a copy of the zip. Thus what I said was check the .zip and respond accordingly with either:

Where the heck is PHP even part of the process with what I proposed?

Also for inspiration: https://github.com/ttc-freebies/plugin-responsive-images/blob/4.0.0/site/src_media/js/utils.js

Over and out

dgrammatiko commented 2 years ago

One more comment here. @brianteeman try to upload the full installation zip file here: https://codepen.io/dgrammatiko/pen/VwxLxPW?editors=1000

Is this the solution that you are looking for?

nikosdion commented 2 years ago

Wait, you want to load an entire JS library with all its dependencies to read ZIP files client–side to check if a file exists...? Have you heard of the term "overkill"?!

It's much easier to check the contents of the file being uploaded for a binary string to determine if it contains installation/index.php which sufficiently determines if it's an installation package. See https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT section 4.3.7.

The local file header for the installation/index.php is

I don't think it'd be too hard for someone with a fair amount of JavaScript experience to write JS code to look for that without having to load half of the known universe in dependencies (with all their vulnerabilities) into Joomla. Just saying.

The way I'd probably go about it is look for the offset of 0x0016 followed by installation/index.php. Once found I'd roll 26 bytes before that offset and check if the four bytes I read are 0x04034b50

You don't need to build the bloody Death Star — complete with the exhaust port that leads directly to the core reactor — to kill a mosquito!

dgrammatiko commented 2 years ago

Wait, you want to load an entire JS library with all its dependencies to read ZIP files client–side to check if a file exists...?

A. it's lazy loaded B. It's a worker C. It's not even that big, jQuery is bigger...

brianteeman commented 2 years ago

Is this the solution that you are looking for?

I dont care what the technical solution is. I just want to prevent users from breaking their web sites on update and one part of that is making sure that they use an update package and not a full install

nikosdion commented 2 years ago

And it still makes no sense to load a library for a simple text search just because you — or whoever — can't be bothered to read the ZIP specification! I did it for you and I reduced your problem solution to a simple string search which will NEVER change because it's part of the core ZIP specification which has not, can not and will not change since 1989.

Every dependency we add to this mass-distributed software called Joomla adds another potential vulnerability vector. If it's necessary to provide a feature which we cannot provide any other reasonable way then and only then the risk is acceptable. When the feature we are trying to provide is a simple string search it is not logical to add this potential attack vector.

What happens if five years down the line we discover that the library has a security issue and it's abandoned by its author? Are we now forced to find someone who understands BOTH JavaScript AND the ZIP specification at a deep enough level to rewrite it?

nikosdion commented 2 years ago

I dont care what the technical solution is.

I very much do. Choosing the WRONG technical solution will not only become baggage we have to drag on forever but could potentially result in a vulnerability vector in a fundamental core feature which would dissuade users from updating their sites, exposing them to even worse security issues!

Don't look at the tree and miss the forrest, dammit!

brianteeman commented 2 years ago

jeez - i wish I had never spent 2 hous before breakfast working out exactly how all these users were breaking their sites on update

nikosdion commented 2 years ago

And I am spending 4 hours every day writing Rector rules and dev docs for Joomla for the past three weeks and for many weeks to follow. Your point being...?

nikosdion commented 2 years ago

That's how you do it server-side without timing out, without adding stupid dependencies, in pure PHP code.

/**
 * Is the given file a Joomla full installation ZIP package?
 *
 * @param   string  $filePath  The file path to test
 *
 * @return  bool
 * @see     https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
 * @since   __DEPLOY_VERSION__
 */
function isFullPackage(string $filePath): bool
{
    // The file must exist and be readable
    if (!file_exists($filePath) || !is_readable($filePath)) {
        return false;
    }

    // The file must be at least 1KiB (anything less is not even a real file!)
    $filesize = filesize($filePath);

    if ($filesize < 1024) {
        return false;
    }

    // Open the file
    $fp = @fopen($filePath, 'r');

    if ($fp === false) {
        return false;
    }

    // Find its last Megabyte
    if ($filesize > 1048576) {
        fseek($fp, 0, SEEK_END);
        fseek($fp, -1048576, SEEK_CUR);
    }

    // Read the last part of the file (up to 1MiB in size) and close the file
    $lastMiB = fread($fp, min(1048576, $filesize));

    @fclose($fp);

    // Make sure we actually read something
    if ($lastMiB === false) {
        return false;
    }

    $offset          = 0;
    $headerSignature = pack('V', 0x02014b50);
    $sizeSignature   = pack('v', 0x0016);

    // Look for installation/index.php
    while (true) {
        // Try to find the literal string (may or may not be in a ZIP central directory header)
        $pos = strpos($lastMiB, 'installation/index.php', $offset);

        // Nothing found, definitely not an installation ZIP file
        if ($pos === false) {
            return false;
        }

        // Pre-emptively increase the offset in case we have a false positive.
        $offset = $pos + 22;

        // Make sure this entry was inside a ZIP central directory header
        if (substr($lastMiB, $pos - 46, 4) !== $headerSignature) {
            continue;
        }

        // Make sure that the file name was exactly 22 bytes long
        if (substr($lastMiB, $pos - 18, 2) !== $sizeSignature) {
            continue;
        }

        return true;
    }
}

This can be added to the upload handler.

We check that the uploaded file exists, it's readable and is at least 1KiB big — anything below that wouldn't be worth using the update for and most likely a mistake.

Then we read the last 1MiB of the file (or less, if the file is shorter). Joomla's update packages have less than 10,000 files at up to 100 bytes each which means that we are GUARANTEED to have the entire ZIP Central Directory structure loaded in that chunk.

We look for the string 'installation/index.php'. Just in case we're unlucky and we got that string in the file data of one of the last files included in the archive we check that the bytes 46 to 42 bytes before that match are the Central Directory Header magic signature (0x02014b50) and that the the bytes 18 and 17 bytes before that match (file size, little-endian encoded) is 0x016 (22 in decimal, that's the length of installation/index.php). If that's not the case we have a false match, we advance the offset by 22 bytes (the length of the false match) and try again.

If there is no match then by definition the installation/index.php file does not exist, therefore this is not an update file.

It's so bloody simple. Just use your intellect. Not everything has to be someone else's code and no, checking filenames is definitely not the way to go.

🫳🏽🎤

nikosdion commented 2 years ago

As for the performance, it takes 0.11 milliseconds on average to run this code versus 4 milliseconds to use ZipArchive. This code is pure PHP, ZipArchive requires a PHP extension we are not guaranteed to have installed (and will use as much memory as the archive in older PHP versions, but that's another discussion altogether).

The JavaScript library took nearly 2 seconds on the same machine and relies on an external dependency which does far more than just tell us if the ZIP archive is a Joomla installation package or not — with everything that entails about its potential vulnerability surface area.

Do you want me to make a PR or y'all have got it from here…?

dgrammatiko commented 2 years ago

The JavaScript library took nearly 2 seconds on the same machine and relies on an external dependency which does far more than just tell us if the ZIP archive is a Joomla installation package or not — with everything that entails about its potential vulnerability surface area.

The js version did NOT TRANSFER the whole .zip file from the user's computer to the server just to check the existence of a file and then just delete it. Efficiency also counts besides the ms and by this the code that requires an actual transfer of many MBs from a computer to the server just to check if it is valid is pure overhead and extremely wasteful of resources. Also did you count the time for the upload of the .zip file when you compare it to the complete solution of the JS version? (no)

You don't want to rely on a dependency that has many more d/ls than Joomla? Fine, write your own JS version. The PHP approach is wrong for this particular case...

nikosdion commented 2 years ago

First of all, the vast majority of users of this feature do use it the right way. What we are trying to do is prevent the people who did not read the info box carefully from accidentally using an installation instead of an upgrade package, bricking their site.

(Side note: this is NOT the only server-side protection we would need to add. We could also add the extraction filter to ensure that even if somehow, despite all efforts, an installation package DID still make it through it would not brick the site).

Since this is a minority of users and they are most definitely doing something wrong it's not a massive waste of resources to let the upload go through (at least up to the point where PHP has it in its temporary directory so we can run this code).

The running time of the code matters. I already said that whatever solution we come up with must NOT prevent the user from running the update using Upload & Update (it's the last resort of any user who can't use CLI, i.e. the majority!). Note that I am testing it on a very fast M2 processor. You can expect the average real world server to take 10x as long. That is to say, we'd still be in the 1 millisecond territory and still under 2MiB of memory usage, therefore we WILL NOT be hitting any CPU usage limits, maximum execution time limits, or memory limits.

Moreover, it's important that it's only userland (pure PHP) code, without relying on third party extensions. Right now, Joomla does NOT require the PHP Zip extension, we are not guaranteed it's installed and we cannot make it a dependency. The fact that ZIPArchive would be slow enough to possibly cause problems is just the icing on the cake.

Furthermore, relying entirely on client-side code for not bricking a user's site is a recipe for disaster. What if the JavaScript does not run? A third party plugin breaking JavaScript on the page would end up NOT protecting our user from themselves as the upload check would never get to run but the form would very likely submit. There are plenty of ways this could happen. You can't bet the farm just on client side code.

Even if we were to determine that the majority of users are using the Upload & Update feature wrong we'd have to rethink how this entire feature works. One way would be checking the filename client-side (cheap, easy, does not use third party libraries), on top of checking the file content server-side (as a failsafe, since the client side is NOT under our control or guaranteed to run!). If we could also add the client-side check of the package contents so much the better — I'll come back to that.

In no circumstances do we actually need to load a client-side library which can fully parse the ZIP format and has ZIP read and write features. Again, and as I said above, this would unacceptably increase the vulnerability surface area of Joomla Update for no good reason (we really don't need it when we can just do a simple string check!). To give you a simple example, what if there's a vulnerability which allows a malformed ZIP file to execute arbitrary code which results in the ZIP file being uploaded to be overwritten with malware? Trick a Super User into using that malicious ZIP file and their site is hacked, without any obvious indication that the ZIP file itself is tainted (they'd have to resort to a security analyst to fid out how they were hacked which is beyond what most of Joomla's users could afford or even think of doing). What if it's “just” arbitrary code execution which steals the Super User's session cookies? And what happens if this vulnerability is not fixed because the library is abandoned? Are we supposed to maintain this complicated library which does far more than we need? It's a bad proposition because we don't really need that entire library to determine if the package is an installation package or not.

Also note that I have stated many times that I am NOT a JavaScript developer. I can write some JavaScript code but I am NOT a JavaScript developer. I wouldn't know where to start with this and I don't have the time to learn something I am not going to use on my job which pays for the time I spend on Joomla. You are a JS developer. Now that you have the simpler solution using simple string search matching you can, if you want, translate this solution to JavaScript yourself. If you could, it'd be great! We could have the same solution running first client-side, before the upload takes place and if there is no problem with JS execution on the browser, and once again on the server-side to make sure that we really do catch problematic uploads before they become insurmountable problems for the user.

So, to sum it up, you have it backwards. The PHP solution is absolutely needed as a failsafe — we can't rely on client–side code to prevent bricking the site. The client–side code is a good to have but not necessary; we could just do a simple filename check on the client–side which would prevent the vast majority of problems from the minority of the problem users of Upload & Update who are, in themselves, a stark minority of the people updating Joomla.

brianteeman commented 2 years ago

@HLeithner are you able to action a change on the downloads to add the updates to the /latest page

nikosdion commented 2 years ago

Meanwhile I just started working on a PR to add redundant client- and server-side checks to prevent the issue Brian reported.

brianteeman commented 2 years ago

Thank you.

nikosdion commented 2 years ago

No problem! I actually figured out how to translate my PHP function to JavaScript, it takes fractions of a millisecond and does not rely on external libraries. Imagine that, being able to write code which does not use libraries. Kids these days and their darned libraries...

Let me polish this off and write test instructions after dinner and I'll post back here with the PR.

nikosdion commented 2 years ago

Here's the PR https://github.com/joomla/joomla-cms/pull/38707

Have a great evening, folks!

richard67 commented 2 years ago

The check if we have a full installation package or an update package is simple: The full package contains an installation folder, the update package doesn’t.

richard67 commented 2 years ago

Ah I just see the PR does it already this way.

nikosdion commented 2 years ago

Yup! The trick here is how to check which requires knowing the ZIP format and the quirks of various archivers.

It's not necessary that the ZIP file will contain a Local File entry for the installation directory. However, it will have entries for the files in the installation folder. That's why I check for the installation/index.php which is a file we can never remove from the Full Package as long as we are using installation as our installation directory.

The rest of the code is just taking advantage of the fact that ZIP files have a Central Directory after the last Local File entry, knowing that these entries in Joomla are approximately 100 bytes on average, knowing that there are less than 10,000 of them therefore they will be found in the last Megabyte of the ZIP archive and just doing a simple string search and two small binary string comparisons to ensure that the Central Directory record for the installation/index.php file exists. That's the black magic bit and I tried to document it in the code so people don't go “WTF is that, let's remove it” six years later or something.

alikon commented 2 years ago

please test #38707

richard67 commented 2 years ago

Re-opening this issue since PR #38707 was just closed.

brianteeman commented 1 year ago

can you please re-open this. Even if #38707 is accepted it does not address everything I raised in the initial post.

brianteeman commented 1 year ago

@HLeithner are you able to uopdate the downloads site so that /latest goes to the page with all the files (https://downloads.joomla.org/cms/joomla4/4-2-6) andn not to the page that doesnt contain the upgrade files.

This is the real root of the problem here and why people use the wrong file.

If you say that the download site can not be updated as required then I will look further into how to write a better message for the user.

HLeithner commented 1 year ago

The https://downloads.joomla.org/latest and https://joomla.org/latest are redirects for new installation, updates should be done with the web updater...

for downloads.joomla.org/latest it's a com_ars virtual group view, not sure if this could be changed that also the upgrade packages can be shown

richard67 commented 1 year ago

updates should be done with the web updater...

@HLeithner That doesn't work for everybody. Some people e.g. in an intranet have to use upload and update method of the update component and so need to download the update packages. When they use the full installation package for that, they run into the issue reported here.

crystalenka commented 1 year ago

Web updater also can time out if you are on an unstable or poor internet connection. Downloading directly can sometimes be faster, or at least not entirely time out.

brianteeman commented 1 year ago

@HLeithner you are missing the problerm reported in the first post

image

but don't worry I can fix the url in the string easily enough so that it goes to the correct url easily enough - problem solved.

PR will be sent as soon as I have had a coffee break

HLeithner commented 1 year ago

Yes I have seen the original issue, we have a pr for most of this stuff by Nic.

brianteeman commented 1 year ago

and that solves none of it

brianteeman commented 1 year ago

please test #39564 as it addresses the reported issue without all the "whatif" scenarios