magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.55k stars 9.32k forks source link

Corrupt Merged CSS Files #29172

Open pocallaghan opened 4 years ago

pocallaghan commented 4 years ago

Note that due to the nature of this issue, there's little chance that this can be replicated by somebody attempting to triage this ticket, I'm raising purely so I have something to reference with a pull request, so an engineer can review.

Preconditions (*)

  1. CSS Merging Enabled (Store > Configuration > Advanced > Developer > CSS Settings > Merge CSS Files).
  2. var/tmp and pub/static being on different filesystems

Steps to reproduce (*)

  1. Delete the merged files from pub/static/_cache/merged/.
  2. Hammer the website with concurrent traffic.

Expected result (*)

  1. Website loads, looking as it should.

Actual result (*)

  1. It's possible that the merged file is incomplete, entirely breaking the design of the site until such time as the merged file is removed or a change results in a different filename. The effected pages are at best, very ugly and at worse, entirely non-usable, as elements you need to interact with may overlap, etc.

Work was previously done to improve this issue in commit 1b46c966d416ec27ca0891be654b69d7345da2d6, there is however an edge case. The previous commit generates randomly named files, so that two concurrent generation requests don't start writing to the same file, causing one request to copy the partly generated file from the other request. This is a good idea, as each concurrent generation of the file creates its own file. Once generated, the file is moved from the temporary location to the target location using a rename command.

The rename command is normally atomic, meaning the target file is never a partially complete file. The edge case is that if the temp directory and the static directory are not on the same filesystem, the rename command is no longer atomic. This means that during the rename from the temporary path to the static directory, the file can be served partially complete and cached in the CDN / browser cache.

The only solution I'm aware of is to have the temporary file created in the same directory as the target path. Pull request to follow.


Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.

m2-assistant[bot] commented 4 years ago

Hi @pocallaghan. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, please, review the Magento Contributor Assistant documentation.

Please, add a comment to assign the issue: @magento I am working on this


:clock10: You can find the schedule on the Magento Community Calendar page.

:telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.

:movie_camera: You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

:pencil2: Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

HellgaM commented 2 years ago

Hi, is there any update? Noticed a similar issue on Magento Cloud, sometimes CSS or JS merged files are corrupted, no strict steps to repeat. Thanks!

onlinebizsoft commented 1 year ago

We have same issue, our Magento is running on multiple servers environment with much concurrent traffic. The problem will happen with both JS and CSS merge.

engcom-November commented 1 year ago

Hi @pocallaghan , Thank you for reporting and collaboration! We have tried to reproduce this issue in Magento 2.4-develop branch

We have followed the below steps: CSS Merging Enabled (Store > Configuration > Advanced > Developer > CSS Settings > Merge CSS Files). Delete the merged files from pub/static/_cache/merged/ Hammer the website Note:By hitting this command  (Php bin/magento indexer:reindex && php bin/magento cache:flush) after deleting CSS files, the deleted files are returned Do we miss anything?

onlinebizsoft commented 1 year ago

@engcom-November I think it is really hard to reproduce these issues however this is a real issue for many people especially with setup using multiple servers.

erfanimani commented 1 year ago

Yeah, I think this one would be hard to reproduce unfortunately, but it definitely is still happening for me (even after applying this patch: https://github.com/magento/magento2/pull/29173 )

We're running different (web) nodes inside containers spread over a number of hosts that are using a shared filesystem.

I have managed to circumvent the issue 99.9% of the time by running a cache warmer straight after the deployment with a very low concurrency. If I increase this concurrency to a very high amount, a bad JS bundle will almost definitely appear after a few deployments.

I'm afraid it would be hard to reproduce on local environments are you're dealing with a physical filesystem and running everything on the host.

onlinebizsoft commented 1 year ago

Yeah, I think this one would be hard to reproduce unfortunately, but it definitely is still happening for me (even after applying this patch: #29173 )

We're running different (web) nodes inside containers spread over a number of hosts that are using a shared filesystem.

I have managed to circumvent the issue 99.9% of the time by running a cache warmer straight after the deployment with a very low concurrency. If I increase this concurrency to a very high amount, a bad JS bundle will almost definitely appear after a few deployments.

I'm afraid it would be hard to reproduce on local environments are you're dealing with a physical filesystem and running everything on the host.

By the way, do you use CDN? In my case, I do use CDN and the problem seems to be with the completed merged file, it happens with certain cached location on CDN only.

erfanimani commented 1 year ago

@onlinebizsoft we do use a CDN, but I'm not sure how they would be related — the corrupt JS file exists on the origin host and the CDN will just retrieve it and cache it.

onlinebizsoft commented 1 year ago

@erfanimani I would like to revise my information : Yes, today the problem happened again and we found that the merged JS file is corrupted on our server as well (and look like files always missed the requirejs content part). We tracked all the log which is around this timestamp but look like there is any kind of failed message

@pocallaghan I think you should edit the issue title because the problem also happen with both merged CSS and JS files

onlinebizsoft commented 1 year ago

@pocallaghan do you have any explanation why the completed JS/CSS files is corrupt on the origin server? And please keep in mind that in my case, the corrupt part is on the top of the file, our merged dir is on a NFS partition and get shared over all servers.

So it can't be the case where the rename process get broken and write incomplete content.

erikhansen commented 9 months ago

I turned the b293437 commit linked to above into a patch and applied it to our Adobe Commerce 2.4.5-p5 installation running on Adobe Commerce Cloud. We ran into this issue twice last week, where a deployment during a busy time on the site would result in broken merged CSS. After doing two deployments tonight, the issue did not happen again. So hopefully that commit fixed the issue.

erikhansen commented 8 months ago

For the record, on the ~7th production deployment since applying the patch, we encountered the issue again. So it appears that the patch didn't fix the issue.

erikhansen commented 8 months ago

I ended up adding a custom script to the .magento.app.yaml file to preload the CSS. Hopefully this will work.

hooks:
    …
    post_deploy: |
        …
        php ./bin/warm_css.php

The contents of ./bin/warm_css.php:

<?php
/**
 * Load merged CSS before site is accessible. See REDACTED for context
 */

try {
    echo 'Loading CSS file 5 times to ensure it is properly generated, before web traffic can access it' . PHP_EOL;

    $environment = getenv('MAGENTO_CLOUD_ENVIRONMENT_TYPE');
    if ($environment) {
        $url = 'https://stage-qejtaqcekwe74.example.com/';
    } else {
        $url = 'https://example.com/';
    }

    // Function to fetch URL content
    function fetchUrl($url) {
        $ch = curl_init();
        // Don't wait more than 10 seconds for page to load
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10000);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $data = curl_exec($ch);

        // Retrieve the total response time for the request
        $responseTime = curl_getinfo($ch, CURLINFO_TOTAL_TIME);
        echo "Total response time for $url: {$responseTime} seconds.\n";

        curl_close($ch);
        return $data;
    }

    // Fetch the HTML content of the main page
    $htmlContent = fetchUrl($url);

    // Use regex to find the CSS file URL
    if (preg_match('/<link rel="stylesheet" type="text\/css" media="all" href="(https:\/\/.*example\.com\/static\/version[0-9]+\/_cache\/merged\/[a-z0-9]+\.min\.css)" \/>/', $htmlContent, $matches)) {
        $cssUrl = $matches[1]; // Extract CSS URL
        echo "Found CSS URL: $cssUrl\n";

        // Load the CSS URL 5 times with a 0.25 second delay
        for ($i = 0; $i < 5; $i++) {
            fetchUrl($cssUrl);
            echo "Loaded $cssUrl iteration " . ($i + 1) . "\n";
            usleep(250000); // Delay for 0.25 second
        }
    } else {
        echo "CSS URL not found. First 6K characters of HTML for debugging purposes: " . substr($htmlContent, 0, 60000);
    }
} catch (\Throwable $e) {
    echo $e->getMessage();
}
onlinebizsoft commented 8 months ago

@ihor-sviziev can you get this re-opened because the issue is still valid after the patch?

ihor-sviziev commented 8 months ago

I reopened issue. So far the workaround is disabling css merging. So far I didn't notice significant slowdown related to this change thanks to http2 usage. I think this can be treated as a workaround. Also, questionable if this feature still needed? Maybe worth just deprecating it and not fix at all?

onlinebizsoft commented 8 months ago

@ihor-sviziev I don't think not merging and htt2 is good in all cases, look like you are good with http2 when you don't have too many small files to load but when you have many then few files to load are still better.

dooblem commented 7 months ago

We got this problem too this week after deploying to production :disappointed: . We are using fastly as CDN. So once the content is cached it's for days. Just flushing the cache from fastly restored the situation.

Are we sure the generated files are moved atomicly ? in vendor/magento/framework/Filesystem/Driver/File.php in the rename() function, I'm seeing a fallback to read and copy if the driver classes from src/dst are not the same.

Unfortunately I do not have more time currently to investigate.

I ended up writing a bash script in order to check fastly content.

#!/bin/sh

cd ~/magento_root_dir

version=$(cat pub/static/deployed_version.txt)

for i in $(ls pub/static/_cache/merged/); do
  #echo "$i"
  url="https://cdnfastly.example.com/static/version$version/_cache/merged/$i"
  #echo "$url"
  if curl --compressed -s "$url" | tail -1 | tr -d '[:space:]' | grep -q '}$'; then
    :
  else
    echo "$i might be corrupt"
    echo "$url"
  fi
done