Open pocallaghan opened 4 years ago
Hi @pocallaghan. Thank you for your report. To help us process this issue please make sure that you provided the following information:
Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:
@magento give me 2.4-develop instance
- upcoming 2.4.x release
For more details, please, review the Magento Contributor Assistant documentation.
Please, add a comment to assign the issue: @magento I am working on this
Join Magento Community Engineering Slack and ask your questions in #github channel.
:warning: According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.
:clock10: You can find the schedule on the Magento Community Calendar page.
:telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.
:movie_camera: You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel
:pencil2: Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel
Hi, is there any update? Noticed a similar issue on Magento Cloud, sometimes CSS or JS merged files are corrupted, no strict steps to repeat. Thanks!
We have same issue, our Magento is running on multiple servers environment with much concurrent traffic. The problem will happen with both JS and CSS merge.
Hi @pocallaghan , Thank you for reporting and collaboration! We have tried to reproduce this issue in Magento 2.4-develop branch
We have followed the below steps: CSS Merging Enabled (Store > Configuration > Advanced > Developer > CSS Settings > Merge CSS Files). Delete the merged files from pub/static/_cache/merged/ Hammer the website Note:By hitting this command (Php bin/magento indexer:reindex && php bin/magento cache:flush) after deleting CSS files, the deleted files are returned Do we miss anything?
@engcom-November I think it is really hard to reproduce these issues however this is a real issue for many people especially with setup using multiple servers.
Yeah, I think this one would be hard to reproduce unfortunately, but it definitely is still happening for me (even after applying this patch: https://github.com/magento/magento2/pull/29173 )
We're running different (web) nodes inside containers spread over a number of hosts that are using a shared filesystem.
I have managed to circumvent the issue 99.9% of the time by running a cache warmer straight after the deployment with a very low concurrency. If I increase this concurrency to a very high amount, a bad JS bundle will almost definitely appear after a few deployments.
I'm afraid it would be hard to reproduce on local environments are you're dealing with a physical filesystem and running everything on the host.
Yeah, I think this one would be hard to reproduce unfortunately, but it definitely is still happening for me (even after applying this patch: #29173 )
We're running different (web) nodes inside containers spread over a number of hosts that are using a shared filesystem.
I have managed to circumvent the issue 99.9% of the time by running a cache warmer straight after the deployment with a very low concurrency. If I increase this concurrency to a very high amount, a bad JS bundle will almost definitely appear after a few deployments.
I'm afraid it would be hard to reproduce on local environments are you're dealing with a physical filesystem and running everything on the host.
By the way, do you use CDN? In my case, I do use CDN and the problem seems to be with the completed merged file, it happens with certain cached location on CDN only.
@onlinebizsoft we do use a CDN, but I'm not sure how they would be related — the corrupt JS file exists on the origin host and the CDN will just retrieve it and cache it.
@erfanimani I would like to revise my information : Yes, today the problem happened again and we found that the merged JS file is corrupted on our server as well (and look like files always missed the requirejs content part). We tracked all the log which is around this timestamp but look like there is any kind of failed message
@pocallaghan I think you should edit the issue title because the problem also happen with both merged CSS and JS files
@pocallaghan do you have any explanation why the completed JS/CSS files is corrupt on the origin server? And please keep in mind that in my case, the corrupt part is on the top of the file, our merged dir is on a NFS partition and get shared over all servers.
So it can't be the case where the rename process get broken and write incomplete content.
I turned the b293437 commit linked to above into a patch and applied it to our Adobe Commerce 2.4.5-p5 installation running on Adobe Commerce Cloud. We ran into this issue twice last week, where a deployment during a busy time on the site would result in broken merged CSS. After doing two deployments tonight, the issue did not happen again. So hopefully that commit fixed the issue.
For the record, on the ~7th production deployment since applying the patch, we encountered the issue again. So it appears that the patch didn't fix the issue.
I ended up adding a custom script to the .magento.app.yaml
file to preload the CSS. Hopefully this will work.
hooks:
…
post_deploy: |
…
php ./bin/warm_css.php
The contents of ./bin/warm_css.php
:
<?php
/**
* Load merged CSS before site is accessible. See REDACTED for context
*/
try {
echo 'Loading CSS file 5 times to ensure it is properly generated, before web traffic can access it' . PHP_EOL;
$environment = getenv('MAGENTO_CLOUD_ENVIRONMENT_TYPE');
if ($environment) {
$url = 'https://stage-qejtaqcekwe74.example.com/';
} else {
$url = 'https://example.com/';
}
// Function to fetch URL content
function fetchUrl($url) {
$ch = curl_init();
// Don't wait more than 10 seconds for page to load
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10000);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
// Retrieve the total response time for the request
$responseTime = curl_getinfo($ch, CURLINFO_TOTAL_TIME);
echo "Total response time for $url: {$responseTime} seconds.\n";
curl_close($ch);
return $data;
}
// Fetch the HTML content of the main page
$htmlContent = fetchUrl($url);
// Use regex to find the CSS file URL
if (preg_match('/<link rel="stylesheet" type="text\/css" media="all" href="(https:\/\/.*example\.com\/static\/version[0-9]+\/_cache\/merged\/[a-z0-9]+\.min\.css)" \/>/', $htmlContent, $matches)) {
$cssUrl = $matches[1]; // Extract CSS URL
echo "Found CSS URL: $cssUrl\n";
// Load the CSS URL 5 times with a 0.25 second delay
for ($i = 0; $i < 5; $i++) {
fetchUrl($cssUrl);
echo "Loaded $cssUrl iteration " . ($i + 1) . "\n";
usleep(250000); // Delay for 0.25 second
}
} else {
echo "CSS URL not found. First 6K characters of HTML for debugging purposes: " . substr($htmlContent, 0, 60000);
}
} catch (\Throwable $e) {
echo $e->getMessage();
}
@ihor-sviziev can you get this re-opened because the issue is still valid after the patch?
I reopened issue. So far the workaround is disabling css merging. So far I didn't notice significant slowdown related to this change thanks to http2 usage. I think this can be treated as a workaround. Also, questionable if this feature still needed? Maybe worth just deprecating it and not fix at all?
@ihor-sviziev I don't think not merging and htt2 is good in all cases, look like you are good with http2 when you don't have too many small files to load but when you have many then few files to load are still better.
We got this problem too this week after deploying to production :disappointed: . We are using fastly as CDN. So once the content is cached it's for days. Just flushing the cache from fastly restored the situation.
Are we sure the generated files are moved atomicly ?
in vendor/magento/framework/Filesystem/Driver/File.php
in the rename()
function, I'm seeing a fallback to read and copy if the driver classes from src/dst are not the same.
Unfortunately I do not have more time currently to investigate.
I ended up writing a bash script in order to check fastly content.
#!/bin/sh
cd ~/magento_root_dir
version=$(cat pub/static/deployed_version.txt)
for i in $(ls pub/static/_cache/merged/); do
#echo "$i"
url="https://cdnfastly.example.com/static/version$version/_cache/merged/$i"
#echo "$url"
if curl --compressed -s "$url" | tail -1 | tr -d '[:space:]' | grep -q '}$'; then
:
else
echo "$i might be corrupt"
echo "$url"
fi
done
Note that due to the nature of this issue, there's little chance that this can be replicated by somebody attempting to triage this ticket, I'm raising purely so I have something to reference with a pull request, so an engineer can review.
Preconditions (*)
var/tmp
andpub/static
being on different filesystemsSteps to reproduce (*)
pub/static/_cache/merged/
.Expected result (*)
Actual result (*)
Work was previously done to improve this issue in commit 1b46c966d416ec27ca0891be654b69d7345da2d6, there is however an edge case. The previous commit generates randomly named files, so that two concurrent generation requests don't start writing to the same file, causing one request to copy the partly generated file from the other request. This is a good idea, as each concurrent generation of the file creates its own file. Once generated, the file is moved from the temporary location to the target location using a rename command.
The rename command is normally atomic, meaning the target file is never a partially complete file. The edge case is that if the temp directory and the static directory are not on the same filesystem, the rename command is no longer atomic. This means that during the rename from the temporary path to the static directory, the file can be served partially complete and cached in the CDN / browser cache.
The only solution I'm aware of is to have the temporary file created in the same directory as the target path. Pull request to follow.
Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.