Closed crstauf closed 4 years ago
Hi
Can you help me to understand a bit "PHP files (within the theme) are created"?
Is your theme loading dynamic assets with .php extension at Development site which you're trying to export as Static?
Thanks
Is your theme loading dynamic assets with .php extension
@thegulshankumar No: they are the same file names as the template files used by theme.
Thanks for reporting, @crstauf - I think this is similar to some other issues which have crept in, missing the integration tests.
The admin CSS files may continue to be included, but I may try a simple filter of admin
against plugin CSS files.
This is due to the 2-pass crawling that it currently does, where we need any potentially used CSS files included in the initial crawl list, in order to parse/follow URLs to assets from within those.
I'm going to try and clean that area up for this upcoming v6.6.x release on wp.org, but will err on the side of detecting too much than not enough.
The issue reported here will definitely be solved though, no PHP files should be getting crawled.
@crstauf this should fix the unwanted assets / php files being detected, it also has an unrelated adjustment for pagination URLs. https://github.com/WP2Static/static-html-output-plugin/issues/80 will still be affected in this build until it's fixed.
@crstauf - I've updated since last comment, this should be good to test:
@leonstafford Installed the provided plugin, and got this error:
[05-Jun-2020 18:56:31 UTC] PHP Notice: Undefined offset: 1 in /app/public/wp-content/plugins/static-html-output-plugin/src/DetectArchiveURLs.php on line 72
[05-Jun-2020 18:56:31 UTC] PHP Stack trace:
[05-Jun-2020 18:56:31 UTC] PHP 1. {main}() /app/public/wp-admin/admin-ajax.php:0
[05-Jun-2020 18:56:31 UTC] PHP 2. do_action() /app/public/wp-admin/admin-ajax.php:175
[05-Jun-2020 18:56:31 UTC] PHP 3. WP_Hook->do_action() /app/public/wp-includes/plugin.php:478
[05-Jun-2020 18:56:31 UTC] PHP 4. WP_Hook->apply_filters() /app/public/wp-includes/class-wp-hook.php:311
[05-Jun-2020 18:56:31 UTC] PHP 5. static_html_output_ajax() /app/public/wp-includes/class-wp-hook.php:287
[05-Jun-2020 18:56:31 UTC] PHP 6. StaticHTMLOutput\Controller->generate_filelist_preview() /app/public/wp-content/plugins/static-html-output-plugin/static-html-output-plugin.php:77
[05-Jun-2020 18:56:31 UTC] PHP 7. StaticHTMLOutput\FilesHelper::buildInitialFileList() /app/public/wp-content/plugins/static-html-output-plugin/src/Controller.php:204
[05-Jun-2020 18:56:31 UTC] PHP 8. StaticHTMLOutput\FilesHelper::getDateArchiveURLs() /app/public/wp-content/plugins/static-html-output-plugin/src/FilesHelper.php:514
[05-Jun-2020 18:56:31 UTC] PHP 9. StaticHTMLOutput\DetectArchiveURLs::detect() /app/public/wp-content/plugins/static-html-output-plugin/src/FilesHelper.php:548
Additionally, while older archive pages were picked up (that hadn't been before!), CSS and SVG files from both Yoast SEO and Query Monitor continue to be picked up (no PHP files from my theme though).
@crstauf sorry about that, hopefully this removes the notice: static-html-output-plugin-81-guard-empty-cat.zip
So the Yoast and Query Monitor ones don't break the site, they're just unnecessarily included, right?
I'm thinking to just extend a blacklist of plugin/theme admin asset paths. Not an amazing solution, but can quickly eliminate those from the most popular plugins.
@leonstafford If I may ask, how does the plugin determine which assets are needed?
@crstauf I've just made some changes this week to try and simplify things.
For this V6, in src/FilesHelper.php
, we do the initial detection of all WP paths (posts, pages, pagination), along with some common ones, like favicon, robots.txt, sitemaps (kinda working). It used to grab everything from the wp-contents/uploads
dir and still has a few extra detections for things like a custom permalinks plugin and Autoptimize's cache dir (shouldn't be needed anymore). Now, it will only do the usual WP paths and grab all CSS files from plugins and themes.
The plugin does only 2 passes when crawling. During the first phase of crawling, it will detect any assets linked to from one of those initially detected paths and crawl those on the second pass. In theory, this should pick everything up. Recent work has fixed the way it parses the HTML and CSS files to detect more linked assets.
The pros are that we can see what we're going to crawl, based on the Initial crawl list and optionally modify that list with actions/filters. It may also lead to some quick performance boosts soon, being able to more easily spawn concurrent crawlers with divided batches of detected URLs to start with.
The downsides are if we fail to detect certain URLs.
SimplerStatic's approach is to crawl more like a normal spider, continually building a list of new pages to crawl as it goes. That's a lot more helpful with things like pagination URLs, where detecting those can be a pain.
We'll see how things develop, probably some hybrid approach will eventuate. If this V6 can detect all URLs for majority of sites and allow custom filters/inclusion rules to workaround for the % that it doesn't work for, that would be a good position for this plugin for now.
@leonstafford That being the case, how/why would it detect a CSS file that's used only in the admin?
@crstauf
grab all CSS files from plugins and themes
It's getting these from the filesystem. WordPress doesn't enforce any structure for where CSS files go for admin or public, they are usually served to the user via code, which can pull them from any path(s) on filesystem and serve them at any URL.
So, as a human, we can't even determine which ones are admin or public CSS files just by looking within the plugin. We can guess, based on if the directory is called "admin", "public", "dist", etc, but there's sure to be some out there mixing it up, plus all the non-English or typos.
We could possibly listen to the wp_enqueue_styles()
as we crawl the whole site on first pass, to detect just what's being loaded, but it's a bit messy and slow.
Alternatively, we could not detect any CSS files for initial crawl list and add a third round of crawling. That might be the sanest approach for V6 right now: only detect HTML/XML/special files during pre-crawling detection; detect all other HTML/XML/CSS/imgs, etc files on 2nd pass and 3rd pass for all the assets detected within the 2nd pass crawl. That should be enough to cover all assets and remove the need for the greedy detection of CSS initially
Just like robots.txt, please include ads.txt by default. I see many users need.
@thegulshankumar I've put ads.txt in now, will be in next build.
Closing this issue, as recent detection / crawling changes should have sorted it
When generating the static files, PHP files (within the theme) are created, but only with error output (there's no error when viewing the template).
There are various types of errors:
Call to undefined function get_header()
Call to undefined function query_posts()
Call to undefined function locate_template()
Undefined variable $wp_query
Call to undefined function wp_style_is()
Additionally, some admin CSS files for plugins are also being included in the static files.
Discovered with v6.6.16; tried with
master
, but autoloader isn't working...?