webdevops / TYPO3-metaseo

TYPO3 MetaSEO Extension
https://typo3.org/extensions/repository/view/metaseo
GNU General Public License v3.0

8LTS: Use subproperties for additionalHeaders #478

Closed ghost closed 7 years ago

ghost commented 7 years ago

MetaSEO version: 3.0.0-dev
TYPO3 version: 8.7.1
PHP version: 7.1
RealUrl version (optional): 2.2.1

Hey, I have a problem in Google Webmaster Tools. The error says: Your sitemap is obviously an HTML page. Please use a supported format for Sitemaps instead.

I double-checked the RealURL configuration and everything looks fine to me. The sitemap can be found at: www.netfactory.de/sitemap.xml

thomaszbz commented 7 years ago

A sitemap generated by metaseo (metaseo 2.0.4, TYPO3 7.6.18) should start with:

<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns=

The sitemap you provided starts with

<?xml version="1.0" encoding="UTF-8"?><head/><sitemapindex xmlns=

... which in turn means that the <head/> tag somehow got injected and should not be there.

Since your sitemap.xml may be stored somewhere, and could therefore be outdated or originate from somewhere else, could you please check the sitemap provided in metaseo's backend section? The URL should be something like http://www.netfactory.de/?id=X&type=841132, with X being your root page id.

I tried to reproduce this against TYPO3 8.7.1, PHP 7.0.16/debian9/apache2.4.25, latest metaseo 3.0.0-dev, but could not reproduce the head tag. For me, the output was as expected:

<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns=

The relevant code in metaseo 3.0.0-dev is

    $ret = '<?xml version="1.0" encoding="UTF-8"?>';
    $ret .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" '

https://github.com/mblaschke/TYPO3-metaseo/blob/feature/8lts/Classes/Sitemap/Generator/XmlGenerator.php#L76-L77

That said, I see no way metaseo could inject the head tag at this place.

Could it be that you have some piece of software running that tries to repair HTML syntax? If so, you may want to let that software know that your sitemap.xml is not HTML output. At the moment, your sitemap is served with the HTTP header

Content-Type: text/html; charset=utf-8

Better would be

Content-Type: application/xml;charset=UTF-8

Something else to look at would be Brotli compression, which uses a built-in dictionary of frequently used strings (including HTML tags) so that they do not have to be transmitted literally over the wire. The algorithm is relatively new, and chances are that you are using an outdated, buggy version on Ubuntu. What happens if you disable Brotli and/or use the correct Content-Type?

thomaszbz commented 7 years ago

Content-Type is defined in https://github.com/mblaschke/TYPO3-metaseo/blob/feature/8lts/Configuration/TypoScript/setup.txt#L556

additionalHeaders = Content-type: application/xml;charset=UTF-8 | X-Robots-Tag: noindex

That works great in TYPO3 7.6. In 8.7.1 LTS, however, I get

Content-Type: text/html; charset=utf-8

That said, I think that we ran into a breaking change:

https://docs.typo3.org/typo3cms/extensions/core/8.7/Changelog/8.0/Breaking-72424-RemovedDeprecatedTypoScriptFrontendControllerOptionsAndMethods.html

The TypoScript property config.additionalHeaders has been removed.

Migration: Use the config.additionalHeaders subproperties (see https://docs.typo3.org/typo3cms/TyposcriptReference/Setup/Config/Index.html#additionalheaders for details) to add the additional header lines.
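
For illustration, the old pipe-separated line might be migrated to the subproperty syntax roughly as follows. This is only a sketch based on the TypoScript reference linked above: the numeric keys 10 and 20 are arbitrary, where the block has to sit inside metaseo's setup.txt is an assumption, and the patch actually committed to feature/8lts may look different.

```typoscript
# Sketch only: numeric keys and placement are assumptions, not the committed patch.
config {
    additionalHeaders {
        # One header per numbered entry instead of a single "|"-separated string
        10.header = Content-Type: application/xml;charset=UTF-8
        20.header = X-Robots-Tag: noindex
    }
}
```

According to the TypoScript reference, each numbered entry also accepts the optional subproperties replace and httpResponseCode; they are left out here because the old setting did not use them.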

thomaszbz commented 7 years ago

@MKchn Please update from branch feature/8lts and check whether the HTTP header for the content type is now correct (F12 key in your browser) and whether this fixes the issue (the <head/> tag is no longer injected). As said, I can't reproduce the injected tag, neither before nor after the patch is applied.

Please report back - maybe someone else is affected by this issue.

ghost commented 7 years ago

Works fine after updating. No more head-Tag in the sitemap.

thomaszbz commented 7 years ago

Headers also look as expected now:

[screenshot: auswahl_317]