webdevops / TYPO3-metaseo

TYPO3 MetaSEO Extension
https://typo3.org/extensions/repository/view/metaseo
GNU General Public License v3.0

GET-param "type" twice in url robots.txt and sitemap.xml #497

Open derBoogie opened 7 years ago

derBoogie commented 7 years ago

Requests for robots.txt and sitemap.xml are redirected to URLs that contain the GET parameter "type" twice on my website:

http://www.jh-moehringen.de/robots.txt -> http://www.jh-moehringen.de/index.php?id=3&type=841133&type=841133
http://www.jh-moehringen.de/sitemap.xml -> http://www.jh-moehringen.de/index.php?id=3&type=841132&type=841132
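
One way to confirm the symptom is to count how often each parameter name occurs in the redirect target. The following is a minimal sketch using only Python's standard library; the helper name `duplicated_params` is hypothetical and not part of MetaSEO, RealURL, or TYPO3.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def duplicated_params(url):
    """Return the sorted names of query parameters that occur more than once."""
    # parse_qsl keeps repeated names as separate (name, value) pairs.
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    counts = Counter(name for name, _ in pairs)
    return sorted(name for name, n in counts.items() if n > 1)


# Redirect target taken from the report above.
print(duplicated_params(
    "http://www.jh-moehringen.de/index.php?id=3&type=841133&type=841133"
))
```

For the redirect target above, the helper reports `type` as the only repeated parameter.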

realurl config:

...
'fileName' => array (      
    'index' => array (
        ...
        'sitemap.xml' => array (
            'keyValues' => array (
                'type' => 841132,
            ),
        ),
        'sitemap.txt' => array (
            'keyValues' => array (
                'type' => 841131,
            ),
        ),
        'robots.txt' => array (
            'keyValues' => array (
                'type' => 841133,
            ),
        ),
    ),
),
...

MetaSEO version: 3.0.0
TYPO3 version: 8.7.2
PHP version: 7.1.6
RealUrl version: 2.2.1

thomaszbz commented 7 years ago

Here's how wget sees it:

wget http://www.jh-moehringen.de/robots.txt
--2017-07-04 16:55:37--  http://www.jh-moehringen.de/robots.txt
Resolving www.jh-moehringen.de (www.jh-moehringen.de)... 46.30.61.6, 2a03:2a00:1200:0:1::3498
Connecting to www.jh-moehringen.de (www.jh-moehringen.de)|46.30.61.6|:80... connected.
HTTP request sent, awaiting response... 307 Temporary Redirect
Location: http://www.jh-moehringen.de/index.php?id=3&type=841133&type=841133 [following]
--2017-07-04 16:55:38--  http://www.jh-moehringen.de/index.php?id=3&type=841133&type=841133
Reusing existing connection to www.jh-moehringen.de:80.
HTTP request sent, awaiting response... 200 OK
Length: 160 [text/plain]

thomaszbz commented 7 years ago

@derBoogie Well, your config looks fine in principle, but it is the config for realurl, not for MetaSEO. The code in question should be somewhere around here: https://github.com/dmitryd/typo3-realurl/blob/952499ff5b21965d7eef8751213981054c2e8fc8/Classes/Encoder/UrlEncoder.php#L1066

What should we do about it on the side of MetaSEO?

MetaSEO does not process that config, so the duplication most likely happens in realurl or somewhere else. The 307 Temporary Redirect is presumably initiated by realurl. All of this goes wrong long before MetaSEO comes into contact with an HTTP request.

@derBoogie If you file (or have already filed) a bug report against realurl, we could add a link here and just wait.

derBoogie commented 7 years ago

@thomaszbz I'm using the standard TypoScript configuration of MetaSEO; nothing else is defined in my custom TypoScript. MetaSEO is generating the URLs http://www.jh-moehringen.de/index.php?id=3&type=841133&type=841133 and http://www.jh-moehringen.de/index.php?id=3&type=841132&type=841132, right? Why should this be a realurl issue?

thomaszbz commented 7 years ago

Mainly because realurl generates these URLs, not MetaSEO. As far as I can see, the configuration itself is right. As I said, I presume this is a bug in realurl.

thomaszbz commented 7 years ago

@derBoogie It seems you have filed a new bug report against realurl.

Just in case: If the config is somehow wrong and needs to be changed, we can fix that in MetaSEO's documentation, of course.

I'll leave this open for some time to let it serve as a bug-watcher.

dmitryd commented 7 years ago

RealURL would not generate anything like http://www.jh-moehringen.de/index.php?id=3&type=841132&type=841132.

thomaszbz commented 7 years ago

@dmitryd Then, where does it come from? The core?

dmitryd commented 7 years ago

I don't know. But it is not a RealURL redirect. Here is what it looks like:

MBP3:~ $ curl -I http://www.jh-moehringen.de/robots.txt
HTTP/1.1 307 Temporary Redirect
Date: Thu, 06 Jul 2017 14:12:43 GMT
Server: Apache
Location: http://www.jh-moehringen.de/index.php?id=3&type=841133&type=841133
Vary: Accept-Encoding
Content-Type: text/html; charset=UTF-8

If it were from RealURL, then Location would never contain any index.php. RealURL redirects only in a few cases and always to speaking URLs.
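
The argument above boils down to a simple check on the Location header: a target whose path still points at index.php is not a speaking URL, so by this reasoning it would not come from RealURL. A minimal sketch of that heuristic follows; `looks_like_speaking_url` is a hypothetical helper, and the rule is taken from this comment, not from RealURL's code.

```python
from urllib.parse import urlsplit


def looks_like_speaking_url(location):
    """Heuristic from the comment above: a speaking URL has no index.php in its path."""
    return "index.php" not in urlsplit(location).path


# Location header from the curl output above.
location = "http://www.jh-moehringen.de/index.php?id=3&type=841133&type=841133"
print(looks_like_speaking_url(location))
```

The check only inspects the path, so it says nothing about which component actually issued the redirect; it merely rules out a redirect to a speaking URL.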

thomaszbz commented 7 years ago

I filed an issue against TYPO3 core.