Open KreativeKrise opened 8 years ago
Which version of metaseo, TYPO3 and PHP are you using? Do you have realurl enabled and which version do you use?
I get the idea: These parameters can lead to duplicate content (multiple URLs with the same content). You can also exclude these parameters in, e.g., Google Search Console and Piwik analytics (although the latter does not use metaseo's sitemap).
How exactly do you create these get parameters? Using an extension? Does TYPO3 know something about these parameters or do you just add them outside of the scope of TYPO3?
I need to reproduce this issue first, to see if it really happens and if it creates new entries with 2.0.0. Did you try this out already?
What happens with your canonical tag when you use such parameters?
Metaseo 1.0.8 TYPO3 6.2.25 PHP 5.4.41
Realurl is enabled.
These parameters are created by an external newsletter service. It adds the tracking parameters for Google Analytics to the links inside the newsletter.
I will try version 2.0.0 on Monday to see if it fixes the problem! I also have to check the canonical tag on Monday, because I don't use that functionality of metaseo. So I will respond on Monday!
Have a great weekend :)
I'm not sure if 2.0.0 fixes the problem, but we fixed many many bugs since 1.0.8. 1.0.x is not supported any more (We still support 6.2 with 2.0.0).
But there's one regression which is not yet fixed in 2.0.0: #233 in case you have a full-blown hoster's environment, together with TYPO3 6.2. A fix is available in metaseo's develop branch, however.
I tried 2.0.0 and URLs with "wrong" GET-parameter generate an entry inside the sitemap.xml.
An option similar to metaTags.canonicalUrl.strict = 1 for the sitemap.xml would be great. Even better would be a whitelist for GET-parameters :)
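To illustrate what such a whitelist could do, here is a minimal sketch in Python (not metaseo code, which is PHP). The function name `strip_unknown_params` and the contents of `ALLOWED_PARAMS` are assumptions made up for this example; the idea is only that a URL is reduced to its whitelisted GET parameters before it is considered for the sitemap.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical whitelist: the real set of allowed parameters depends on the site.
ALLOWED_PARAMS = {"id", "L", "type"}


def strip_unknown_params(url, allowed=ALLOWED_PARAMS):
    """Return url with every query parameter not in `allowed` removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in allowed
    ]
    # The fragment is dropped as well, since it is irrelevant for a sitemap entry.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_unknown_params("http://www.example.com/some-page.html?thisisnew=1"))
print(strip_unknown_params("http://www.example.com/some-page.html?L=1&utm_source=newsletter"))
```

With this approach, a URL that differs from a known page only in unknown parameters maps back to the known page, so it would not create an extra entry.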
What happens if you use a parameter which is new to TYPO3 (not configured somewhere)?
E.g. http://www.example.com/some-page.html?thisisnew=1
If I add a parameter like ?thisisnew=1 then it generates a new sitemap entry.
The canonical tag is fine, as long as I use metaTags.canonicalUrl.strict = 1.
I've been able to reproduce this issue in the meantime, at least with parameters which are new to TYPO3. The entries originate from a live instance with metaseo 2.0.0, TYPO3 >= 6.2.26, before upgrading to 6.2.27
This is misbehaviour in respect to SEO. Maybe there is a way to mitigate this via configuration. If not, it could still be considered buggy somehow, while supporting external parameters still might be a feature which is new to metaseo.
Related: #268 potentially (partly) fixes this issue. We need to see if TYPO3 6.2.27+/7.6.11+ still have this issue.
I retried this with 6.2.29, meaning that #268 was applied (including configuration change). With the security-patched versions of TYPO3 I don't get new entries when I request pages with parameters like /?pk_campaign=abc.
If someone else still is affected by this issue, please
Basically, that means we do well for
For the moment, I think this bug is fixed via the core (#268), so I am closing this issue accordingly. If someone is still affected with #268 applied, please open a new issue and reference back to this one.
Please reopen this issue, as this problem still exists for me.
TYPO3 7.6.15 metaseo 2.0.3 realurl 2.1.5 php 5.6.27
cHashIncludePageId is set to 1
I already have parameters like "gclid", which comes from google adwords in the sitemap.xml. I also tested with other parameters (?parameter=test). Same problem.
What's wrong? Did I miss setting up some options? Or is there still a bug?
@sebastianschrama Thanks for reporting back. The versions you use should be fine.
My test was against TYPO3 6.2.29. Considering that you still face this issue, closing this issue should indeed go along with a test against 7.6.15+.
It's unlikely I can do something before March 2017 (limited time). Would be great if someone could track this issue further down in the meantime.
To answer your questions:
It's a bug if parameters unknown to TYPO3 make it to the sitemap again. If these entries also blow up TYPO3's caches (unlimited), it could even be a security issue in the core (exploitable by arbitrary requests, allows for (D)DOS).
Would be great if you could ensure the correctness of your test case, once again. Just to be sure.
If someone else still is affected by this issue (especially 6.2 users), I'd be glad to hear from you.
Thanks for the quick response. I followed your instructions, but unfortunately the sitemap still adds unknown parameters in urls.
Maybe @benf can help out? @benf I already took into account that your patch set could eventually be useful for this issue.
The cHashIncludePageId is definitely set to 1 in my case.
And the user here (https://github.com/mblaschke/TYPO3-metaseo/issues/306#issuecomment-272071718) talks about external links and untranslated pages.
I am talking about unknown parameters. :)
I know (in respect to #306). Deleted my previous message long before you answered ;-)
Any news here?
Would be great if someone could provide a patch or track this down at least (=> "help wanted" tag). I still don't have a lot of time currently.
Moderating this issue, I moved this comment to #505.
@thomaszbz
@DanPii That sounds cooluri related. I moved it to a separate ticket #505.
@thomaszbz I think only the URLs containing ADMCMD_cooluri=1 are cooluri relevant.
I am not using typo3 extensions for FB or Google Ads, but facebook javascript on some pages, so Typo3 will not know about these parameters.
Also, I just saw in the documentation that there is a Typoscript Node called: plugin.metaseo.sitemap.index.blacklist Could I use that with regular expressions to put the relevant links on a blacklist?
Thanks Daniel
@DanPii Well, for the case it's not CoolUri related, then it just duplicates this issue.
You can blacklist these entries in the sitemap view of metaseo at least.
Blacklisting with the sitemap view of metaseo only works after they have been already added. Unfortunately there were a lot. Is it possible via Typoscript and this Node "plugin.metaseo.sitemap.index.blacklist" to avoid them being added in the first place?
@DanPii Currently, I'm not aware of a workaround for this issue. If you find a workaround for this issue, please post it here. Maybe we can use it for the fix as well.
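For discussion, the kind of check asked about above (rejecting a URL by regular expression before it is added, instead of blacklisting it afterwards) could look roughly like the following Python sketch. This is not how metaseo or plugin.metaseo.sitemap.index.blacklist is confirmed to work; `BLACKLIST_PATTERNS` and `should_index` are hypothetical names, and the patterns are only examples taken from parameters mentioned in this thread.

```python
import re

# Hypothetical patterns, matched against the full URL including its query string.
BLACKLIST_PATTERNS = [
    re.compile(r"[?&]utm_[a-z]+="),
    re.compile(r"[?&]gclid="),
    re.compile(r"[?&]ADMCMD_cooluri="),
]


def should_index(url, patterns=BLACKLIST_PATTERNS):
    """Return False if any blacklist pattern matches the URL."""
    return not any(pattern.search(url) for pattern in patterns)


for candidate in (
    "http://www.example.com/some-page.html",
    "http://www.example.com/some-page.html?gclid=abc",
):
    print(candidate, should_index(candidate))
```

A guard like this would have to run at the point where the entry is created, otherwise the unwanted URLs still end up in the sitemap table first.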
Any news on the subject? The issue is still present with TYPO3 8.7.27 & metaseo 3.0.0. I think this is related to the URL fetching in htdocs/typo3conf/ext/metaseo/Classes/Utility/FrontendUtility.php:193:
$tsfeAnchorPrefix = Typo3GeneralUtility::getIndpEnv('TYPO3_SITE_SCRIPT');
If a client calls the TYPO3 instance without a proper user agent etc., the metaseo logic might be circumvented and the GET parameters stay in the URL when it is added to the sitemap (TYPO3_SITE_SCRIPT might not always be set).
Would it be possible to add a constant where we can set a white- or blacklist for GET parameters? Often there are URLs with tracking parameters like "utm_source".
It would be nice to exclude URLs with such parameters from the sitemap generation.
Or is that already possible? I didn't find any information about it.
Thank you in advance!
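As a rough illustration of the blacklist variant, the following Python sketch collapses a list of collected URLs by removing tracking parameters and dropping the resulting duplicates. It is only a sketch under assumptions: `TRACKING_NAMES`, `TRACKING_PREFIXES` and `dedupe_sitemap_urls` are made-up names, and the parameter list just reuses examples from this thread (utm_source, gclid, pk_campaign).

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical blacklist of tracking parameters, by exact name and by prefix.
TRACKING_NAMES = {"gclid", "pk_campaign"}
TRACKING_PREFIXES = ("utm_",)


def is_tracking(name):
    """Return True if the parameter name is considered a tracking parameter."""
    return name in TRACKING_NAMES or name.startswith(TRACKING_PREFIXES)


def dedupe_sitemap_urls(urls):
    """Strip tracking parameters and keep the first occurrence of each cleaned URL."""
    seen = set()
    result = []
    for url in urls:
        parts = urlsplit(url)
        query = urlencode(
            [
                (name, value)
                for name, value in parse_qsl(parts.query, keep_blank_values=True)
                if not is_tracking(name)
            ]
        )
        clean = urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
        if clean not in seen:
            seen.add(clean)
            result.append(clean)
    return result


print(dedupe_sitemap_urls([
    "http://www.example.com/a.html",
    "http://www.example.com/a.html?utm_source=newsletter",
    "http://www.example.com/a.html?gclid=x&L=1",
]))
```

Compared to a whitelist, a blacklist like this leaves parameters it does not know about untouched, so it only helps against tracking parameters that have been listed explicitly.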