Closed finestructure closed 2 years ago
Might be relevant that it was the iPad, although on he Mac I usually Cmd-L, Cmd-C, so it might not be apparent there even if the same bug exists.
I can reproduce this and it's very strange.
When shared from the iPad, other search sites include the query string so it's not a fundamental problem with iPad sharing.
However, as you say, if you paste in a search https://swiftpackageindex.com/search?query=toml to the address bar on any platform, it renders the full results so I don't think it's anything with JavaScript or anything that's happening after the page starts to load.
I don't even know where to start looking for this problem.
It's been a while since I've dove into this stuff, but I believe Safari uses the canonical link which we set via a meta tag in the header.
<link rel="canonical" href="https://swiftpackageindex.com/search">
😱 What?!
So we need to remove the canonical meta tag for the search page and any other page where a query string is important? Maybe we should only include it on pages we know to be canonical, like package pages and the home page.
Done a bit of StackOverflow diving and looks like my memory was good.
Confirmation from Apple employee: https://twitter.com/rmondello/status/910276001300488192 Confirmation from community: https://developer.apple.com/forums/thread/686701 Random hack to resolve the issue: https://stackoverflow.com/a/58787618/17582231
I was Googling too and this is a good document on how Google uses this (which is why we added it).
https://developers.google.com/search/docs/advanced/crawling/consolidate-duplicate-urls
Random hack to resolve the issue: https://stackoverflow.com/a/58787618/17582231
While this is a bit 🤮 , it would work well and also keep the metadata there for Google. The best alternative would be to remove that meta tag for the search page.
Edit: Reading the comments on that hack, I don't think we should use that. It's easy to forget that Google executes JavaScript and acts like a normal browser these days.
Aye it makes sense to want to have the meta tag there, especially for the search page as to not clutter what Google is tracking.
The comments on that hacky workaround don't seem too positive though given that we know Google waits a small while before parsing the HTML, it's plausible that by the time it's scraped that the script to remove the tag is gone.
The comments on that hacky workaround don't seem too positive though given that we know Google waits a small while before parsing the HTML, it's plausible that by the time it's scraped that the script to remove the tag is gone.
Heh. I edited my reply after reading the comments too 😅
This is really weird behaviour. Safari really shouldn't do this.
Safari really shouldn't do this.
Put that on a T-Shirt. SPI merch.
Is Google really not stripping out query parameters when determining canonical URLs? I don't have any experience with this at all but it struck me that all the duplicate URL examples seem to refer to duplicate locations in the URL part (i.e. before ?
). Just wondering if we'd incur duplicates if we didn't set canonical URL on the search page.
In particular, in the dynamic URLs
section, the examples are
- https://www.example.com/products?category=dresses&color=green
- https://example.com/dresses/cocktail?gclid=ABCD
- https://www.example.com/dresses/green/greendress.html
I wonder if it's significant that they're not
- https://example.com/products?category=dresses&color=green
- https://example.com/products?gclid=ABCD
- https://example.com/products
Not detracting from Sven's point above which is likely very fair, but if we do need to keep it - we could probably move the link to a HTTP header rather than a HTML link.
Untested, but this might not be picked up by Safari and might work as expected. Big ifs, and as per Sven's point, might not even be necessary.
Do you mean to normalise the domain name? That's one use of it and we use it to make sure Google chooses https over http and without the www. Not that we even serve on http, so that's fairly academic.
I think what they're trying to demonstrate there is that you could inform Google that you'd prefer https://www.example.com/products/dresses/green
over https://www.example.com/products?category=dresses&color=green
and that they contain the same content.
We could include the query string in the canonical version. If we do that globally then we could make something that works for all pages (including future pages) without needing to remove the tag.
I feel like actively including the query tag in the canonical link is more dangerous than just not setting it at all.
While we encourage you to use any of these methods, none of them are required. If you don't indicate a canonical URL, we'll identify what we think is the best version or URL.
is more dangerous than just not setting it at all.
Why? The canonical version of a search for TOML is https://swiftpackageindex.com/search?query=toml
We don't use query parameters frivolously. I may be missing something but I can't see any danger in including them.
https://webmasters.stackexchange.com/a/114624
I take back my comment, looks like the recommendation is that if the query parameter changes the content then it should be included. If it's just for analytics (like UTM parameters) then it shouldn't be included.
It's worth noting that if we did this, it wouldn't ever include query parameters that other people included. For example, incoming links from iOS Dev Weekly include UTM parameters:
https://swiftpackageindex.com/pointfreeco/swift-custom-dump?utm_campaign=iOS%2BDev%2BWeekly&utm_medium=web&utm_source=iOS%2BDev%2BWeekly%2BIssue%2B541
We'd never include those parameters, as they're not necessary and not generated by us.
Just for my understanding, if we set the canonical version to include the query parameters (on search only, right?) then it would mean from Google's perspective (and unless they strip query parameters and aggregate on the base URL regardless) these are all distinct pages, leading to a lower ranking of "the bare search URL". I.e. there's no longer a single bucket of "bare search URL" that gets all these hits.
I don't know if that's a bad thing but that's what would happen, right? Is there a scenario where this is a bad thing?
That was my initial thinking Sven, by explicitly adding the query parameters we're telling Google they're all distinct pages which while technically accurate and according to the posts above seems to be 'expected' - it does devalue other pages.
Though to be fair, search doesn't come up at all in Google even if I search for "swift package index search" so not sure it's much of a loss 😂
Though to be fair, search doesn't come up at all in Google even if I search for "swift package index search" so not sure it's much of a loss 😂
Yeah, I'm not sure there are many places out there linking to a search result page let alone suffer from being in their "own bucket" 🤷
and unless they strip query parameters and aggregate on the base URL regardless
They don't. Including query parameters here is a valid thing to do.
I don't know if that's a bad thing but that's what would happen, right? Is there a scenario where this is a bad thing?
I can't think of one. In fact, I can see it as a good thing for two reasons:
/search
, that's a good thing! We don't want to send people there. I'd much rather people land on the home page than on /search
. There are a couple of other things to note here too:
/search
with or without query strings. They don't see when people actually search, only when people include links to searches on a page that Google indexes. It's possible that they do see what people search for through watching Chrome URLs, which is why I'm including this point last. 😬https://swiftpackageindex.com/search?query=toml
and https://swiftpackageindex.com/search?query=TOML
should have the same canonical.~There might be another, related bug here. I'd shared a link to the Swift forums from my iPad and knowing that the search query would be stripped, I manually copied the url from Safari's url field and pasted it into the message. That url encoded the + and :. Karl spotted and fixed it.~
~Not sure if that's just a Safari or Swift forum problem but hopefully fixing the sharing could step around that.~
This is actually a Swift forums issue.
Just got bitten by this again, this time on the phone. This time I had a really weird effect, perhaps due to Safari's tab groups or search history but the pasted url had a different search query parameter. I had searched for openapi
but the pasted search was for composable
.
I searched for “toml” on the iPad, tapped he share icon and selected “Copy link”. The resulting url was
https://swiftpackageindex.com/search
i.e. it dropped the search term and therefore wouldn’t link to the actual results.