Closed CaptainCodeman closed 10 months ago
The latest updates on your projects.
Name | Status | Preview | Comments | Updated (UTC) |
---|---|---|---|---|
hn | ✅ Ready (Inspect) | Visit Preview | | Aug 18, 2023 3:20pm |
repl | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Aug 18, 2023 3:20pm |
I probably would have implemented this by adding a `robots.txt` in the `static/` directory. It will save a few bytes of traffic for all our real users and also increase our Lighthouse score, which I believe checks if you have defined a `robots.txt`. Since this is an example site, it's probably worth trying to be a little extra pedantic in following the practices we'd like to promote.
I'm curious, what searches is hn.svelte.dev showing up for where we'd like it to be excluded?
The problem with using robots.txt is that it blocks the search engines from being told to remove entries. For a new site it's the correct approach, but for pages that are already indexed you need to allow the spiders access so they can learn that the entries need to be removed (the robots.txt could be added after they are).
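To make the distinction concrete, here is a minimal TypeScript sketch of the noindex approach, assuming the directive is delivered as an `X-Robots-Tag` response header (a `<meta name="robots" content="noindex">` tag in the page would serve the same purpose). The `withNoindex` helper is hypothetical and not taken from this PR; it relies only on the standard Fetch API `Response` and `Headers` classes.

```typescript
// Hypothetical helper: returns a copy of a response that asks crawlers
// not to index the page. The crawler must still be allowed to fetch the
// page (i.e. not blocked by robots.txt) in order to see this header.
function withNoindex(response: Response): Response {
  // Copy the headers so the original response is left untouched.
  const headers = new Headers(response.headers);
  headers.set('X-Robots-Tag', 'noindex');

  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```

In a SvelteKit app, something like this could be applied to the resolved response inside the `handle` hook in `src/hooks.server.ts`, though that wiring is an assumption here rather than a description of what this PR does.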
The example given was for "svelte fuzzing", which for me also brings up a totally non-Svelte-related page from the hn site.
Looks like there are about 8,250 pages indexed from it.
> The problem with using robots.txt is that it blocks the search engines from being told to remove entries.
wtf! Well, that explains why I haven't gotten svelte.dev/tutorial out of the search results yet (in favor of learn.svelte.dev). I just requested its removal in the Google Search Console, but I suppose I'd need to do the same for Bing, etc., so this is probably still the better solution, since it handles the other search engines too.
As mentioned in Discord, having so many non-Svelte results show up when searching for Svelte-related content can negatively impact people's ability to find the content they are looking for.