mrdoob / three.js

JavaScript 3D Library.
https://threejs.org/
MIT License

Docs: Facilitate indexing of the docs by certain search engines #19031

Closed finnbear closed 4 years ago

finnbear commented 4 years ago

Description of the problem

When I want to look at the docs for a particular Three.js class, I type 'three.js [classname]' into my search engine of choice (DuckDuckGo). The only relevant result is the docs home page, not the page for the specific class. [image] I then have to open the docs and search for the class again.

Feature request

If possible, facilitate indexing of all three.js docs pages. This may mean switching from '/docs/#api/etc' URLs to '/docs/api/etc' URLs, which could replace or forward to the existing pages.

Search engine

Mugen87 commented 4 years ago

Well, yes. When using my search engine of choice, it looks as expected:

[image]

When clicking on the search result, you directly land on the Matrix4 page.

Sorry, but I clearly can't vote for making the proposed change and breaking all existing URLs just because search engines can't do their job right.

donmccurdy commented 4 years ago

Does DuckDuckGo support hashbangs, or can we not use the URL fragment at all?

This isn't necessarily a large change, it might just mean adding a simple redirect.

Mugen87 commented 4 years ago

> This isn't necessarily a large change, it might just mean adding a simple redirect.

What would this look like? Can you configure redirects with GitHub Pages?

donmccurdy commented 4 years ago

It used to be the case that #! was recommended with Google Search, and # basically wasn't supported. Redirecting from one to the other is just a JavaScript change.

Now it doesn't matter with Google Search. I don't know about DuckDuckGo. But you're right, if the hashbang thing doesn't work and we need to use real /docs/foo paths, we can't do this on GitHub Pages.
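For illustration only, the "just a JavaScript change" mentioned above could look roughly like the sketch below. It is not taken from the three.js docs code: `toHashbang` is a hypothetical helper, and the docs page would also have to accept `#!` when it reads the fragment.

```javascript
// Hypothetical helper: map a plain fragment ("#api/...") to a hashbang ("#!api/...").
function toHashbang(hash) {
  if (!hash || hash === '#') return hash;   // no fragment, nothing to rewrite
  if (hash.startsWith('#!')) return hash;   // already a hashbang
  return '#!' + hash.slice(1);
}

// Browser-only part: rewrite the current URL without adding a history entry.
if (typeof window !== 'undefined') {
  const next = toHashbang(window.location.hash);
  if (next !== window.location.hash) {
    window.location.replace(
      window.location.pathname + window.location.search + next
    );
  }
}
```

Because only the fragment changes, this runs entirely on the client, which is why it would not need any server-side redirect support from GitHub Pages.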

finnbear commented 4 years ago

Something I probably should have mentioned is that you already have URLs without the '#' that can be found with a slightly longer search term. [image]

However, their title of "[name]" makes them difficult to see at first glance. [image]

If you just fixed these pages to have the proper title, maybe DuckDuckGo and the other search engines I mentioned would index them more readily.

This is a related issue that may explain the mentioned pages: https://github.com/mrdoob/three.js/issues/10937
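As a rough sketch of what "fixing the title" could mean, the snippet below fills the "[name]" placeholder ahead of time so that a crawler which does not run JavaScript still sees the real class name. Both function names are hypothetical, and the sketch assumes the page source contains the literal text "[name]"; it is not the logic used by the docs' own page.js.

```javascript
// Hypothetical helper: derive a page name from a docs path,
// e.g. "api/en/math/Matrix4.html" gives "Matrix4".
function pageNameFromPath(path) {
  const file = path.split('/').pop() || '';
  return file.replace(/\.html$/, '');
}

// Replace every literal "[name]" placeholder in the page source.
// split/join is used so the name is never treated as a replacement pattern.
function fillNamePlaceholder(html, path) {
  return html.split('[name]').join(pageNameFromPath(path));
}

const source = '<title>[name]</title><h1>[name]</h1>';
console.log(fillNamePlaceholder(source, 'api/en/math/Matrix4.html'));
```

Running this over each file as a build step would keep the names in sync with the file names, at the cost of adding a generated artifact to maintain.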

mrdoob commented 4 years ago

So their crawler doesn't support javascript?

finnbear commented 4 years ago

I can't speak to what DuckDuckGo or any other engine does internally, and I'm not suggesting you break any links or do the impossible. It seems like it would be possible to fix the missing [name]. https://github.com/mrdoob/three.js/blob/3c13d929f8d9a02c89f010a487e73ff0e57437c4/docs/api/en/math/Matrix4.html#L11

@mrdoob If you would be open to a PR, I would be willing to go through and add a <title> to each html file or replace the [name] (which is currently templated in by https://github.com/mrdoob/three.js/blob/dev/docs/page.js#L59)

Mugen87 commented 4 years ago

Have you considered reporting this issue to DuckDuckGo first? Fixing the issue on their side seems more appropriate. It would then work for all websites using the current docs approach.

mrdoob commented 4 years ago

> @mrdoob If you would be open to a PR, I would be willing to go through and add a <title> to each html file

Sure! As long as you are willing to continue updating all the relevant files for the rest of your life 😁

More seriously, this is an issue with DuckDuckGo's crawler.

finnbear commented 4 years ago

What was I thinking? There is a much easier, standards-compliant way to fix this issue: adding a sitemap. Please review #19037 :smiley:

(And yes, I contacted DuckDuckGo. Their automated system said they had a high volume of messages and would take time to respond. And this is a problem with the other search engines I mentioned, not just DuckDuckGo)

looeee commented 4 years ago

It's the same on Bing:

[image: bing] https://user-images.githubusercontent.com/5307958/78518756-8fb14200-77eb-11ea-9368-4fea7225a045.png

Maybe DDG uses Bing?

finnbear commented 4 years ago

> Maybe DDG uses Bing?

Well, technically yes.

> DuckDuckGo's results are a compilation of "over 400" sources, including Yahoo! Search BOSS, Wolfram Alpha, Bing, Yandex, its own web crawler (the DuckDuckBot) and others.

From: https://en.wikipedia.org/wiki/DuckDuckGo

I don't think either Bing or DuckDuckGo is to blame though. By using Google Analytics, you may be giving Google an advantage when it comes to indexing the page as the user sees it :wink:

Having a proper sitemap is a good first step for any search engine, hence the PR. By having the sitemap point to the URLs with the '#', and thus the proper title, instead of the raw html pages, the [name] issue may be fixed.

In the unlikely event that it doesn't fix the [name] issue, some sort of <title> would have to be baked into the HTML. I'm still willing to do that, unless the resulting PR would be rejected if it lacked a mechanism to keep the titles 100% up to date (by contrast, this PR includes code to autogenerate the sitemap)

Mugen87 commented 4 years ago

> More seriously, this is an issue with DuckDuckGo's crawler.

Yes, I don't think a PR makes sense in this context. Hence, a sitemap is also not necessary.

Again, this needs to be fixed on the search engine's side.

Mugen87 commented 4 years ago

Closing, see https://github.com/mrdoob/three.js/issues/19031#issuecomment-609000135.
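For readers wondering what the autogenerated sitemap proposed in the thread might involve, here is a minimal sketch. It is not the code from the PR: `buildSitemap` and `escapeXml` are hypothetical names, and the base URL and page list are illustrative assumptions. The thread does not establish whether any given crawler honours fragment ('#') URLs listed in a sitemap.

```javascript
// Escape characters that are not allowed raw inside XML text.
function escapeXml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a sitemap document (sitemaps.org 0.9 schema) from a base URL
// and a list of page paths or fragments.
function buildSitemap(baseUrl, pages) {
  const entries = pages
    .map((page) => `  <url><loc>${escapeXml(baseUrl + page)}</loc></url>`)
    .join('\n');
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries +
    '\n</urlset>\n'
  );
}

// Illustrative input only: the real page list would come from the docs index.
const xml = buildSitemap('https://threejs.org/docs/', [
  '#api/en/math/Matrix4',
  '#api/en/math/Vector3',
]);
console.log(xml);
```

Generating the file from the same list that drives the docs navigation is what would keep it up to date without manual edits.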