rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Add search for external crates #86715

Open GuillaumeGomez opened 3 years ago

GuillaumeGomez commented 3 years ago

In gtk-rs, a lot of crates are provided, and they have been split into 3 repositories: core, gtk3 and gtk4. Both gtk3 and gtk4 link to core using --extern-html-root-url; however, that doesn't make it possible to search for items in core from either of them.

The idea would be to query the search-index.js file from rustdoc front-end. However, it brings the following potential issues:

Another option could be to simply generate the search index for those crates and use the --extern-html-root-url option to allow going to the item from the search result. It would force us to add an option to generate only the search index for a crate, and not its documentation, though.

Out of the two approaches, I think the second option is better.

However, do we want such a feature? Would it be useful to enough users to make this development/maintenance "worth it"? What do you think @rust-lang/rustdoc ?

cc @sdroege

Manishearth commented 3 years ago

The idea would be to query the search-index.js file from rustdoc front-end

I don't quite understand what's being proposed in this one: are you proposing we try and find the file on disk somehow?

Another option could be to simply generate the search index for those crates and use the --extern-html-root-url option to allow going to the item from the search result

I also don't understand what is being proposed here.

GuillaumeGomez commented 3 years ago

Let me try to improve the explanations then:

Suppose you only document your current crate and not its dependencies. If you use --extern-html-root-url, clicking on a type from one of the dependencies in the doc pages links to the correct page. However, you can't "search" for this type because it's not in the search index. It was brought to my attention that it could be useful to be able to search for a type from a dependency whose documentation is available somewhere else.

Does it make more sense? If so I'll update the first comment.

Manishearth commented 3 years ago

Ah, so this is about cases where the entire deptree isn't being documented. This would probably affect docs.rs too (cc @jyn514)

I still don't quite get the solutions being proposed though.

jyn514 commented 3 years ago

cc https://github.com/rust-lang/docs.rs/issues/494

Nemo157 commented 3 years ago

IMO this would have to be at least opt-in, as it would expose implementation details about facade-style crate-trees.

GuillaumeGomez commented 3 years ago

@Nemo157: I think so too. :)

@Manishearth: It's mostly a choice between merging the search indexes (with a little tweak to the external URLs) when generating docs, or trying to get the external search index from a remote location.

Manishearth commented 3 years ago

It's mostly a choice between merging the search indexes (with a little tweak to the external URLs) when generating docs, or trying to get the external search index from a remote location.

Right, I'm unclear as to how you're proposing it

GuillaumeGomez commented 3 years ago

Simply because I'm unclear about it as well. For the first case, I was thinking about adding an option to rustdoc to only generate the search index, but that would also require taking --extern-html-root-url into account when generating docs for the crate (for the items from the external crates in the search index).
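To make the "merge search indexes with a little tweak in the external URLs" idea more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Entry` shape is hypothetical and is not rustdoc's actual search-index.js format, and `merge_indexes`, `search` and `resolve_href` are made-up names. The only thing taken from the discussion is that a per-crate root URL (as passed through --extern-html-root-url) decides where a search result for an external item should link.

```rust
use std::collections::HashMap;

/// One searchable item. Hypothetical shape, NOT rustdoc's real index format.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    /// Crate the item belongs to.
    krate: String,
    /// Page path relative to a documentation root, e.g. "glib/struct.Object.html".
    path: String,
    /// Item name that queries are matched against.
    name: String,
}

/// Append the entries of externally generated indexes to the local one.
fn merge_indexes(local: Vec<Entry>, externals: Vec<Vec<Entry>>) -> Vec<Entry> {
    let mut merged = local;
    for index in externals {
        merged.extend(index);
    }
    merged
}

/// Naive linear search: keep every entry whose name contains the query.
fn search<'a>(index: &'a [Entry], query: &str) -> Vec<&'a Entry> {
    index.iter().filter(|e| e.name.contains(query)).collect()
}

/// Build the link for a search result. Crates listed in `extern_roots`
/// (crate name -> root URL) link to the remote site; every other crate
/// keeps its path relative to the local documentation root.
fn resolve_href(entry: &Entry, extern_roots: &HashMap<String, String>) -> String {
    match extern_roots.get(&entry.krate) {
        Some(root) => format!("{}/{}", root.trim_end_matches('/'), entry.path),
        None => entry.path.clone(),
    }
}
```

With a root such as `glib=https://example.org/docs/` (a placeholder URL), a hit for `Object` in `glib` would link to the remote page, while hits in the crate being documented stay local.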

Manishearth commented 3 years ago

Right, you've proposed two solutions for it and haven't picked one yet, but I don't understand what those solutions are.

jnordwick commented 2 years ago

I was the one who made this request originally, I think. Please remember that I might not know what I really want. I tend to know what I want in dev topics I'm familiar with, but Rust is still a convoluted mess in my head.

This request comes from two things: 1- For a returned value or anything exposed by the crate, it can be very difficult and time-consuming for someone learning, or not already very familiar with Rust, to find where a random method call is documented or even where it comes from (they are often not even aware that a returned type isn't in that project, and the calls just seem to come from nowhere). 2- Crates being broken up now makes it difficult to look at the API (the documentation of all of them at once, as if they were one single crate).

(small aside: a Rust source code cross-referencer that links the method being called to the proper definition would help tremendously).

But remember, I'm wildly new and wildly don't understand the problems I do come across.

These sort of named but

Manishearth commented 2 years ago

So the thing is, multi-crate search does work already, I think the problem may be in y'alls setup? For example, see https://unicode-org.github.io/icu4x-docs/doc/icu/index.html , you can search any of the in-workspace crates. We generate this with cargo doc at the workspace root.

GuillaumeGomez commented 2 years ago

There is also the case where you have generated docs for the crates in your workspace and not for their dependencies but still want to be able to look through them when running a search. Then when a search result for an "outside" crate shows up, the link will target the website where this doc is.

jnordwick commented 2 years ago

I don't generate documentation locally. I didn't realize people did that, or how search and serving the pages work in that case. I just use what is online and usually linked from a project's homepage or docs.rs. It doesn't really work there, it seems. For example, if I go to the tungstenite docs: https://docs.rs/tungstenite/0.17.2/tungstenite/?search=write_all and search for write_all nothing shows up. That call is used all over the examples and documentation, but I have no way to find it unless I already know where it is.

Manishearth commented 2 years ago

Yeah, that's a deliberate design choice in docs.rs, otherwise your search space would be extremely polluted and search would be much slower.

jyn514 commented 2 years ago

Yeah, that's a deliberate design choice in docs.rs, otherwise your search space would be extremely polluted and search would be much slower.

I don't think this is true, it's just that no one has come up with a good way to use the search index from another crate. (I think Guillaume mentioned just concatenating the scripts at some point? but we need some way of doing that that doesn't involve 300 fetches from S3.)

Manishearth commented 2 years ago

I don't think this is true, it's just that no one has come up with a good way to use the search index from another crate. (I think Guillaume mentioned just concatenating the scripts at some point? but we need some way of doing that that doesn't involve 300 fetches from S3.)

I vaguely recall discussions ages ago about this, but I may be misremembering.

Perhaps it should be an option during search, but I really don't think this should be the default.

jsha commented 2 years ago

@jnordwick said:

For example, if I go to the tungstenite docs: https://docs.rs/tungstenite/0.17.2/tungstenite/?search=write_all and search for write_all nothing shows up.

This is a useful example, since you were probably looking for std::io::Write::write_all from the stdlib docs. We've definitely talked before about making docs.rs search also search in stdlib. I can't recall exactly where (Zulip?). But I think it's a great idea, and if we're looking at expanding what's searchable on docs.rs, mixing in stdlib should be our first priority.

your search space would be extremely polluted and search would be much slower.

It's true that including significantly more crates would make search slow. The current search implementation iterates linearly over every item it knows about. The stdlib search is about 100ms on my very fast modern laptop.

It would be really fun / interesting to work on making search dramatically faster, for instance using a trie and a more efficient storage format (to speed up download time). A big project! And there's a lot of built-up functionality in current search, and fuzzy matching that would be hard to make more efficient.
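As a rough illustration of the trie idea, here is a minimal sketch of a prefix lookup over item names. It is an assumption-laden toy, not how rustdoc's search works: `Trie`, `insert` and `with_prefix` are hypothetical names, items are represented by plain numeric ids, and it only handles exact prefixes, so it says nothing about the fuzzy matching mentioned above.

```rust
use std::collections::BTreeMap;

/// A node in a character trie. `items` holds the ids of the items whose
/// name ends exactly at this node.
#[derive(Default)]
struct TrieNode {
    children: BTreeMap<char, TrieNode>,
    items: Vec<usize>,
}

/// Hypothetical prefix index over item names.
#[derive(Default)]
struct Trie {
    root: TrieNode,
}

impl Trie {
    /// Register `name` for the item with the given id.
    fn insert(&mut self, name: &str, id: usize) {
        let mut node = &mut self.root;
        for ch in name.chars() {
            node = node.children.entry(ch).or_default();
        }
        node.items.push(id);
    }

    /// Return the ids of all items whose name starts with `prefix`,
    /// sorted ascending. Walking down the prefix costs time proportional
    /// to its length; only the matching subtree is then visited.
    fn with_prefix(&self, prefix: &str) -> Vec<usize> {
        let mut node = &self.root;
        for ch in prefix.chars() {
            match node.children.get(&ch) {
                Some(next) => node = next,
                None => return Vec::new(),
            }
        }
        let mut found = Vec::new();
        let mut stack = vec![node];
        while let Some(current) = stack.pop() {
            found.extend(current.items.iter().copied());
            stack.extend(current.children.values());
        }
        found.sort_unstable();
        found
    }
}
```

The point of the design is that a query only touches the part of the index sharing its prefix, instead of iterating over every known item.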

no one has come up with a good way to use the search index from another crate

I think we could just load the search-index.js (with a little tweaking to search.js). As you mention, if we did this on an individual basis it would be a lot of fetches, but that's surmountable. I think a bigger issue is that it would require us to stabilize the search-index.js format. It's been de facto stable for a long time, but it's nice to have the flexibility to change it.

Another option would be for docs.rs to do what local cargo doc runs do, and build all of a crate's dependencies into a single big index. I'm guessing that's way too expensive though.

jyn514 commented 2 years ago

Another option would be for docs.rs to do what local cargo doc runs do, and build all of a crate's dependencies into a single big index. I'm guessing that's way too expensive though.

If rustdoc just built the search index for dependencies, that would probably not be too bad. But currently, it's inseparable from "generate all documentation for this crate", which would be much more expensive (just from io!) and use up a lot of our instance storage on temporary files that immediately get deleted.

jnordwick commented 2 years ago

This is a useful example, since you were probably looking for std::io::Write::write_all from the stdlib docs.

Well, I was looking for it because the example I was looking at called it. It has been a while, but I think it might have been on an async-related trait that was implicit, and I didn't know where to look.

This whole issue exists because lexical analysis of code by humans isn't really possible in Rust, and an IDE doesn't always help (having to use an IDE to get a list of methods you can call on a value is pretty lame). Rust is a complex, big language and still growing rapidly. The doc and info tools are stuck in a sort of simplistic mindset, as if they were operating on C code. It would make the language much easier to learn if there was a way to list all methods of a value or all types it could be converted to. I don't think it is possible to do either of these.

GuillaumeGomez commented 2 years ago

The idea I had was to load search-index.js from other crates to enable search in dependencies. It would require adding a new rustdoc parameter (or attribute) that gives a relative path, so the JS knows where to look it up.

Nemo157 commented 2 years ago

There's still also the question of which dependencies need to be loaded. One option would be all (recursively) public dependencies (once public/private deps are stabilized...); that way you have all the types which might potentially appear in the public API available. For the given tungstenite usecase, that was based on code from the examples, which would require loading all dev dependencies as commonly examples will need to use additional helpers that aren't part of the standard deps.
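The "all (recursively) public dependencies" option amounts to a transitive closure over the public edges of the dependency graph. A minimal sketch, assuming the graph is already available as a map from crate name to its direct dependencies; the `Dep` type, the `public` flag and `public_closure` are hypothetical names for illustration and do not correspond to any existing cargo or rustdoc API.

```rust
use std::collections::{BTreeSet, HashMap};

/// A direct dependency edge. `public` stands in for the public/private
/// dependency distinction, which is assumed to be known here.
struct Dep {
    name: String,
    public: bool,
}

/// Collect every crate reachable from `root` through public edges only.
/// Private dependencies, and anything reachable solely through them,
/// are left out.
fn public_closure(graph: &HashMap<String, Vec<Dep>>, root: &str) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root.to_string()];
    while let Some(krate) = stack.pop() {
        for dep in graph.get(&krate).into_iter().flatten() {
            // `insert` returns false for crates already visited, which
            // also keeps the walk from looping on cyclic input.
            if dep.public && seen.insert(dep.name.clone()) {
                stack.push(dep.name.clone());
            }
        }
    }
    seen
}
```

Covering code from examples, as in the tungstenite case, would mean seeding the walk with the dev dependencies as well, which this sketch does not attempt.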

jnordwick commented 2 years ago

When looking through some code, and you see a method and need to look it up to see what it does or how it is used, but there isn't really a definitive way of getting a reliable answer, it becomes frustrating and a huge time suck. It knocks you out of your concentration and makes the language very unpleasant.

You can search through all the documentation for the method name, but you will often find a couple of potential matches. Or you won't find anything, because you don't know which package's documentation to search through. You can try to install the package and its dependencies and see what the IDE says, but that is a lot of work, and sometimes even that is difficult (someone the other day couldn't even run the examples in one package I was looking at; he created an issue because there were no directions on the cargo file incantations needed to get the correct compilation set and dependencies downloaded). (This seems to be the norm rather than the exception for Rust packages right now.)

I know most of the Rust fan base doesn't seem to think this is a problem, but for everyone I know trying to learn the language - There are dozens of us... DOZENS! - this is often a yield point to switch tasks.

The Rust documentation reminds me of the CMake docs. If you know CMake, it's great, but then you don't need it very much. If you don't know it, you don't even know where to look, and even when you do, it doesn't really answer the question you had.

Manishearth commented 2 years ago

It sounds like this is an indicator that perhaps trait methods should be indexed when implemented.

Probably deprioritized somehow, perhaps as a single entry in the search that takes you to an "implementors in this crate" page.
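One way to picture this, as a minimal sketch only: group trait impls so that each trait method yields a single search entry that carries its implementors and a lower rank. All names here (`TraitImpl`, `MethodEntry`, `index_trait_methods`, the `rank` field and its value) are hypothetical and do not reflect rustdoc's actual index or ranking.

```rust
use std::collections::BTreeMap;

/// A trait implementation found in the crate being documented.
struct TraitImpl {
    trait_path: String,
    methods: Vec<String>,
    implementor: String,
}

/// A single search entry for a trait method, meant to point at an
/// "implementors in this crate" page rather than at each impl.
#[derive(Debug, PartialEq)]
struct MethodEntry {
    name: String,
    trait_path: String,
    implementors: Vec<String>,
    /// Larger means shown later; inherent items are assumed to use 0.
    rank: u8,
}

/// Assumed rank for trait-method entries, so they sort below inherent items.
const TRAIT_METHOD_RANK: u8 = 10;

/// Produce one entry per (method, trait) pair, however many types
/// implement the trait.
fn index_trait_methods(impls: &[TraitImpl]) -> Vec<MethodEntry> {
    let mut grouped: BTreeMap<(String, String), Vec<String>> = BTreeMap::new();
    for imp in impls {
        for method in &imp.methods {
            grouped
                .entry((method.clone(), imp.trait_path.clone()))
                .or_default()
                .push(imp.implementor.clone());
        }
    }
    grouped
        .into_iter()
        .map(|((name, trait_path), implementors)| MethodEntry {
            name,
            trait_path,
            implementors,
            rank: TRAIT_METHOD_RANK,
        })
        .collect()
}
```

Under this scheme, a search for write_all in a crate with two types implementing std::io::Write would return one entry listing both types, instead of nothing.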