BurntSushi / memchr

Optimized string search routines for Rust.
The Unlicense
873 stars 99 forks source link

add documentation for internal code structure and organization #150

Open betelgeuse opened 5 months ago

betelgeuse commented 5 months ago

Using VS Code I was browsing to the definitions of memchr for AArch64. I expected to find everything platform specific in src/arc/aarch64. After some head head scratching I figured that the intrinsic calls are actually in src/vector.rs mixed with other platforms. Would it be feasible to split the vector.rs content to src/arch sub folders?

BurntSushi commented 5 months ago

No, I put them all in the same file intentionally so that everything related to the Vector abstraction is co-located. This way, you can easily see all implementations of the trait in one place.

I'm also not clear why it was hard to find the definitions? Shouldn't goto-definition have taken you right to where they are defined?

betelgeuse commented 5 months ago

I'm also not clear why it was hard to find the definitions? Shouldn't goto-definition have taken you right to where they are defined?

My VS Code journey looks like this for memrchr using Go to Definition:

  1. F12 for memrchr --> src/memchr.rs
  2. F12 for memrchr_raw --> src/memchr.rs
  3. F12 for crate::arch::aarch64::memchr::memrchr_raw --> src/arch/aarch64/memchr.rs
  4. F12 for rfind_raw --> src/arch/aarch64/memchr.rs
  5. F12 for rfind_raw_impl --> src/arch/aarch64/memchr.rs
  6. F12 for rfind_raw --> src/generic/memchr.rs

    At this point I was confused on why I ended up in the generic implementation and thought something went wrong in the search. Understanding now that the call path does really go through the generic function, I am able to find my way to Vector code with Go to Definition.

Feel free to close this as a newbie note if you feel there's no gain. Basically the point is that for someone new to the project having some platform specific code co-located and some in their sub folder was confusing.

BurntSushi commented 5 months ago

Yes, I'm not a fan of the genericness of the internals. Before the refactor to support aarch64, I went with the "just duplicate the code" approach. This was fine when it was just a couple duplications, but adding aarch64 (and also wasm32) made that approach very painful. Enough so that I thought it was worth obfuscating the code with genericness.

I think the key misjump you had was assuming that "generic" meant "platform independent." In reality, the "generic" in this context is more like "a template with which platform specific types are substituted during the monomorphization phase of compilation."

I'd be open to adding an ARCHITECTURE.md document.