Closed c410-f3r closed 7 years ago
Thanks for writing this up - I agree that this is something that we should address in the library. That said, for me it isn't the biggest issue on the table right now. Do you have some examples within the library where you think this is a problem?
I think it is somewhat inevitable in a linear algebra library to have some large functions. And moreover separating the function out just to achieve this goal can make the problem worse as it becomes more difficult to follow the logic (you have to chase functions around). For me the solution is clean, self-documenting (and line-commented) code. Of course, there are some cases where we probably do have some quite bloated functions.
The bigger issue for me, which is definitely related to this, is that we have a few very large modules. I find it hard to find things in them and I wrote the majority of them - I imagine for new contributors it is hell. In particular I'm talking about the impl_ops
and slice
modules. I want to split these out in much the same way that @Andlon sorted out the decomposition
module.
Like @AtheMathmo I believe this is not the biggest issue for now. On the other hand I do not think a linalg is more prone to bigger function than any other library.
I agree that having a concrete example could help.
I haven't come across too many big functions in rulinalg
so far. The exceptions are some of the numerical algorithms, but in these cases the code often mimics an algorithmic description, so it might not be desirable to split it up too much. However, as @AtheMathmo pointed out, some modules are very bloated, and this is something that should definitely be remedied.
Now, I think perhaps you're asking for advice on how to structure your work on #83 rather than a suggestion to make changes in the existing code base. In that case, I will try to give you some (hopefully useful) feedback on your suggestions.
Remember that Rust allows you to have free functions. There's actually no reason to wrap functions in a Struct
for this purpose. For example, you could do (horrible example, sorry. Obviously the pizza should be much more composable):
/// Feel free to document, but if it's a very obvious helper function,
/// this may not be necessary
fn make_dough() { unimplemented!() }
fn make_tomato_sauce() { unimplemented!() }
fn make_favorite_topping() { unimplemented!() }
fn assemble_pizza(crust: Crust, sauce: Sauce, topping: Topping) { unimplemented!() }
impl Dish for Pizza {
fn prepare(oven: &Oven) -> Self {
let dough = make_dough();
let sauce = make_tomato_sauce();
let crust = oven.prebake(dough);
let topping = make_favorite_topping();
let unbaked_pizza = assemble_pizza(crust, sauce, topping);
oven.bake(unbaked_pizza)
}
}
Well, this is closer to what I would do, but it's perhaps a little over-the-top. Let's go back to the Pizza example, but let's assume that we want to accommodate combinations of all kinds of toppings. You could organize it like this:
pizza
├── pizza.rs
├── toppings.rs
└── mod.rs
In the above, mod.rs
only imports the necessary types from pizza.rs
and toppings.rs
. The API of the pizza
module is defined in mod.rs
, but no actual implementation is done here. Rather, implementations are logically grouped depending on where they make the most sense, so that you should be able to find the appropriate types quickly just by looking at the filenames. You wouldn't expect to find anything about "dough" in topping.rs
, would you? So it's probably in pizza.rs
.
Ultimately, the two approaches are not mutually exclusive, and you should probably use both, depending on whatever is more appropriate.
My (personal) guidelines for dividing things into individual (internal) modules can be summarized as follows:
mod.rs
should not contain implementations or definitions, but rather export the user-facing API. You can think of it as "transforming" the internal file organization into the API that the user sees.Especially the third point is not obeyed by many parts of rulinalg
, and this is of course fair since I'm just stating my personal preferences here. That said, I do wish we could move more towards this pattern in the long run.
Thanks everybody, especially @Andlon. I guess that I have enough information to "standardize" my next PRs. I look forward for formal code guidelines but this issue is too specific to such discuss, then @AtheMathmo can close it.
I agree - some additions to the contributions doc would be a good idea!
Aside from a good documentation, a function should also be split if it is too big (±30 lines). Smaller functions are easier to understand, maintain and are also more reusable and self-documented. This issue is related to #83 and I will illustrate some possible approaches. Any suggestions are welcome.
Source code example
First approach
Create smaller static methods within a corresponding struct.
Pros: Simple Cons: Puts everything in the same file
Second approach
A method can have all its logic grouped in a separate module. Would be nice to have a common contract (Trait) that each separate module should implement.
Pros: More intuitive? Cons: Complex, files name must be sync.
Structure
module/mod.rs
There is no silver bullet to this problem but I will be very grateful if we all come to a conclusion about this whole thing.