Allow overriding cache directory through config

dbrgn commented 2 years ago

Based on #212. Thanks @hgaiser for the initial work!

Config: Add cache_dir to [directories] section
Deprecate old TEALDEER_CACHE_DIR env variable
Introduce path source for custom_pages_dir as well
Add integration test
Update docs

Implementation notes:

Logic to handle legacy env var, config override and default cache dir selection is now in the Config::try_from(RawConfig) impl
Logic to determine custom_pages_dir is moved into Config::try_from(RawConfig) as well for consistency

The logic to determine which cache directory is used is now in config.rs, when converting the RawConfig to Config. I think this is the place where the config file and external factors (defaults, env vars, etc) can be taken into account to determine a final config.

dbrgn commented 2 years ago

@niklasmohrin sorry for missing #266, I wanted to get PRs unblocked starting at the oldest, and didn't yet see that you already submitted a proposed solution (I skimmed over it since it was still marked as draft)... 🙁

I think I chose a different approach than you did. I determine the final path in Config::try_from(RawConfig). While implementing, I felt that this seems to be a clean solution, because once you get a valid config, you have a final non-nullable config path.

Validation of that path is being done by the cache itself: If the cache is instantiated with a path, it tries to ensure that the path exists. If it doesn't, it is created. If creation fails, the program is aborted.

I'd be interested in your opinion on that.

dbrgn commented 2 years ago

@niklasmohrin everything should be updated and all your feedback should be addressed!

dbrgn commented 2 years ago

Nice! Some more comments, a handful of them are suggestions about removing code comments - I think that the code following those comments is just as expressive (if not more) than the comment while being more precise. I think the comments here add noise and don't provide any benefit. (and, also, all the textbook arguments like "comments aren't run in tests", "comments duplicate knowledge", ...)

There are more such comments around the codebase (and I would, in most cases, like them gone; not today, but with every little refactor :D). If I recall correctly, we disagreed over this before, haven't we? What did we settle on back then?

Hehe, I know that discussion from other projects 🙂 It comes down to a matter of style, but I think there are good reasons for it. I really like to structure code into blocks that do some kind of action. For example, when creating a file in a directory:

fn myfunction() {
    check_if_dir_exists();
    check_if_dir_is_writable();
    write_file();
    set_file_permissions();
}

In the example above, the three function calls are only 1 line each and self-explanatory. However, when code get bigger, sometimes you need to fully read through the block to understand what it does. With trivial code, that's simple. But if it's a bit more complex, it takes longer.

So I follow a code style where these blocks (potentially spanning multiple lines of code) are preceded with a line comment that explains what the following code does. For example:

fn myfunction() {
    // Ensure that we can write the directory
    check_if_dir_exists();
    check_if_dir_is_writable();

    // Directory is writable! Actually write the file
    write_file();
    set_file_permissions();
}

What I like about this style is that it's really simple to skim over an implementation by just looking at the block headers. Sometimes, blocks are small and self-explanatory, but I usually add these comments anyways for consistency. The idea is not to just state what the next line is doing, but to guide the reader through the code, and - if it makes sense - to also explain the reason why certain things are the way they are.

As a concrete example. This code:

screenshot-20220928-000532

...can be skimmed over easily, especially with syntax highlighting. To my mind, I read it like this:

screenshot-20220928-000649

...which, I would argue, is much easier to quickly understand than this:

screenshot-20220928-000729

Note that the last example is not unreadable. I think it's fairly easy to read. But it takes more time and brainpower to process than the version where each block is summarized.

I do kind of feel strongly about this, so I would not want to take away comments, as long as this "block-summary-comment" style is consistent within a function. I don't expect contributors to do this as well, of course, as long as functions remain small and easy to read. (Once functions do get more complex, I think it's valuable to either split them up into multiple functions, or to use comments in a more structured way.)

niklasmohrin commented 2 years ago

If you say you have a strong opinion about this, I can live with it, although my opinion isn't changed. Here is how I see it:

Firstly, I can see that this is useful for reading complex functions, but to me it's unclear where to draw the line. You agreed at the "style config" comment, but I think that

        // Extract archive into pages dir
        archive
            .extract(&self.pages_dir())
            .context("Could not unpack compressed data")?;

and this snippet from main:

// Initialize logger
init_log();

// Parse arguments
let mut args = Args::parse();

look "just as bad" to me. (tbh, I think that all of Cache::update is fine without these heading comments).

I find that having a block summarized as just exactly what is already said in the block is just more noisy, so I would first change to something like this: grafik

Now, I too find this a bit confusing. For example, after the renamed args, we have another if args.something but that does logic rather than cleanup and one could think that it still belongs to the block before because there is no new block heading (so the absence of a heading confused me). While adding the heading back would help with that slip-up, I think it is not the correct approach. What really happened is that our main function just doesn't stick to one level of abstraction and we are trying to patch this up by bundling code into blocks which are then on the same abstraction level. For simple code that is correctly abstracted, this leads to /* init log */ init_log();. So what we should aim for is: grafik

I guess what I am trying to say is: I can understand the "summarizing the block" part, but I don't fully agree with the "make usage of block comments consistent within function". To me, it is a symptom of violating the single-level-of-abstraction rule. And yes, it is somewhat eased with helping comments, but in reality we should just try harder to find better words for what the code should do. And maybe, these comments are just a marker for a refactor later, because we are trying to be productive now. But that's it for me.

Again, at the end of the day we want to get a feature shipped here and I wouldn't want to have this blocked over a fight about comments (they literally do nothing :D). I am curious how you feel about this though

dbrgn commented 2 years ago

Great, thanks for the review. All issues should have been addressed.

Regarding the comments: Yes, absolutely, often the clarity of the code can be improved with refactoring or better naming. I'm not saying that shouldn't be done, on the contrary 🙂 And if a method is simple enough, then comments aren't needed. Once it crosses a certain complexity, I simply like adding block comments (in the style of headlines) consistently. Even if this means that sometimes very short blocks have a comment as well. But it looks more consistent.

Any other last feedback? I plan to merge this and prepare the 1.5.1 release by the end of the week.

dbrgn commented 2 years ago

🎉 Thanks! All feedback should now be addressed.

tealdeer-rs / tealdeer

Allow overriding cache directory through config #276