XAMPPRocky / tokei

Count your code, quickly.
Other
10.88k stars 527 forks source link

Languages::get_statistics() depends on current working directory #811

Open not-my-profile opened 3 years ago

not-my-profile commented 3 years ago

With the following code:

let mut languages = Languages::new();
languages.get_statistics(&["myfolder"], &["/secrets"], &Config::default());

you would expect myfolder/secrets to be excluded, but that's not the case!

Languages::get_statistics uses utils::fs::get_all_files which contains the following line:

https://github.com/XAMPPRocky/tokei/blob/4fd51a006922a3e531099109d1aef3f4c254e95e/src/utils/fs.rs#L31

This results in the absolute ignore pattern being interpreted relative to the current directory. myfolder/secrets is therefore not excluded.

Since I spent way too much time hunting down the bug this pitfall caused in my application, I think it would be a good idea to add a respective warning to the documentation. The pitfall could be fixed completely by introducing an ignore_root: &str argument. I think you should consider this breaking change for your next major version.

XAMPPRocky commented 3 years ago

Thank you for your issue! I'm not going to add a warning since it would be just as much effort to fix the issue. This also doesn't require a breaking change (though the next release will be for other changes) because get_all_files is not in the public API, we can add an optional current_directory field to Config. I also think we can use a simple heuristic that we use either the one set by the config or we use the parent path of where the first path in the ignore list resolves to. If you (or anyone reading this) is interested I would review and accept a PR adding that.

not-my-profile commented 3 years ago

Thanks for your quick reply. I don't think a current_directory Config option is a good idea since currently an immutable config can be reused for multiple get_statistics invocations for different paths and this would be a weird separation of parameters that arguably belong together. I also don't quite get what you mean by "resolve" in "or we use the parent path of where the first path in the ignore list resolves to". I don't think the builder path should depend on the presence or absence of any files.

The problem is that there are multiple scenarios:

So I still think the best solution would be to break compatibility by adding an ignore_root argument to get_statistics, making the involved complexity obvious.