XAMPPRocky / tokei

Count your code, quickly.

Add an option to map an extension to a language #67

Open xpayn opened 7 years ago

xpayn commented 7 years ago

It would be very convenient to have such an option when an extension is unknown for a given language or when an extension is ambiguous (e.g. .cgi, .inc). It could be used to override a default mapping, or even to discard a mapping: for instance, I have a .pro file which is a Qt Creator project and not a Prolog file. But maybe in the latter case it would be cleaner to have a dedicated option to ignore a given extension. I'll try to submit a PR, but any guidance would be greatly appreciated :)

xpayn commented 7 years ago

In order to check whether a language exists, I thought about impl<'a> From<&'a str> for LanguageType, but From isn't supposed to fail... I thought about implementing TryFrom, but it's tagged as unstable. What would be a reasonable solution?
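One stable candidate for a fallible string conversion is std::str::FromStr (TryFrom has also been stable since Rust 1.34). The following is only a minimal sketch: the LanguageType enum here is a trimmed, hypothetical stand-in rather than tokei's real type, and UnknownLanguage is an invented error type.

```rust
use std::str::FromStr;

// Hypothetical, heavily trimmed stand-in for tokei's LanguageType.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LanguageType {
    Lisp,
    Perl,
    Prolog,
}

// Invented error type: the name did not match any known language.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownLanguage(pub String);

impl FromStr for LanguageType {
    type Err = UnknownLanguage;

    fn from_str(name: &str) -> Result<Self, Self::Err> {
        // Match case-insensitively so "perl" and "Perl" both work.
        match name.to_ascii_lowercase().as_str() {
            "lisp" => Ok(LanguageType::Lisp),
            "perl" => Ok(LanguageType::Perl),
            "prolog" => Ok(LanguageType::Prolog),
            _ => Err(UnknownLanguage(name.to_string())),
        }
    }
}
```

With this in place, "Perl".parse::<LanguageType>() returns a Result, so an unknown name can be reported to the user instead of panicking.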

XAMPPRocky commented 7 years ago

I have thought about this feature for a few months. I don't think it should be done through flags, because then you'd have to remember to set the correct flags every time. Instead I think it should work like .gitignore: tokei reads a .tokeirc and applies it recursively, unless there is another .tokeirc in a subfolder, in which case that one applies to that subfolder and all of its subfolders, and so on and so forth.
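To make the lookup rule concrete, here is a rough sketch of finding the .tokeirc that governs a given directory. It assumes the nearest file simply wins (no merging with parent files), which is one possible reading of the description above; nearest_rc is a hypothetical helper, not an existing tokei function.

```rust
use std::path::{Path, PathBuf};

// Return the `.tokeirc` that applies to `dir`: the one in `dir` itself if
// present, otherwise the closest one found in an ancestor directory.
// Assumption: the nearest file overrides its parents entirely.
pub fn nearest_rc(dir: &Path) -> Option<PathBuf> {
    dir.ancestors()
        .map(|ancestor| ancestor.join(".tokeirc"))
        .find(|candidate| candidate.is_file())
}
```

A merging strategy, where a subfolder file only overrides individual entries, would instead collect every match from ancestors() and apply them from the outermost inwards.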

xpayn commented 7 years ago

I agree that an rc file would be really nice, but being able to use this feature on the CLI can be useful when you want something quick. For recurring use of tokei it's definitely not enough. As support for an rc file isn't available yet, I think it won't hurt to have something like

tokei --map foo=Perl -m bar=Lisp

What do you think?
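For illustration, the ext=Language arguments in the command above could be parsed roughly as follows. This is only a sketch of the proposal: parse_mapping and collect_mappings are hypothetical names, the language is kept as a plain string rather than checked against tokei's language list, and accepting a leading dot is an assumption.

```rust
use std::collections::HashMap;

// Parse one `ext=Language` argument such as "foo=Perl".
pub fn parse_mapping(arg: &str) -> Result<(String, String), String> {
    let (ext, lang) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected `ext=Language`, got `{}`", arg))?;
    // Accept both "foo" and ".foo"; store extensions lowercased.
    let ext = ext.trim_start_matches('.').to_ascii_lowercase();
    if ext.is_empty() || lang.is_empty() {
        return Err(format!("expected `ext=Language`, got `{}`", arg));
    }
    Ok((ext, lang.to_string()))
}

// Collect every mapping into a table; the first malformed argument aborts.
// If the same extension appears twice, the later mapping wins.
pub fn collect_mappings<'a, I>(args: I) -> Result<HashMap<String, String>, String>
where
    I: IntoIterator<Item = &'a str>,
{
    args.into_iter().map(parse_mapping).collect()
}
```

The resulting table would be consulted before the built-in extension list, so a user mapping overrides a default one.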

XAMPPRocky commented 7 years ago

@xpayn Well, I think this should be implemented on the road to rc, as I don't want to write an implementation that will have to be rewritten soon. rc is the next feature I want to implement.

xpayn commented 7 years ago

OK, seems fair to me. I'll try to keep an eye on the project and help if I can.

XAMPPRocky commented 7 years ago

@xpayn Please do. And of course if there are any other problems or feature requests please do make an issue for them too!

xpayn commented 7 years ago

@Aaronepower Maybe you can have a look at what @BurntSushi did for managing extensions and file types in ripgrep: https://github.com/BurntSushi/ripgrep/blob/master/src/types.rs He already handles the mapping:

rg --type-add 'foo:*.foo,*.foobar'

Maybe a common crate for both projects could be considered.

BurntSushi commented 7 years ago

FYI, I'm working on splitting out the ignore/gitignore/filetype logic from ripgrep into a separate crate. It should be done in the next couple weeks.

remexre commented 7 years ago

A heads-up for contributors, looks like the .*ignore crate split for ripgrep got finished a while back:

https://crates.io/crates/ignore https://github.com/BurntSushi/ripgrep/tree/master/ignore

XAMPPRocky commented 7 years ago

@remexre Tokei has already integrated the ignore crate. The filetype functionality isn't being used yet however. I'm still trying to figure out the best solution.

remexre commented 7 years ago

Forgive my ignorance, but what exactly is the problem that needs to be solved? ignore's types module vs tokei's? Or is there something else?

XAMPPRocky commented 7 years ago

@remexre Having the same functionality as .gitignore, but as it relates to mapping extensions against languages.

remexre commented 7 years ago

Doesn't ignore have that functionality in https://docs.rs/ignore/0.1.5/ignore/types/index.html ?

XAMPPRocky commented 7 years ago

@remexre ignore's types aren't 1 to 1 with tokei's. Those are for ignoring file types, whereas tokei wants to map them to a different language.

BurntSushi commented 7 years ago

Note that you don't need to use the built in file types.


ghost commented 7 years ago

A modeline (vim) or file-local-vars (emacs) at the beginning or end of a file are also often used to enable a certain language mode and can be used for better accuracy. There's also the problem that languages have conventions for certain files which are not merely the extension but the whole filename. For example, in Erlang you have <app_name>.app, sys.config, <app_name>.rel, etc. which (with or without a modeline) can be considered to be Erlang syntax by mapping filename patterns to languages in .tokeirc. Of course, if one edits such files regularly, it's either mapped in your shared editor config or via a modeline in the file, so we cannot assume either one or the other, and thus both .tokeirc and the possibility to look for ex: ft=rust or -*- mode:rust -*- would be useful in order to avoid the need to set up projects for tokei.

XAMPPRocky commented 7 years ago

@tuncer I'm not really familiar with modeline, or file-local-vars, could you provide a few examples?

ghost commented 7 years ago

The relevance for tokei is that it's common to select a filetype (vim) or mode (emacs) via a modeline or file-local variable in files that lack a uniquely mappable file extension (as main.c has) or a unique name (as Makefile has). This can be used to detect a file's type and hence its Tokei language.
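As a rough illustration of what such detection could look like, the sketch below extracts a language hint from ft=rust style vim modelines and -*- mode:rust -*- style emacs lines. It is deliberately loose and does not implement the full modeline grammar of either editor; hint_from_line and detect_hint are hypothetical names, and scanning the first and last five lines is an assumption.

```rust
// Extract a language hint from one line, if it carries an editor annotation.
pub fn hint_from_line(line: &str) -> Option<String> {
    // Emacs: `-*- mode: rust -*-`, possibly with other `key: value;` pairs.
    if let Some(start) = line.find("-*-") {
        let rest = &line[start + 3..];
        if let Some(end) = rest.find("-*-") {
            for pair in rest[..end].split(';') {
                if let Some((key, value)) = pair.split_once(':') {
                    let value = value.trim();
                    if key.trim().eq_ignore_ascii_case("mode") && !value.is_empty() {
                        return Some(value.to_ascii_lowercase());
                    }
                }
            }
        }
    }
    // Vim: `vim: set ft=rust:` or `ex: ft=rust` (also `filetype=`).
    // Loose check: any line mentioning one of these markers is considered.
    if line.contains("vim:") || line.contains("vi:") || line.contains("ex:") {
        for token in line.split(|c: char| c.is_whitespace() || c == ':') {
            let value = token
                .strip_prefix("ft=")
                .or_else(|| token.strip_prefix("filetype="));
            if let Some(value) = value {
                if !value.is_empty() {
                    return Some(value.to_ascii_lowercase());
                }
            }
        }
    }
    None
}

// Scan only the first and last five lines, where such annotations live.
pub fn detect_hint(contents: &str) -> Option<String> {
    let lines: Vec<&str> = contents.lines().collect();
    let head = lines.iter().take(5);
    let tail = lines.iter().rev().take(5);
    head.chain(tail).find_map(|line| hint_from_line(line))
}
```

The returned string is the editor's name for the filetype or mode, so it would still need to be mapped to a Tokei language.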

Armavica commented 5 years ago

Would it be an option to apply heuristics on the files with ambiguous extensions to try and guess the language?

XAMPPRocky commented 5 years ago

I'm going to close this issue, moving everything to #195 as I don't think I'm going to implement a solution that isn't a configuration file.

XAMPPRocky commented 5 years ago

This feature is now available in Tokei 9.0

LunarLambda commented 4 months ago

What about files with no extension? tokei completely ignores those currently.

For example my zsh configuration is made out of modular files with no extension (example: rc.d/aliases). Currently tokei returns 0 lines for the entire directory, including the plain-text README.

Would it be possible to specify a "fallback" language used if the language can't be determined, or be able to map full glob patterns or similar to languages, rather than only extensions?

XAMPPRocky commented 4 months ago

What about files with no extension? tokei completely ignores those currently.

It doesn't; tokei just currently requires a well-known file name before it will count them. You can see this for Dockerfiles, for example.

https://github.com/XAMPPRocky/tokei/blob/5f755db185937b33f38a30e7f6facfa5be0f146a/languages.json#L351