Closed ottumm closed 2 years ago
Good hack for BASH_SHELLCHECK_FILE_NAMES_REGEX
, I didn't think of that :)
Linting files with no extension by default is indeed a use case that should be managed by MegaLinter
I think that for performance reasons we may need to add a variable LINT_FILES_WITHOUT_EXTENSIONS (false by default), to avoid to have to read all files every time, whereas it may be not necessary for most repos. But I'm not sure of the cases where we would have a repo with a lot of files without extensions that must not be linted... do you ? Please share your opinion :)
The architecture update could be the following:
Add a property file_shebangs
in descriptors
Example in bash descriptor:
file_shebangs:
- #!/usr/bin/bash
- #!/usr/bin/env bash
MegaLinter.compute_file_extensions : add '' extension if at least one active linter has property file_shebangs
(utils.filter_files should keep files without extensions in that case)
Linter.collect_files : Manage new property files_shebangs to send to utils.filter_files
update utils.filter_files to manage files_shebangs (to test only when the file has no extension)
The second solution would be easier to develop ,but forcing users to declare all the files they want to lint would be against MegaLinter philosophy to be the more "plug & play" possible, so let's go for your first proposition, your future PR is very welcomed :)
Awesome, I'll take this on — feel free to assign me to this issue if that's helpful for tracking.
I'll also open a separate issue for the generalized language settings (e.g. BASH_FILTER_REGEX_EXCLUDE
), because I think that would still be useful for other things.
I think that for performance reasons we may need to add a variable LINT_FILES_WITHOUT_EXTENSIONS (false by default), to avoid to have to read all files every time, whereas it may be not necessary for most repos. But I'm not sure of the cases where we would have a repo with a lot of files without extensions that must not be linted... do you ?
Yeah, this is an important consideration for sure. I think you're right that the safest thing would be to have shebang detection disabled by default for performance, but on the other hand enabling by default makes MegaLinter more plug & play.
I'll do some (probably light) performance testing to try to determine the marginal performance cost of this feature.
I like your proposed architecture, and I think I'll go with that, since it seems like it fits in well with how MegaLinter does things now.
An alternative I've been considering is to use libmagic to detect the file type. The advantage of this is that it's a library that already knows how to scan header info and determine file type from it — and I think it's been fairly optimized. However, I'm not sure if that would fit well into the existing MegaLinter architecture. I can investigate, but if you have thoughts I'd love to hear them. =)
issue officially assigned, thanks a lot for your work :)
LINT_FILES_WITHOUT_EXTENSIONS: Don't bother, you're right the performance cost will be marginal, so let's go for default value true
, and the few cases where there would be lots of files without extensions that must not be linted will be able to set it to false ^^ . As it is a common variable not related to a language or a linter, please add the property in the README and in the json schema :)
about libmagic, why not, but I think it would have lots of impacts about the architecture which relies a lot on file extensions, so I'm not sure the risk is worth taking ^^ ( but why not in another PR maybe ? )
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
If you think this issue should stay open, please remove the O: stale 🤖
label or comment on the issue.
I’ve had to put this aside but I’m still going to work on it!
Thats's great news, i'll be happy to validate the PR :) No stress, we all have priorities on our rent-paying jobs 😅
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
If you think this issue should stay open, please remove the O: stale 🤖
label or comment on the issue.
Not stale ? :)
I mean, it’s getting embarrassing but I’m still into it. I’ll find time.
Don't worry it's not embarrassing ;)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
If you think this issue should stay open, please remove the O: stale 🤖
label or comment on the issue.
Not stale ? ^^
One other suggestion, in case you aren't already committed— Pre-Commit maintains and uses Identify for its primary file identification, which identifies file formats by both extension and shebang, as well as tags it with other things like file/directory, text/binary, symlink, etc.
EDIT: That pun was completely unintentional, I swear
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
If you think this issue should stay open, please remove the O: stale 🤖
label or comment on the issue.
heyyyy not stale ? :) @ottumm ? 👼
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
If you think this issue should stay open, please remove the O: stale 🤖
label or comment on the issue.
Is your feature request related to a problem? Please describe.
I find it tedious to have to specify e.g.
BASH_SHELLCHECK_FILE_NAMES_REGEX
for files lacking an extension. At the same time, it would (obviously) be incorrect to assume that every file lacking an extension is a Bash script.Describe the solution you'd like
I would like mega-linter to read the first line of every file lacking an extension that does not already match a rule specifying the file type, and infer the file type by examining the
#!
value, if it exists.Describe alternatives you've considered
Alternatively (or additionally!), I would like to specify one
BASH_FILE_NAMES_REGEX
(orPYTHON_FILE_NAMES_REGEX
, etc) config value, rather than having to repeat the regex for every Bash linter (BASH_EXEC_FILE_NAMES_REGEX
,BASH_SHELLCHECK_FILE_NAMES_REGEX
, etc).This would be helpful especially because I wouldn't have to remember all the linters for each file type, and it would apply if new linters are added in the future.
I think this would be useful regardless of whether my original feature proposal is accepted or not, so I'm happy to open a separate issue for it.
Additional context
If this sounds like a good idea, I'm happy to work on it.