Open rarkins opened 3 years ago
this is very hard to change without new properties.
But my suggestion is to do normal glob and use regex when string is like a full regex /..../
. I think a string starting and ending with /
is not a meaningful glob for renovate.
I thought about the /.../
idea too but it would need to be consistent, meaning fileMatch
adopt it too. That introduces even more complexity :)
Another spin on the question: if we deprecate glob support, is it a problem?
ignorePaths
: array, glob-onlyUsed for excluding file paths before fileMatch
. Usually used at top level but can also be used within manager blocks. In theory this could be combined with fileMatch
concepts if we had a first class concept of "and not".
includePaths
, array, glob-onlyInverse of ignorePaths
- can be used at global level or within manager objects.
fileMatch
: array, regex-only, no /.../
Used for matching manager files, so valid within manager objects only. This happens after ignorePaths
and includePaths
.
matchFiles
, array, exact string match onlyWithin packageRules
only. Has some overlap with matchPaths
due to the fact matchPaths
was originally implemented with partial string match, which is being removed.
matchPaths
, array, glob (supported) or partial string match (deprecated)Within packageRules
only.
fileFilters
, array, glob-onlyUsed within postUpgradeTasks
only to filter which files to commit.
excludeCommitPaths
, array, glob-onlyTop-level, but also can be in package rules. Maybe this should be combined with the ignorePaths
concept?
We could support glob or regex for all fields as long as we have an intuitive way to distinguish between then. e.g. if it starts and ends with /
(e.g. /^abc/
) then we know it's regex, otherwise treat it as a glob.
We could also support negative regex matching via a !/
prefix, e.g. !/^abc/
.
A challenge is how to handle OR and AND logic when we use a single array. For example if everything in the array is OR then ["/^abc", "!/^abc/"]
matches everything. We could:
&
prefix, such as ["/^abc", "&!/beta/"]
, which would mean "starts with abc
and does not contain the string beta
I think we can use this &
prefix for both glob and regex with the logic "all non anded patterns are first OR'd together, and then collectively AND'd together with each & pattern. e.g. in regex
["/^abc/", "&!/\.js/"
would match abc.css
but not abc.js
or def.css
["&/^abc/", "&!/\.js/"
would be the same as above["&/^abc/", "&!**/*.js"
would be the same as above too (using glob negative syntax)["/^abc/", "/^def", "&!/\.js/"
would match abc.css
and match def.css
but not abc.js
This negation syntax would allow us to consolidate includePaths
and ignorePaths
into a single setting, e.g. extractPaths
or fileListFilter
(the former seems more self-explanatory if it's only used during extract)
Could we combine fileFilters
and excludeCommitPaths
into one, e.g. commitPaths
and default it to **/*
?
Perhaps a problem is with this edge case:
Could we still support this intuitively if the repo commitPaths
is **/*
while while the postUpgradeTasks.commitPaths
is !X
?
Possible consolidations:
extractPaths
: top-level only, applies to all package managersfilePatterns
: replaces fileMatch
todaycommitPaths
: replaces fileFilters
and excludeCommitPaths
Another possibility: files must match every negated filter and any regular one. This would avoid us needing many &!
prefixes for ignorePaths
migration
I think this will quickly becoming hard to read.
WDYT about:
[["/^abc/", "test/**"], "!/\.js/"]
--> ( "/^abc/" OR "test/**" ) AND "!/.js/"
or here a more verbose alternative: Keep the same basic behavior like we have right now but allow to define objects:
[
{
operation: "AND",
elements: [
"!/\.js/" ,
{
operation: "OR",
elements: ["/^abc/", "test/**"]
}
]
}
]
None of these seem very readable to me either.
We should definitely optimize for the normal case of match X or match Y or match Z. Handling "not match" and AND logic is much harder. I'm still interested in the idea of making each !
to be AND'd
FWIW:
I for one see little use of implementing an expression syntax in JSON (unless a standard one can be reused). If the criteria are that complicated and necessary, maybe we need to implement a dedicated manager? Or provide some scripting interface, which I'd be reluctant to use as well. I'd probably just refactor the repo to get a grokable config.
FWIW:
- Using slashes to distinguish globs from regex means no absolute paths in globs. (Not sure if that's a problem.)
Fortunately we don't have a need for absolute paths because these filters are always based on relative paths/files within the repo.
- If a string parses both as glob and as regex, are the respective semantics the same? (Gut feeling: yes.) If so, you could just let the parsers decide.
No, e.g. the string "abc" in glob means "only the file abc
" while in regex that would mean "any file containing the string abc
".
For this reason we want to make it so that users have to "signal intent" of which they want.
I for one see little use of implementing an expression syntax in JSON (unless a standard one can be reused). If the criteria are that complicated and necessary, maybe we need to implement a dedicated manager? Or provide some scripting interface, which I'd be reluctant to use as well. I'd probably just refactor the repo to get a grokable config.
The challenge we have is include vs exclude semantics. By default at the repo level, we include every file. It would be more comment for people to exclude certain paths than for them to say "only include these paths". On the other hand, at the package manager level, we exclude every file by default and then define just certain patterns we are interested in. Generally you want to OR your includes and AND your excludes.
Related: #24695
What would you like Renovate to be able to do?
Ideally have a consistent approach to file/path pattern handling.
Did you already have any implementation ideas?
Today we have the following configuration options relating to file names and paths:
excludeCommitPaths
: globfileMatch
: regexignorePaths
: glob, or any string patternmatchFiles
: exact match [not technically part of this discussion]matchPaths
: glob, or any string patternfileFilters
: globI think we can agree that the "any string pattern" approach is bad and can be deprecated. Then the decision is: glob, regex, or both?
There is no easy way I'm aware of to know for sure that a string is intended to be a glob or a regex because some strings are valid for both.
I think regex are more powerful, but also harder to get started at than glob. At the same time, glob patterns are not particularly intuitive to write once you go beyond "foo*", and there seems to be some inconsistency between implementations too. I'd want to stick with regex for
fileMatch
which then leans towards regex for others too.For ones which are only a glob today, we could maybe migrate them to equivalent regex. There was a glob-to-regex library now archived, so maybe they found it to be too difficult of a problem? Another possibility would be to rename such fields and deprecate the old ones so that people get warning messages, but that is not the most user-friendly if we require them to do the migration.
@danez I noticed that you are active on globs and have raised issuess here in the past. Do you have any ideas?