Closed glhez closed 6 years ago
Actually, we also transpile to a regex pattern, see https://github.com/marschall/memoryfilesystem/blob/master/src/main/java/com/github/marschall/memoryfilesystem/PathParser.java#L95
When I look at the glob specification in FileSystem#getPathMatcher I have a hard time understanding what is meant by the {,**/}
group. Using ,
as a separator we have an empty string and **/
as a subpattern. I don't know what is meant by this.
Copying from OpenJDK is not OK from from a licence point of view. I don't know if copying from JimFS is OK from from a licence point (APL2 vs MIT) of view so I'd like to avoid this.
The {,**/}
is really equivalent to (|.*/)
: the javadoc says it: the { } characters are a group of subpatterns, where the group matches if any subpattern in the group matches. The "," character is used to separate the subpatterns. Groups cannot be nested.
To complete the analysis, it also work under bash (tried on version bundled in Git for Windows).
$ shopt -s globstar # enable the ** path expansion (eg: make it recursive)
$ echo {,**/}pom.xml | xargs -n1 echo # print all pom.xml
pom.xml
apache/commons-lang/pom.xml
github/assertj-assertions-generator-maven-plugin/pom.xml
github/assertj-assertions-generator/pom.xml
github/assertj-core/pom.xml
github/memoryfilesystem/pom.xml
github/versions-maven-plugin/pom.xml
Actually, we also transpile to a regex pattern, see https://github.com/marschall/memoryfilesystem/blob/master/src/main/java/com/github/marschall/memoryfilesystem/PathParser.java#L95
I've seen it. Not that it is wrong (except for the error :)), but I need an implementation being configurable and independent of the FileSystem
implementation.
For example, if you use True VFS, which does not create glob pattern, you need to use the default provider - which is Windows for my case. The glob pattern **/*.java
transpile to .*\\[^\\]*.java
.
When you use the corresponding path matcher against a TrueVFS Path (or even the default ZipFileSystem
), it fails: the Path
use /
as name separator. Thus, Pattern.compile(.*\\[^\\]*.java, Pattern.CASE_INSENSITIVE).matcher(path.toString()).matches()
fails.
After some more in depth test (I've finally had the time to do my implementation of GlobToPattern, except I need to handle the **
correctly), Bash works as I expected it to work, while Java default implementation is not so good.
With the following tree and the Bash option shopt -s globstar
:
The following pattern
**.java
: fails in Bash but sadly works in Java (translate to .*.java
)**/*.java
: finds File1.java, File2.java and File3.java. Java would only find File1 and File2, because the regexp would be .*/[^/]*.java
**/*/file/*.java
: finds File2.java.src**/*/file/*.java
: finds nothing.{**/,}*.java
finds File3 twice.{**/a/**/,}*.java
finds File1, File2 and File3src**/main
finds the java and resources folders**/main/**/*.java
finds File1 and File2s**/*.java
finds nothingsrc/[a-z]**/*.java
finds nothingThe Bash rational seems to only accepts **
between a path separator (like you are doing) except it translates it to a *
when the use is invalid.
Also, Ant defines the **
as a special directory name.
I released 1.1.0 to Maven Central which now compiles glob to regex. I'm not 100% sure I got the semantics right but it's a start.
After migration from JimFS to MemoryFS (I wanted to use only one memory FS), some of my test failed because of a glob pattern:
{,**/}.gitignore
. MemoryFS fails to parse the pattern because it first split it by the path separator and does not find the end}
in{,**
. In that case, an exception is thrown.On a side note, here are how JimFS, OpenJDK and Bash works:
for i in {,**/}.gitignore ; do echo $i; done
effectively prints.gitignore
and**/.gitignore
**
translate for.*
while*
translate for[^/]*
for POSIX and[^\\/]*
for Windows.There is also another problem which arise if we test
/a/b/.gitignore
against**/.gitignore
; MemoryFS (inGlobPathMatcher
class) verify if the path is absolute and the glob is absolute:**/.gitignore
should work as it is equivalent to.*/.gitignore
regexp.I don't remember (I've read the code last year because I've had problems with glob which worked fine on Windows but not on Linux - or the reverse - because of the path separator
\
vs/
) how OpenJDK handle the case of absolute path: from the depth of my memory, I think that/.gitignore
produce^\.gitignore$
and/.gitignore
does not match it**/.gitignore
produce^.*/\.gitignore$
and/.gitignore
match itIf I have time I'll try to work a glob2pattern implementation with some customization (eg: path separator !) - I also need it in my project as a maven independent pom.