becheran / wildmatch

Simple string matching with single- and multiple-wildcard operator
MIT License
77 stars, 15 forks

Only limited Windows syntax #16

Open daniel-pfeiffer opened 9 months ago

daniel-pfeiffer commented 9 months ago

One has to follow your link to Wikipedia to find out why you are implementing only half of the wildcard syntax. Only that article admits to limiting itself to Windows' limited features.

Rust also runs on Linux and other systems implementing \\, \*, \?, [char-classes] and {multi,branches} (though Bash treats those a little bit apart). You should be clearer about this!

As for your benchmark, you could also compare to regex-lite. And either give Unicode examples, or state that you skip them for performance reasons.

becheran commented 9 months ago

Thanks for the feedback.

Where exactly do you find the documentation lacking? Don't the three bullet points in the README explain the whole feature set?

Not sure what you mean by the Linux vs. Windows comparison. This is a library that does its matching job on both OSs.

The benchmark did not intentionally avoid UTF-8 characters. I just did not think that it would make a difference. I might add them, along with the library you mentioned.
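To illustrate why non-ASCII input can make a difference, here is a minimal sketch of a `?`/`*` matcher that works on Unicode scalar values (`char`) rather than bytes. The function name `wildcard_match` is hypothetical, and this is not the wildmatch implementation, only an example of the general technique.

```rust
/// Hypothetical helper, not part of the wildmatch API.
/// `?` matches exactly one `char`; `*` matches any run of `char`s,
/// including an empty one.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Pattern index of the last `*` seen and the text index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try matching it against nothing.
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

With this sketch, `wildcard_match("h?llo", "h\u{e9}llo")` is true because the precomposed `é` is a single `char`, even though it occupies two bytes in UTF-8. A combining sequence such as `"e\u{301}"` is two `char`s, so a single `?` does not match it; handling that case would need grapheme segmentation, which is outside this sketch.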

daniel-pfeiffer commented 9 months ago

Yes, you do mention what you support. But I just naturally assume all the things I mentioned to be part of wildcards. I guess anybody coming from the Unixy side of the world would feel the same. I wasn't even aware that Windows has such a crippled subset. And there are many ways to get a real Shell with richer syntax on Windows.

So you should explicitly mention that your limitations come from following native Windows. Or, if you're adventurous, complete your offering, maybe through a 2nd function.

becheran commented 9 months ago

OK, I see. You mean something like the glob syntax being documented here. I guess the glob lib already does more or less what you want/need.

I won't reference Linux or Windows because I don't want the wildmatch crate to have anything to do with operating systems. There are many CLIs, shells, and programs with text input out there, and each might understand something different when it comes to wildcards. This lib does what it says it does in the docs. It is very fast, but comes with a limited feature set. Look, for example, at this PR, where wildcards were understood to be the % and _ signs, which was also new to me...
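The `%` and `_` signs resemble the wildcards of SQL `LIKE`, where `%` stands for any run of characters and `_` for exactly one. As a rough sketch of how such a dialect could be mapped onto the `*`/`?` syntax, assuming no escaping is needed, consider the hypothetical helper below. It is not part of wildmatch.

```rust
/// Hypothetical helper: rewrites a `%`/`_` style pattern into `*`/`?` form.
/// Assumption: the input contains no literal `*`, `?`, `%` or `_` that
/// should be preserved, since this sketch does no escaping at all.
fn like_to_wildcard(pattern: &str) -> String {
    pattern
        .chars()
        .map(|c| match c {
            '%' => '*', // any run of characters
            '_' => '?', // exactly one character
            other => other,
        })
        .collect()
}
```

For example, `like_to_wildcard("report_%.txt")` yields `"report?*.txt"`. A literal `*` in the input would silently become a wildcard, which is one reason dialects are hard to unify without escape support.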

Not sure if I will extend it with the [], {} and [!] syntax one day. I guess what could be achieved without changing much is (optional) support for escape characters such as \.
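One way escape support could look is to tokenize the pattern first, so that a backslash turns the following character into a literal, and then match on tokens. The sketch below is only an illustration under that assumption. The names `Token`, `tokenize` and `matches_escaped` are hypothetical, and the handling of a trailing backslash is an arbitrary choice.

```rust
/// Hypothetical token type; not taken from the wildmatch source.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Star,          // `*`: any run of characters
    Question,      // `?`: exactly one character
    Literal(char), // anything else, including escaped `*`, `?` and `\`
}

fn tokenize(pattern: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        let token = match c {
            '*' => Token::Star,
            '?' => Token::Question,
            // A backslash makes the next character literal. A trailing
            // backslash is kept as a literal backslash (an assumption).
            '\\' => Token::Literal(chars.next().unwrap_or('\\')),
            other => Token::Literal(other),
        };
        tokens.push(token);
    }
    tokens
}

fn matches_escaped(pattern: &str, text: &str) -> bool {
    let p = tokenize(pattern);
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Token index of the last `*` and the text index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;

    while ti < t.len() {
        let step = match p.get(pi) {
            Some(Token::Star) => {
                backtrack = Some((pi, ti));
                pi += 1;
                continue;
            }
            Some(Token::Question) => true,
            Some(Token::Literal(c)) => *c == t[ti],
            None => false,
        };
        if step {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*` tokens may remain.
    p[pi..].iter().all(|tok| *tok == Token::Star)
}
```

With this sketch, `matches_escaped(r"a\*b", "a*b")` is true while `matches_escaped(r"a\*b", "axb")` is false, and unescaped patterns such as `"a*b"` behave as before. Keeping escapes in a separate tokenizing step leaves the matching loop itself unchanged, which is why it could plausibly be made optional.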

becheran commented 9 months ago

I followed your suggestion and added the comparison to regex_lite to the benchmarks. I also upgraded all other libs to their latest versions. The test results are very different from the last run because I upgraded my PC. I also wonder what the results would look like on other hardware. If in doubt, and if performance is relevant for an application, it always makes sense to run your own benchmarks on the target platform/hardware.

https://github.com/becheran/wildmatch/commit/160f13a7a45993e02b29898bc388a0f1e3ef3b20
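For anyone who wants a quick measurement on their own hardware, a very rough timing loop using only the standard library might look like the sketch below. `candidate` is a hypothetical stand-in that only understands a single `*`; it would be replaced with the real matcher under test. `time_matcher` is likewise an assumed name. A dedicated benchmarking harness will generally give more trustworthy numbers than a hand-rolled loop.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the matcher under test.
/// It only supports patterns with at most one `*` (prefix*suffix).
fn candidate(pattern: &str, text: &str) -> bool {
    match pattern.split_once('*') {
        Some((prefix, suffix)) => {
            text.len() >= prefix.len() + suffix.len()
                && text.starts_with(prefix)
                && text.ends_with(suffix)
        }
        None => pattern == text,
    }
}

/// Runs `matcher` over all `inputs` for `rounds` rounds and returns the
/// number of matches together with the elapsed wall-clock time.
fn time_matcher<F>(matcher: F, pattern: &str, inputs: &[&str], rounds: u32) -> (usize, Duration)
where
    F: Fn(&str, &str) -> bool,
{
    let start = Instant::now();
    let mut hits = 0;
    for _ in 0..rounds {
        for &text in inputs {
            // `black_box` discourages the optimizer from removing the call.
            if black_box(matcher(black_box(pattern), black_box(text))) {
                hits += 1;
            }
        }
    }
    (hits, start.elapsed())
}
```

Calling `time_matcher(candidate, "*.txt", &["report.txt", "notes.md", "todo.txt"], 10)` returns a hit count of 20 and a duration that depends entirely on the machine, so timings are best compared only between runs on the same hardware and in release mode.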