Closed WindSoilder closed 2 years ago
Thanks for reporting this!
This is definitely a bug in Wax. I believe the issue is the code here, which uses from_os_str_lossy
from the bstr
crate. That function lossily converts byte sequences into UTF-8 text by replacing invalid bytes with Unicode replacement characters, but only on non-Unix platforms. Given that the example code you've shared is using the std::os::unix
module, I'm guessing that the bytes are passed through unmodified to from_utf8
.
I plan to remove the dependency on bstr
, as it is hardly exercised at all. Rather than using from_os_str_lossy
, the likely fix will be to (unconditionally) use OsStr::to_string_lossy
instead.
Please note that Wax does not support paths that cannot be interpreted as UTF-8 text. However, this lack of support should only lead to two primary limitations: (1) glob expressions are UTF-8 strings, so non-UTF-8 literals cannot be represented or matched directly and (2) the matched text (captures) of non-literals may differ from the bytes of the matching path. Panics like this should not occur.
I just took a look at nushell/nushell#2987. Is the ls
command in this context using Glob::walk
?. Since there is no reported panic, it seems like it is not, but I'm not sure. If it is, I'd recommend using the walkdir
crate instead unless the ls
command also accepts or otherwise uses glob expressions other than *
and **
(or possibly if nushell is already using Wax elsewhere; Glob::walk
is currently implemented atop walkdir
).
Please note that Wax does not support paths that cannot be interpreted as UTF-8 text. However, this lack of support should only lead to two primary limitations: (1) glob expressions are UTF-8 strings, so non-UTF-8 literals cannot be represented or matched directly and (2) the matched text (captures) of non-literals may differ from the bytes of the matching path. Panics like this should not occur.
I'm fine with that, if *
can match non utf-8 path
, everything is fine
I just took a look at https://github.com/nushell/nushell/issues/2987. Is the ls command in this context using Glob::walk?
Yeah, that issue doesn't mention about Glob::walk
. But nushell has a glob command which make use of Glob::walk
.
This has been fixed in 5263d5a. That change removes the bstr
dependency and unconditionally uses OsStr::to_string_lossy
when constructing a CandidatePath
from an OsStr
or Path
.
Refer to: https://github.com/nushell/nushell/issues/2987 If we create a file from the code:
And run glob walk with the following code:
It makes the program panic, with the message says that: unexpected encoding: Utf8Error