Closed ssbr closed 2 months ago
Thanks for bringing this up!
It was quite a hack to begin with and I don't recall why it had to be kstring
here. And even though small-string optimisation should be possible, it definitely shouldn't be traded for safety.
Edit: I looked into this more and the idea was that it parses as zero-copy, but then typically converts to owned instances at some point. This is definitely where non-UTF8 could be exposed to places where UTF8 is expected.
It's worth noting that without kstring
, there is a 5% performance loss for an attribute-matching heavy workload. For now I replaced it with BString
, but hope to get to try a SmallVec implementation as well.
❯ hyperfine "/Users/byron/dev/github.com/Byron/gitoxide/target/release/gix index entries ':(attr:export-ignore)' -i" "gix index entries ':(attr:export-ignore)' -i"
Benchmark 1: /Users/byron/dev/github.com/Byron/gitoxide/target/release/gix index entries ':(attr:export-ignore)' -i
Time (mean ± σ): 820.1 ms ± 71.7 ms [User: 759.9 ms, System: 85.9 ms]
Range (min … max): 778.8 ms … 984.3 ms 10 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 2: gix index entries ':(attr:export-ignore)' -i
Time (mean ± σ): 782.1 ms ± 1.8 ms [User: 760.1 ms, System: 43.2 ms]
Range (min … max): 780.0 ms … 786.0 ms 10 runs
Summary
gix index entries ':(attr:export-ignore)' -i ran
1.05 ± 0.09 times faster than /Users/byron/dev/github.com/Byron/gitoxide/target/release/gix index entries ':(attr:export-ignore)' -i
I recommend that an informational RUSTSEC advisory be created to document that gix-attributes
was unsound--as described in this issue--prior to version 0.22.3.
This will help to notify interested users that they may wish to upgrade, and is part of what the RUSTSEC database tracks:
RustSec also tracks soundness issues as informational advisories, independent of whether they are vulnerabilities or not. A soundness issue arises when using a crate from safe code can cause Undefined Behavior.
This can be done by opening a pull request on the advisory-db repository in accordance with these instructions (which apply to informational advisories as well as those documenting vulnerabilities).
@ssbr I would be happy to contribute this. But I wanted to check with you first in case you want to do it. If not, I'll go ahead and open a PR and, unless you think it should be done differently, I would use text you wrote in this issue description and credit you in a Co-authored-by:
trailer in the commit message.
Thanks for bringing this up, I think it's the right call, too.
I suggest that you go ahead with making the advisory contribution, and @ssbr can collaborate there as/if desired.
I suggest that you go ahead with making the advisory contribution, and @ssbr can collaborate there as/if desired.
I have opened https://github.com/rustsec/advisory-db/pull/2027 for this.
Current behavior 😯
gix-attributes (in state::ValueRef) unsafely creates a
&str
from a&[u8]
containing non-UTF8 data, with the justification that so long as nothing reads the &str and relies on it being UTF-8 in the &str, there is no UB:The problem is that the non-UTF8
str
is exposed to outside code: first to thekstring
crate itself, which requires UTF-8 in its documentation and may have UB as a consequence of this, but also toserde
, where it propagates to e.g.serde_json
,serde_yaml
, etc., where the same problems occur.As far as I know, this is not sound, and either is or can cause UB down the line in these places that can view the
&str
.Expected behavior 🤔
I think gix-attributes should probably use a
Vec<u8>
orsmallvec
or similar, at least untilkstring
can support bytes.Absolute worst case, one could at least add an
unsafe
feature, so that code that opts out ofunsafe
can get the slowerVec<u8>
-based implementation which is guaranteed to be sound.Git behavior
N/A
Steps to reproduce 🕹
N/A