tjtelan / git-url-parse-rs

Parser of git repo urls for Rust
MIT License
15 stars 6 forks source link

Panic when parsing URL with two consecutive "/" chars or empty path component #22

Closed obi1kenobi closed 2 years ago

obi1kenobi commented 2 years ago

First, thanks for your work on this crate!

I've been using it to follow projects whose repos are posted on HackerNews. Recently, someone posted the following link to HackerNews, which leads to a panic: https://www.heliosleep.com//

thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1', /.../.cargo/registry/src/github.com-1ecc6299db9ec823/git-url-parse-0.4.0/src/lib.rs:238:39
stack backtrace:
   0: rust_begin_unwind
             at /rustc/532d3cda90b8a729cd982548649d32803d265052/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/532d3cda90b8a729cd982548649d32803d265052/library/core/src/panicking.rs:142:14
   2: core::panicking::panic_bounds_check
             at /rustc/532d3cda90b8a729cd982548649d32803d265052/library/core/src/panicking.rs:85:5
   3: <usize as core::slice::index::SliceIndex<[T]>>::index
             at /rustc/532d3cda90b8a729cd982548649d32803d265052/library/core/src/slice/index.rs:189:10
   4: core::slice::index::<impl core::ops::index::Index<I> for [T]>::index
             at /rustc/532d3cda90b8a729cd982548649d32803d265052/library/core/src/slice/index.rs:15:9
   5: <alloc::vec::Vec<T,A> as core::ops::index::Index<I>>::index
             at /rustc/532d3cda90b8a729cd982548649d32803d265052/library/alloc/src/vec/mod.rs:2531:9
   6: git_url_parse::GitUrl::parse
             at /.../.cargo/registry/src/github.com-1ecc6299db9ec823/git-url-parse-0.4.0/src/lib.rs:238:39

The affected line is this one: https://github.com/tjtelan/git-url-parse-rs/blob/main/src/lib.rs#L238

After re-reading the docs, I wasn't able to determine whether this is expected behavior or a bug. My expectation was that GitUrl::parse() is safe to call with arbitrary URLs and it would return Ok(...) if it could detect a GitUrl and Err(...) otherwise. But perhaps the intended spec is instead "you are expected to call GitUrl::parse() only on URLs you reasonably believe are actually pointing to GitHub and similar sites" and my code is misusing that call when it attempts it on arbitrary URLs sourced from HackerNews.

Thanks again for building this crate and for looking into this!

obi1kenobi commented 2 years ago

The URL https://www.heliosleep.com/ also seems to trigger the same panic.

tjtelan commented 2 years ago

Closed by #21