egg-mode-rs / egg-mode

a twitter api crate for rust
https://crates.io/crates/egg-mode
Mozilla Public License 2.0

codepoints_to_bytes is buggy #38

Closed by adwhit 6 years ago

adwhit commented 6 years ago

Hi, thanks for this library. I recently jumped into the codebase, hoping to add a few more features.

Now, I might be misunderstanding this, but it appears that the codepoints_to_bytes function is wrong. Here is a failing test I put at the bottom of common/mod.rs:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_codepoints_to_bytes() {
        let unicode = "frônt Iñtërnâtiônàližætiøn ënd";
        // Suppose we want to slice out the middle word: the string has
        // 30 codepoints, of which we want the middle 20 (codepoints 6..26).
        let mut range = (6, 26);
        codepoints_to_bytes(&mut range, unicode);
        assert_eq!(&unicode[range.0..range.1], "Iñtërnâtiônàližætiøn");
    }
}

The problem is that *start is mutated in place over and over again, pushing it way past where it should stop. The same is true of *end, although that doesn't show in this test.
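To illustrate what I would expect instead, here is a minimal sketch of a conversion that avoids the repeated in-place mutation. This is not the crate's code and not a proposed patch as-is: the name codepoints_to_bytes_sketch is made up, and I am assuming the same (usize, usize) tuple and &str arguments used in the test above. It copies the codepoint offsets up front and writes each byte offset once.

```rust
/// Hypothetical stand-in for `codepoints_to_bytes` (not the library's code).
/// Converts a (start, end) pair of codepoint offsets into byte offsets
/// into `text`, so the result can be used to slice the string.
fn codepoints_to_bytes_sketch(range: &mut (usize, usize), text: &str) {
    // Copy the codepoint offsets so the comparisons below never see
    // a value that has already been converted to a byte offset.
    let (cp_start, cp_end) = *range;

    // Offsets at or past the last codepoint map to the end of the string.
    let mut byte_start = text.len();
    let mut byte_end = text.len();

    // `char_indices` yields (byte_offset, char); `enumerate` adds the
    // codepoint index, so we can match each offset exactly once.
    for (cp_idx, (byte_idx, _)) in text.char_indices().enumerate() {
        if cp_idx == cp_start {
            byte_start = byte_idx;
        }
        if cp_idx == cp_end {
            byte_end = byte_idx;
            break;
        }
    }

    *range = (byte_start, byte_end);
}
```

With this version, the range (6, 26) from the test above slices out exactly the middle word, and an end offset equal to the number of codepoints maps to the end of the string.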

The corresponding assertion at the end of parse_basic is also wrong. It picks out the whole string, but I think it should not include the hyperlink at the end of the text.