servo / html5ever

High-performance browser-grade HTML5 parser
Other
2.14k stars 222 forks source link

Convert BufferQueue to use Interior Mutability #542

Closed Taym95 closed 3 months ago

Taym95 commented 4 months ago

Part of https://github.com/servo/servo/issues/32454

from @jdm comment

Taym95 commented 4 months ago

These changes make sense for BufferQueue's internals, but we should also be able to update the usages of &mut BufferQueue in the rest of the library. We can then verify that the panic from the testcase in servo/servo#32454 (comment) should not occur.

I tried to do that but Iterator is issue here, next(&mut sel)

impl Iterator for BufferQueue {
    type Item = char;

    /// Get the next character if one is available, removing it from the queue.
    ///
    /// This function manages the buffers, removing them as they become empty.
    fn next(&mut self) -> Option<char> {
        let (result, now_empty) = match self.buffers.borrow_mut().front_mut() {
            None => (None, false),
            Some(buf) => {
                let c = buf.pop_front_char().expect("empty buffer in queue");
                (Some(c), buf.is_empty())
            },
        };

        if now_empty {
            self.buffers.borrow_mut().pop_front();
        }

        result
    }
}

and it is neeed 2 places:

    fn eat(
        &mut self,
        input: & BufferQueue,
        pat: &str,
        eq: fn(&u8, &u8) -> bool,
    ) -> Option<bool> {
        if self.ignore_lf {
            self.ignore_lf = false;
            if self.peek(input) == Some('\n') {
                self.discard_char(input);
            }
        }

        input.push_front(mem::take(&mut self.temp_buf));
        match input.eat(pat, eq) {
            None if self.at_eof => Some(false),
            None => {
                self.temp_buf.extend(input);
                None
            },
            Some(matched) => Some(matched),
        }
    }
```
    fn eat(&mut self, input: & BufferQueue, pat: &str) -> Option<bool> {
    input.push_front(replace(&mut self.temp_buf, StrTendril::new()));
    match input.eat(pat, u8::eq_ignore_ascii_case) {
        None if self.at_eof => Some(false),
        None => {
            self.temp_buf.extend(input);
            None
        },
        Some(matched) => Some(matched),
    }
}
``
jdm commented 3 months ago

Sorry, I missed your last comment when you wrote it. What's the code that actually uses BufferQueue as an iterator?

jdm commented 3 months ago

Ah, I see:

error[E0277]: `&BufferQueue` is not an iterator
   --> html5ever/src/tokenizer/mod.rs:340:51
    |
340 |                 self.temp_buf.borrow_mut().extend(input);
    |                                            ------ ^^^^^ `&BufferQueue` is not an iterator
    |                                            |
    |                                            required by a bound introduced by this call
    |
    = help: the trait `Iterator` is not implemented for `&BufferQueue`, which is required by `&BufferQueue: IntoIterator`
    = note: `Iterator` is implemented for `&mut BufferQueue`, but not for `&BufferQueue`
    = note: required for `&BufferQueue` to implement `IntoIterator`
note: required by a bound in `extend`
   --> /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/iter/traits/collect.rs:448:5
help: consider dereferencing here
    |
340 |                 self.temp_buf.borrow_mut().extend(*input);
    |                                                   +

error[E0596]: cannot borrow `*input` as mutable, as it is behind a `&` reference
   --> html5ever/src/tokenizer/mod.rs:218:21
    |
218 |                     input.next();
    |                     ^^^^^ `input` is a `&` reference, so the data it refers to cannot be borrowed as mutable
    |
help: consider changing this to be a mutable reference
    |
210 |     pub fn feed(&self, input: &mut BufferQueue) -> TokenizerResult<Sink::Handle> {
    |                                +++
Taym95 commented 3 months ago

Sorry, I missed your last comment when you wrote it. What's the code that actually uses BufferQueue as an iterator?

Sorry 🙏 , I am not getting notification from this repo, yes the code you shared! any ideas?

jdm commented 3 months ago

I suspect we'll need to use my idea of accepting an argument that can obtain a mutable reference on demand. We might want to try an enum that is either &mut or RefCell and add a method like: fn mutate<T, F: FnMut(&mut BufferQueue) -> T>(f: F) -> T which will call the provider function with the needed mutable reference.

jdm commented 3 months ago

Actually, I think the easiest solution is to extract the next() method from the impl Iterator for BufferQueue so it's just a normal method (and can therefore be &self instead of &mut self). Then there are only two places in the code that actually try to use it as an iterator, and they can be rewritten.

jdm commented 3 months ago

Thanks! I'll pull the latest change into my branch and run tests with it.

jdm commented 3 months ago

Verified that these changes are correct via https://github.com/servo/html5ever/pull/548 and https://github.com/servo/servo/pull/32820, so let's merge this.