flying-sheep / rust-rst

a reStructuredText parser and renderer
https://crates.io/crates/rst
Apache License 2.0
98 stars 10 forks source link

last line must be allowed to start with `.. ` #30

Closed getreu closed 10 months ago

getreu commented 3 years ago

I included your renderer to my tp-note project. Thank you for sharing this. I noticed that the file does wrongly not render, when the last line starts with ... for example:

.. comment

or

.. _link: https://getreu.net

When I add some regular text behind, it renders again.

getreu commented 3 years ago

BTW: I mentioned your crate in a blog post: Jens Getreu's blog - Tp-Note learns RestructuredText

flying-sheep commented 3 years ago

Oh thank you!

About your test case: I think the problem was the lack of a trailing newline: rust-rst’s README also ends with a link target and parses fine.

I think #31 should fix it for my binary. You can fix it for your crate in a similar way.

Maybe I should throw an error in the parser if there’s no trailing newline? Or really fix this by replacing NEWLINE in the grammar with newline = _{ NEWLINE | EOI }

getreu commented 3 years ago

About your test case: I think the problem was the lack of a trailing newline

Yes, you are right. I observed the following:

It parses when:

  1. the last line starts with .. ended by one or more \n,
  2. the last line starts with regular text ended by zero or one \n.

It does not parse when:

  1. the first line is just \n,
  2. the last line starts with .. without trailing \n.
  3. the last line starts with regular text ended by more than one \n.

My current workaround is:

// To add a newline at the end, we need to copy here. No other choice.
let mut rest_input = rest_input.trim().to_string();
rest_input.push('\n');
let document = parse(rest_input.as_str()).map_err(|e| anyhow!(e))?;

The above to_string() followed by push() implies memory allocation and copying.

The following would avoid this and fix use case no. 4.:

Or really fix this by replacing NEWLINE in the grammar with newline = _{ NEWLINE | EOI }

To fix use case no. 3 and no. 5., I could just do:

let document = parse(rest_input.trim().map_err(|e| anyhow!(e))?;

Or, together with #32, it would look like:

let document = parse(rest_input.trim())?;

Or, if you'd allow use case no. 3 and no. 5 at your side, then it would even look like this:

let document = parse(rest_input)?;

If I could avoid trim() at my side, the user would get error messages with correct line numbers.

Resume

  1. This would help a lot:

    Or really fix this by replacing NEWLINE in the grammar with newline = _{ NEWLINE | EOI }

  2. For my use case it would be the best, if your crate allowed optional white space before and after the text input. This way the user gets correct error line numbers.
flying-sheep commented 3 years ago

Hmm, I tried doing sed -i 's/NEWLINE/nl/g' rst.pest and then adding this:

nl = _{ NEWLINE | &EOI }

But for some reason the tests then loop infinitely. Any idea why?

getreu commented 3 years ago

Thank you. Unfortunately, I am not familiar with with this parser. I am afraid I can not help. Maybe try to execute step by step with a debugger to identify the loop. This sometimes gives a hint.