fix some emphasis parsing

tatchi commented 2 years ago

Not ready for review yet. Opening it as a draft to give an overview of what I'm currently working on.

tatchi commented 2 years ago

Okay, I think I managed to fix all the broken tests mentioned in #244. At least tests are passing now and nothing else broke 😁

It's the first time I'm working on a parser, and that was not easy 🥵😅

Based on the expected outcome, I first tried to guess what should be the solution until I told myself that it should probably be a good idea to read the spec. First time that I'm reading a spec (or rather a part of it), and that was very helpful! 😁

There are 3 main commits:

https://github.com/ocaml/omd/pull/269/commits/b52495aff1004bdc7eb2ce52a779fa0b9014591e that fixes 410,411, 414 and 428
https://github.com/ocaml/omd/pull/269/commits/cceeb2c0b362fe1c3aee76345d8d455d0d9a2ce8 that fixes 415 and 416
https://github.com/ocaml/omd/pull/269/commits/7e6a712d473a2453f5f70333cf3ff9eb3c3c2a4e that fixes 468 and 469

I'm not very happy with the code of 1 and 3 so I'll see if I come up with something better. I'm planning on creating a separate PR for each of them so we can have a dedicated discussion for each. The fix is different anyway so I think it makes sense.

sonologico commented 2 years ago

This is great. Thank you. Splitting PRs would be good.

shonfeder commented 2 years ago

Yeah, this is really great work!

I agree that refactoring it into independent PRs could help keep things very clean.

One heads up is should we be aware of the potential with conflicts with #266. I'll work on trying to get that merged in the next day or two, which should help clear up that risk one way or another :)

shonfeder commented 2 years ago

I would be fine with review this from the current PR. While I appreciate the intention to keep the PRs clean and focused, this group of spec violations was already grouped into a single issue, and the changeset is small enough here, and close enough related, that I think it makes sense to review in one PR.

tatchi commented 2 years ago

Thanks for the review! 😊 I would have preferred to split that PR into 3 smaller ones, but now that you started the review process, let's continue here :)

I will address your comments and try to do my best to document my changes.

tatchi commented 2 years ago

I should have addressed your comments. I also added some comments to document the code I added. That wasn't super easy to explain so hopefully it's more or less understandable 😅

I'm wondering if the functions find_next_emph I introduced are ideal. Instead of looking at the "future" (i.e what we haven't read yet) to make a decision, I'm wondering if we couldn't continue to read in any case but store some info in the acc (i.e what was the last delimiter we read) to make a decision later on.

I'm not sure if that makes sense and if that's feasible. I'll try and see how far I can go.

shonfeder commented 2 years ago

I would have preferred to split that PR into 3 smaller ones, but now that you started the review process, let's continue here :)

I'm sorry about that! I didn't mean to push you into doing just one PR. I'll refrain from doing a review before you indicate you're ready in the future.

As soon as you are happy with this one, please convert it from a draft PR. All that's left to be done there is to add an entry into the changelog (I'll do that if you haven't beaten me to it by the time this is converted from a draft).

Thanks again for your excellent work here on fixing the parser! :)

shonfeder commented 2 years ago

Hi, @tatchi! Just checking here. If you're happy with the state of things, I'd be happy to make a small add here to update the changelog and merge this in. But I don't want to step on your toes in case you had been planning to do more work on this.

tatchi commented 2 years ago

Hi, @tatchi! Just checking here. If you're happy with the state of things, I'd be happy to make a small add here to update the changelog and merge this in. But I don't want to step on your toes in case you had been planning to do more work on this.

Hi @shonfeder, everything looks good to me except these two find_next_* functions that I introduced. I'm actually not sure if that matters much, but I have the feeling that it would be better not to use such functions that look ahead but instead store info about what we've parsed so far to make future decisions.

I'm not sure if that will be feasible but I would like at least to give it a try. Good news is that I'll have time to look at it this week, so expect an outcome by the end of the week 😁

tatchi commented 2 years ago

I discovered an emphasis parsing bug that was not covered by the conformance tests. I added an extra test in https://github.com/ocaml/omd/pull/269/commits/46d6a612b497c8653548a803198d1f215cc6f628 with the fix in https://github.com/ocaml/omd/pull/269/commits/6e17dd3211cbe4b56bd4f7be9ea26a7c274e0e89

For the rest, I haven't found a way to remove these two find_next_* functions, but I don't think it's important.

I marked the PR as ready for review so it can be merged now 😁

shonfeder commented 2 years ago

Many thanks for the great work here, @tatchi!

ocaml / omd

fix some emphasis parsing #269