servo / unicode-bidi

Implementation of the Unicode Bidirection Algorithm in Rust
Other
77 stars 33 forks source link

Add Conformance Test and implement L1 rule #30

Closed behnam closed 7 years ago

behnam commented 7 years ago

Add conformance tests using BidiTest.txt

Initial results: 60494 test cases failed! (196253 passed)

Because of the current limitations of rust's test framework, and the huge number of failures, the base (not including matching pairs) conformance test is executed in one run, and a summary is panic'ed (if there's any failure) and for the integration test to pass, it's marked as should_panic, with summary of the test run as expected string.

Fix #1

Implement L1 rule

To be able to implement L1, we need access to more information from BidiInfo, namely original_classes of the text, in visual_runs(), which would mean it should pass through reorder_line().

The fact that information from BidiInfo is needed for both steps of the public API (generating BidiInfo and consuming it per-paragraph/per-level) made me change the API design and move these methods into impl BidiInfo.

Then, since we needed access to text for every BidiInfo consumption, I added a reference to text to BidiInfo, which also enables more compile-time checks for BidiInfo isntance not outliving the text in the user code.

NOTE: We are already breaking API in version 0.3.0 and paving for full spec support is a good reason to do so, IMHO.

The L1 rule works by one pass on the text of the line.

Conformance Test: this implementation reduces the number of failures from 60494 to 23770 (out of total 256747 cases).

Fix #2


This change is Reviewable

behnam commented 7 years ago

Actually, found a bug in my L1 code that just fixed, updating the conf-test results: 12827 test cases failed! (243920 passed)

behnam commented 7 years ago

Dependents status:

mbrubeck commented 7 years ago

@bors-servo r+

This is great, thanks! Do you have any more changes planned before we publish version 0.3.0?

bors-servo commented 7 years ago

:pushpin: Commit 6427532 has been approved by mbrubeck

bors-servo commented 7 years ago

:hourglass: Testing commit 6427532127a71110bc11698bf07e4968e7e924ac with merge 0fa0cfe5d44be953b3c0affbb8b28344045d78ff...

bors-servo commented 7 years ago

:sunny: Test successful - status-travis Approved by: mbrubeck Pushing 0fa0cfe5d44be953b3c0affbb8b28344045d78ff to master...

behnam commented 7 years ago

Thanks for the review, @mbrubeck. Here's what I have in mind right now:

  1. Make sure serde works and servo builds properly.
  2. Working on the left over rules and try to reach 100% conformance for the base test.
  3. Then add support for paired brackets tests and the rules.

I think I need a little help with the serde support, as looks like the current servo repo depends on serde 0.9. I have a logically backward-compatible serde 1.0 implementation with tests right now (working on the 0.9 version this week), but then I realized that we may not need special implementation and derive(Serialize, Deserialize) could actually be enough.

So, the question is, are there any cases where a serde data from unicode-bidi < 0.3 would be passed on to a serde deserializer of unicode-bidi >= 0.3?

Right now, I'm organizing serde implementation as two features: with_serde0 and with_serde1, but they won't be able to be enabled at the same time. So, do we need to be able to build unicode-bidi with support for both ~1.0 and <1.0 at the same time?

Also, I can see that https://github.com/serde-rs/legacy and relevant crates enable support for multiple versions of serde being linked, but I'm not sure how to do a similar thing for the serde_derive macros.

So, in short: if we only need either with_serde0 or with_serde1, and there's no need for cross-unicode-bidi-version compatibility, I can condition them into the library with short and easy derive attributes. Otherwise, I would need more details on the expectation.

mbrubeck commented 7 years ago

So, the question is, are there any cases where a serde data from unicode-bidi < 0.3 would be passed on to a serde deserializer of unicode-bidi >= 0.3?

No, I don't think we need to support this.