servo / unicode-bidi

Implementation of the Unicode Bidirection Algorithm in Rust
Other
77 stars 33 forks source link

Use struct/impl for Bidi Level #27

Closed behnam closed 7 years ago

behnam commented 7 years ago

It's easy to review the changes per commit. Here's the overall summary:

1) Move the already-defined modules into their own files and put test functions into relevant modules, and simplify use calls for test submodules with use super::*.

2) Create a zero-cost abstraction for Embedding Level using single-value tuple struct, and move inquiry/mutation/helper functions there. Then add debug_assert!() checks on mutations to help with validating internal states. (More to be added into the algorithms when I get there.) get_bidi_class 3) Update character data table creation script, move download location out of src/, and move out the non-table part of the module into its own file with its own test submodule, and removed public access to data tables, as they should be considered internal detail and get_bidi_class() is all needed on the algorithm and public API level.

4) Add .rustfmt.toml and use rustfmt for all files in the project. (The data tables file is exempted with #[cfg_attr(rustfmt, rustfmt_skip)].

5) No logic changes in the algorithm and only data-type and formatting changes in the non-data modules: explicit, prepare and implicit.


This change is Reviewable

emilio commented 7 years ago

r? @mbrubeck

behnam commented 7 years ago

I'm not sure what's the best way to test a servo build with this update (for the small API changes, specially some stuff going private) before merging. (I'm totally new to servo.) Would appreciate any pointers to test more and update before a deep review.

jdm commented 7 years ago

Building Servo locally and overriding the unicode-bidi dependency would be your best bet.

behnam commented 7 years ago

Thanks for the review, @mbrubeck. I've address your feedback.

I'm working on a servo patch to go with this, and will land this after I get that out and accepted. I'll add you as a reviewer there.

behnam commented 7 years ago

Regarding last commit: https://github.com/servo/unicode-bidi/pull/27/commits/f78bd0a5521e86fd992aecd414dc7504a7781351

RFC!

The reason to do this is to use this struct in the Line struct under servo/components/layout/, which is Serialize-able already and uses u8 at the moment.

So, do you think this big dependency worth it by itself?

If not, how about have conditional #derive with module availablity, so whenever serve+serde_derive are available, also make it Serialize-able, otherwise not.

Or, should use serde's "remote derive" in servo/components/layout/ and not bother about it here?

emilio commented 7 years ago

Looks sensible to me, though perhaps we should derive it on more stuff other people may find useful?

behnam commented 7 years ago

Right, @emilio. I'm updating dependent libraries now and find what are the common use cases to cover.

And while we're on that, I noticed that the name Level is not descriptive enough in other context, so I'm using Level as BidiLevel in almost all places. I think I saw an RFC that was accepted a few month ago recommended this method, instead of prefixing terms in the source library. But, on the other hand, we already have a bunch of public labels prefixed with Bidi.

So, what do you think about the naming? Should we keep it simple, as the RFC recommends, or for the pub use convert them to something more useful for the callers?

nox commented 7 years ago

Let's not depend on serde here, and instead serialize the level as a bare u8.

nox commented 7 years ago

I don't understand the rationale with making the single field of Level private, but still keeping the is_valid method, or making the other methods panics in debug mode.

Seems to me like the raise methods and whatnot should return Result<Self, ()>.

Could this PR be split in multiple ones? Seems like a lot of unrelated changes and I would like each set of these unrelated changes to actually go through CI, not all at the same time.

Also, the version bump commit is in the middle of the PR, which also looks wrong to me.

behnam commented 7 years ago

Thanks for the feedback! So, to make things easier to move forward, I'm going to submit another PR with all the good stuff here except the Level struct, which is the big API change here. Do you prefer a new separate PR for the Level, too, or should I force-push a new stack to this PR?

About Level itself, yeah, looks like I can improve the design and have better invariants for it.

Let me finish these two steps and get back to you.

One more general question, though: about naming functions/methods: I noticed that most of servo code-base prefers very short function names, like bidi_class() instead of get_bidi_class(). Is there document/write-up about this common practice?

behnam commented 7 years ago

Regarding serde for Level, I think we better follow the common pattern that exist for other abstractions, like url, which comes with a url_serde crate (https://github.com/servo/rust-url/tree/master/url_serde) that has the serde dependencies and serialize_with/deserialize_with implementations.

Then, that, and returning Result<Self, Error> from a new pub fn from_number() will make sure that deserialized values are valid, or an error is returned.

behnam commented 7 years ago

And here's my new diff that implements Level as an always-valid value and failing new/mutation methods: https://github.com/behnam/rust-unicode-bidi/pull/2

bors-servo commented 7 years ago

:umbrella: The latest upstream changes (presumably #28) made this pull request unmergeable. Please resolve the merge conflicts.