dnsl48 / fraction

[Rust] Lossless fractions and decimals; drop-in float replacement
Apache License 2.0
83 stars 25 forks source link

Add better unicode support (Fraction slash/division slash) #100

Closed feefladder closed 9 months ago

feefladder commented 10 months ago

Since Rust supports full Unicode and because of the name of this crate, I think the fraction slash '⁄' is a better delimiter than the solidus '/' for display purposes. In any case, it should not be an error in parsing to use:

Here is a discussion I started on Rust internals. and here is a SO unicode discussion on fractions.

dnsl48 commented 10 months ago

Hi @feefladder, thanks for raising this.

Firstly, I don't believe we can simply replace ASCII with a Unicode character, because it will break backward compatibility for some existing integrations (e.g. serde serialization and Juniper integration are relying on the current behaviour. We would have to at least consider that behaviour opt-in via features or as an extension on the existing formatting flags, or via a wrapper object with overloaded format behaviour.

Secondly, I am trying to understand better the problem we are trying to solve. Semantics of the fraction expressed with unicode is not universally better than with ASCII, since the latter has better interoperability when passing serialized data around. If we start talking about more specialised human-facing environments (UIs), then not all of them implement unicode support either. Especially if we talk about IoT and devices with limited hardware capabilities. I would argue there is probably not enough real life value for switching to Unicode behaviour as the default.

feefladder commented 10 months ago

Hi @dnsl48

firstly: I agree on not breaking backward compatibility, maybe I should've made that more clear. I raised the formatting flags issue on rust internals, but there they didn't want to add this as an option, since it didn't allow for localization. They suggested using -> impl Display. e.g. something like

impl GenericFraction<T> {
  fn unicode() -> impl Display {
    todo!()
  }
}

fn main {
  let f = GenericFraction<u64>::new(1,2);
  println!("{} cup of m🩸lk", f.unicode());
}
// 1⁄2 cup of m🩸lk

would work and be intuitive for users.

Secondly: The issue was initially about the parsing, rather than the printing. e.g. plz allow me to parse a 1⁄2, since my user likes to type in unicode. I think allowing for more strings to be parsed should not be backwards incompatible. Only if someone depends on Fraction::FromStr("1⁄2").is_err(), this would break their program.

Per the docs, the Display trait is for user-facing environments. In the discussion, I asked for another trait to specifically specify that Display is also meant to be parsable, but this is the ToString that is automatically derived from a no-option display.

Will update my PR!

dnsl48 commented 10 months ago

hi @feefladder

They suggested using -> impl Display. e.g. something like

Yes, that seems to be a good middle ground. Won't break backward compatibility and may be used by those who want it. I would rather suggest us to have a new type wrapper, something like struct UnicodeFractionDisplay(GenericFraction<...>), that would implement the relevant display behaviour and from/into for fractions and/or decimals.

allowing for more strings to be parsed should not be backwards incompatible.

Agree. That, however, affects performance. Although in a negligible way, but in principle it would add more logical complexity and extra CPU instructions for every fraction parsing. Would probably also belong to the UnicodeFractionDisplay and its FromStr implementation?

feefladder commented 10 months ago

I asked on SO, and here's a good implementation I think: https://stackoverflow.com/a/77908229/14681457 And then put it into a different file like: https://stackoverflow.com/a/63378744/14681457

sidenote: There's also manyfmt crate, which allows for mutating the Formatter. I think it would make formatting code more concise and easier to understand for newcomers like me :)

feefladder commented 10 months ago

@dnsl48 PR is ready for review. If wanted, I can split the commits into branches.

feefladder commented 10 months ago

I'm not so much a fan of the parsing returning a struct. and unicode fraction being different from GenericFraction. Like, Decimal and Fraction - although convertible - are different concepts. "A fraction that can be parsed and printed to different unicode representations" is - to me - the same as a fraction. That's why I think unicode io should be part of the Fraction struct.

As for the parsing speed, I've made it accept many different forms. I think any sane program would not have its bottleneck in parsing human-readable fractions. That's user input that then does expensive calculations. For programs that do, there is the normal FromStr anyways.

I did also add display_mixed methods, e.g. 1¹/₂ or 1⁤1⁄2 which is indeed different and could be its own struct. Calculations on mixed fractions are also different, since there is an integer part. This could be beneficial for large numbers where otherwise the numerator would overflow.

dnsl48 commented 9 months ago

Thank you, that is all reasonable. I have left some feedback on the PR. We should probably also keep that logic behind a feature flag, since not all projects require unicode support for using fractions (especially in the lower level applications where it only needs to be serialized, but not displayed). Having it behind a feature flag would help keeping the binary size smaller by default. I suggest we could have a flag with-unicode for this.

dnsl48 commented 9 months ago

Released in 0.15.1