Closed: squ1dd13 closed this pull request 1 year ago
Thanks for all the feedback, I'm glad you're interested :D

Since there are several points to address regarding `SqrtAccuracy`, I'll just reply here rather than to the individual comments. The main rationale behind allowing reuse of `SqrtAccuracy` objects was that, since the `multiplier` is heap-allocated, there might be cases where performing several `sqrt_*` calls could avoid heap-allocating a new `multiplier` on each call. After having another look, it seems that since `multiplier` gets cloned anyway, this (premature) optimisation is unnecessary. I could benchmark it, but I suspect it's worth very little compared to other optimisations that might be possible within the main algorithm. If nothing else, removing it would simplify the API quite a lot.

I've made a bunch of minor tweaks (mainly internal), but I haven't done anything major to the API for now because I want to hear what you think about the above.

TL;DR: I'm all for getting rid of `SqrtAccuracy` entirely.
For fractions I think it's probably essential to force the user to recognise that they are only working with an approximation, hence the requirement for them to supply some sort of value specifying the accuracy they want the result to have. However, would it make sense for `GenericDecimal` to also provide a `sqrt` function, but perhaps one which uses the precision type to automatically pick an accuracy level?

This would be somewhat at odds with the current documentation, which states

> Calculations do not use precision

although the other part of this,

> but comparison does

is actually exactly what the implementation of `sqrt` is doing already. We use a truncated comparison between the square of our approximation of the root and the original value we were given to determine when to stop iterating:
```rust
// Stop and yield the current guess if it's close enough to the true value.
if squared_and_truncated == truncated_target {
    break SqrtApprox::Rational(current_approx);
}
```
Edit: I should probably clarify that I've never been 100% sure how to use `GenericDecimal` (vs. `GenericFraction`), so what I said above may be complete gibberish :P
@squ1dd13, I can't reply in threads to the posts in the general comment section, so I'll quote you in this reply to be more specific about things.
> Since there are several points to address regarding SqrtAccuracy, I'll just reply here rather than to the individual comments. The main rationale behind allowing reuse of SqrtAccuracy objects was that since the multiplier is heap-allocated, there might be some cases where performing several sqrt_* calls would be able to avoid heap-allocating a new multiplier on each call. After having another look it seems like since multiplier gets cloned anyway, this (premature) optimisation is unnecessary.

That makes a lot of sense. I agree that it sounds reasonable to pass a reference to a pre-allocated instance and not destroy it, so it can be reused. Also, I guess it could make sense to have a default lazy-static pre-allocated value as a fallback.
> but I suspect it's probably worth very little compared to other optimisations that might be possible within the main algorithm. If nothing else, this would simplify the API quite a lot.

Perhaps, though all possible optimisations have value regardless of their relative size. If they obscure the API too much, then we may need to think about adding abstractions to simplify that (good thing Rust supports zero-cost abstractions really well :)
> TL;DR I'm all for getting rid of SqrtAccuracy entirely.

I feel like that could just be a ref to `GenericInteger` in there.
> However, would it make sense for GenericDecimal to also provide a sqrt function, but perhaps one which uses the precision type to automatically pick an accuracy level?

`Decimal` is implied to be a thin wrapper around `Fraction`, which contains extra info about how we want the fraction to be represented in a decimal format. E.g.

```
(1/3, precision=3) == 0.333
(1/3, precision=4) == 0.3333
```

The comparison takes precision into account to follow the "string" way of comparing numbers (e.g. 1/3 == 1/3, but 0.333 != 0.3333, while both of those might be represented as the same fraction internally).
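To make that "string" comparison concrete, here is a tiny illustrative sketch (the `format_fraction` helper and the use of `f64` are mine, not the crate's API): the same fraction formatted at two precisions yields unequal decimal strings.

```rust
// Illustrative only: format num/den at a given decimal precision, mimicking
// the "same fraction, different precision => unequal decimals" behaviour.
fn format_fraction(num: u64, den: u64, precision: usize) -> String {
    format!("{:.*}", precision, num as f64 / den as f64)
}

fn main() {
    let a = format_fraction(1, 3, 3); // "0.333"
    let b = format_fraction(1, 3, 4); // "0.3333"
    // Internally both are 1/3, but compared "as decimals" they differ.
    assert_ne!(a, b);
    println!("{a} != {b}");
}
```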
Given that `sqrt` itself is not producing a formatted representation of the decimal, but rather a calculated output, `precision` may not always be the best value for the `sqrt` accuracy. However, it is hard to tell without a real example. I would say it's up to you whether you feel that is a good default or not.
> all possible optimisations have value regardless of their relative size

Fair point. In this regard it absolutely makes sense to keep it in.
I see the reasoning behind allowing any `GenericInteger` to be used rather than only specific `SqrtAccuracy` values, but I worry that this could lead to misunderstandings of what the value actually represents. It would be understandable for someone unfamiliar with the API to assume that they were being asked for the number of decimal places they want the result to be accurate to, which is emphatically not what `SqrtAccuracy::multiplier` is. The multiplier is 10^(number of decimal places).

Consider the following example, where `accuracy` now takes any `GenericInteger`:
```rust
let my_number: DynaFraction<u32> = DynaFraction::from_str("123456/654321").unwrap();

// 0.1886780341758861476247896674567987272302
println!("{my_number:.40}");

let root_a = my_number.sqrt_with_accuracy(40);

// The user expects this to be equal to `my_number` to 40 d.p.
let root_a_squared = &root_a * &root_a;

// 0.1886780341758861476247896674567988508748
// That's only 33 d.p.!
println!("{root_a_squared:.40}");

// Here's what the user should have done:
let root_b = my_number.sqrt_with_accuracy(BigUint::from(10_u8).pow(40));
// They could also have done `my_number.sqrt(40)` with the current API.
let root_b_squared = &root_b * &root_b;

// 0.1886780341758861476247896674567987272302
// 40 d.p., as expected.
println!("{root_b_squared:.40}");
```
(In the above, `root_b_squared` is actually equal to `my_number` to 67 decimal places, but only 40 are guaranteed by the `SqrtAccuracy` passed.)
Of course, it would be possible to change the API such that `foo.sqrt_with_accuracy(n)` produces a result whose square* is equal to `foo` to `n` decimal places, but that would just mean computing `BigUint::from(10_u8).pow(n)` inside `sqrt_with_accuracy`, which is what `SqrtAccuracy::new` is already doing, but outside the method so that the same value can be reused for multiple calls. This nuance is not ideal, but it's an unfortunate side effect of how we (currently) test values to see if our approximation is close enough.
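The "compute the power once, outside the method, so it can be reused" idea can be sketched as follows. This is a hypothetical helper, not the crate's code, and it uses `u128` instead of `BigUint` purely for illustration (so it would overflow beyond roughly 10^38):

```rust
// Hypothetical: turn "n decimal places" into the multiplier 10^n once,
// so repeated sqrt calls can share the same pre-computed value.
fn multiplier_for_decimal_places(n: u32) -> u128 {
    10u128.pow(n) // the real code uses BigUint, so arbitrary n is fine there
}

fn main() {
    let multiplier = multiplier_for_decimal_places(3);
    assert_eq!(multiplier, 1_000);
    // 0 decimal places means comparing only the integer parts.
    assert_eq!(multiplier_for_decimal_places(0), 1);
    println!("reusable multiplier computed once: {multiplier}");
}
```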
I don't like this strange behaviour – it would be much better if the accuracy referred to the actual square root value rather than to the square of the square root. That is, `x.sqrt(n)` should give an approximation of the square root of `x` that is accurate to `n` decimal places. Currently that same expression gives an approximation whose square* is accurate to `n` decimal places, which is confusing and weird. With some significant maths help from a friend I think it's possible to fix this, and I'll give it a go later.
It's probably worth mentioning that there are possible uses for non-power-of-10 `multiplier`s. In general, a `multiplier` of `a^n` means that the square of the result will be correct to at least `n` digits after the `a`-ary separator (i.e. the base-`a` equivalent of the decimal point) in the base-`a` representation of the result. For example, a multiplier of `2^40` means that if we square the approximated root and print it as a binary string, it will match the binary string representation of the original value for 40 digits after the `.`:
```rust
let x: DynaFraction<u32> = DynaFraction::from_str("123456/654321").unwrap();

// 0.0011000001001101001101000010001001000010
print_in_binary_to_40(&x);

let s = x.sqrt_with_accuracy(&SqrtAccuracy {
    multiplier: BigUint::from(2_u8).pow(40),
});
let sq = &s * &s;

// 0.0011000001001101001101000010001001000010
// Matches to >= 40 digits beyond the binary point.
print_in_binary_to_40(&sq);
```
Although I'm sure most people will only care about decimal, the fact that there is some meaning to using other bases for `accuracy` means there probably should be finer control given over the `multiplier` used.
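The base-`a` interpretation can be verified with small integers. This is a sketch only (a hypothetical `truncate` helper over `u128`, not the crate's `BigUint`-based code): truncating `1/3` with multiplier `2^10` keeps exactly ten binary digits after the binary point.

```rust
// Sketch: with multiplier a^n, truncation keeps n base-a digits after the
// radix point of num/den.
fn truncate(num: u128, den: u128, multiplier: u128) -> u128 {
    num * multiplier / den
}

fn main() {
    let m = 2u128.pow(10); // keep 10 binary digits after the binary point
    // 1/3 = 0.0101010101... in binary; its first 10 fractional bits are 0101010101.
    let t = truncate(1, 3, m);
    assert_eq!(format!("{t:010b}"), "0101010101");
    println!("binary truncation of 1/3 to 10 bits: 0.{t:010b}");
}
```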
I somehow managed to miss your previous mention of using `lazy_static` to preallocate certain accuracy levels – this sounds like a great idea.
At the moment I'm not sure about the idea of making `accuracy` an `Option` and choosing a default value when `None` is provided. I think implementing `Default` for `SqrtAccuracy` definitely makes sense, and then the user would be able to just do `foo.sqrt_with_accuracy(&SqrtAccuracy::default())` (or possibly without the reference, because I think I'll change `accuracy` to be `impl Borrow<SqrtAccuracy>` so that it can be used with either). This forces the user to interact with the `SqrtAccuracy` type, which should hopefully make clear that this is not a lossless operation, and that they must agree to some level of compromise between performance and accuracy. If we hide this behind `foo.sqrt_with_accuracy(None)`, the user doesn't have to acknowledge that compromise. Additionally, I feel (and this is pure speculation on my part) that most users of this crate will be concerned enough about accuracy that they would be able to make a reasonable decision about the level of accuracy they want, and we should perhaps be encouraging them to make this decision so that they can get the most out of the `sqrt_*` API for their specific needs.
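The `impl Borrow` idea can be sketched with simplified stand-in types (nothing below is the crate's real API; `Accuracy`, `Frac`, and the `f64`-based body are placeholders): the same method then accepts either an owned accuracy value or a reference to one being reused across calls.

```rust
use std::borrow::Borrow;

// Simplified stand-ins for the real types; signatures are illustrative only.
#[derive(Clone)]
struct Accuracy {
    multiplier: u128,
}

struct Frac(f64);

impl Frac {
    // `impl Borrow<Accuracy>` lets callers pass `Accuracy` or `&Accuracy`.
    fn sqrt_with_accuracy(&self, accuracy: impl Borrow<Accuracy>) -> f64 {
        let _multiplier = accuracy.borrow().multiplier; // would drive truncation
        self.0.sqrt()
    }
}

fn main() {
    let acc = Accuracy { multiplier: 10u128.pow(20) };
    let x = Frac(2.0);
    let by_ref = x.sqrt_with_accuracy(&acc); // borrowed: `acc` stays reusable
    let by_val = x.sqrt_with_accuracy(acc.clone()); // owned also works
    assert_eq!(by_ref, by_val);
    println!("both call styles accepted");
}
```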
Edit: I've made significant changes to the accuracy stuff; see the commit immediately after this message.
Thanks for your clear explanation of the decimal API – it makes a lot more sense to me now. I think it's probably best not to use the precision to pick a default accuracy, so I guess the decimal approximations API will most likely be (near-)identical to the fraction one.
Hi @squ1dd13,

Thank you for the detailed explanation. Yes, I agree that having a `SqrtAccuracy` type makes sense. I also agree that we can have `::default` instead of `None`. My thinking with the `Option` was to use a `lazy_static` value in the `None` case instead of allocating a new value for each call. Perhaps we could achieve something similar within the `SqrtAccuracy` internal logic. Not sure if that's worth pursuing though; I wouldn't mind accepting the patch without that optimisation.

I also feel that it might be important to capture the details from your post above in the documentation for the `SqrtAccuracy` structure and the `sqrt` function.
> My thinking about the Option was to use a lazy_static value in case of None instead of allocating a new value for each call. Perhaps, we could achieve something similar within the SqrtAccuracy internal logic.

Is this not covered by the behaviour of the `Accuracy` type? When `lazy_static` is enabled, the user can make use of the `Dp20`, `Dp100` and `Dp500` variants, which do not store a `BigUint` and instead use pre-allocated `lazy_static` `BigUint` values:
```rust
/// Returns a reference to the multiplier used by `self` to chop off irrelevant digits.
pub fn multiplier(&self) -> &BigUint {
    #[cfg(feature = "lazy_static")]
    {
        lazy_static! {
            static ref DP20_MUL: BigUint = BigUint::from(10_u8).pow(20_u32);
            static ref DP100_MUL: BigUint = BigUint::from(10_u8).pow(100_u32);
            static ref DP500_MUL: BigUint = BigUint::from(10_u8).pow(500_u32);
        }

        return match self {
            Accuracy::Dp20 => &DP20_MUL,
            Accuracy::Dp100 => &DP100_MUL,
            Accuracy::Dp500 => &DP500_MUL,
            Accuracy::Custom { multiplier } => multiplier,
        };
    }

    // -snip-

    let Accuracy::Custom { multiplier } = self else {
        // -snip-
    };

    multiplier
}
```
The current `Default` implementation for `Accuracy` is just `Dp100`, so if the user passes `Accuracy::default()`, there will be no allocation. If `lazy_static` is disabled, only the `Accuracy::Custom` variant is available[^1]. (This is better than leaving the `DpX` variants enabled and adapting `multiplier` to allocate a value for them on the fly, because `::Custom.multiplier()` doesn't need to allocate, but the `DpX` variants don't store a `BigUint` inline, so without `lazy_static` they would always have to allocate a new one when it is needed.)
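A minimal std-only model of this variant scheme may help illustrate why the presets never allocate per call. It uses `OnceLock` rather than `lazy_static` and `u128` rather than `BigUint`, purely to keep the sketch self-contained; it is not the crate's implementation.

```rust
use std::sync::OnceLock;

// Model: preset variants borrow a shared, lazily initialised static;
// Custom stores (and returns a reference to) its own value.
enum Accuracy {
    Dp20,
    Custom { multiplier: u128 },
}

impl Accuracy {
    fn multiplier(&self) -> &u128 {
        static DP20_MUL: OnceLock<u128> = OnceLock::new();
        match self {
            // Initialised once, then every call borrows the same value.
            Accuracy::Dp20 => DP20_MUL.get_or_init(|| 10u128.pow(20)),
            Accuracy::Custom { multiplier } => multiplier,
        }
    }
}

fn main() {
    assert_eq!(*Accuracy::Dp20.multiplier(), 10u128.pow(20));
    let custom = Accuracy::Custom { multiplier: 1_000 };
    assert_eq!(*custom.multiplier(), 1_000);
    println!("preset and custom multipliers resolve without per-call allocation");
}
```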
[^1]: This means `Accuracy::default` needs `lazy_static` at the moment. It might be more ergonomic to just change `default` so that it allocates a new `Accuracy::Custom` when `lazy_static` is off, but I'm not sure how comfortable I feel with hiding a heap allocation behind a `default` call.
> I also feel that might be important to capture the details from your above post in the documentation for the SqrtAccuracy structure and the sqrt function.

Absolutely. I'll write up full documentation for each of the `sqrt_*` methods and include this. As `SqrtAccuracy` has become the more general `Accuracy`, I think the specific way in which the `Accuracy` value passed is used should be up to the individual approximations implemented in the module, and not the documentation of the `Accuracy` type itself. (At the moment the only approximation is `sqrt`, but Newton's method could be used for fractional `pow` amongst other things. The reason I've only implemented `sqrt` so far is that it's ubiquitous enough that it makes sense to implement it by itself, to take advantage of the specific optimisations that it allows – that is, `halve_in_place_pos_rational` and any other specialised stuff.)
> Is this not covered by the behaviour of the Accuracy type? When lazy_static is enabled, the user can make use of the Dp20, Dp100 and Dp500 variants,

Yes, that looks good. I was writing that comment before reviewing the latest changes to the PR, sorry for the confusion.
> Accuracy::default needs lazy_static at the moment

Totally happy with that. There are no strong arguments for not having `lazy_static` as a dependency. We also have it required by the `with-bigint` feature anyway, and bigint is a hard requirement for the approx module. Thus, `lazy_static` is an indirect requirement of approx anyway.
> There are no strong arguments for not having lazy_static as a dependency. We also have it now required by the with-bigint feature anyway. And bigint is a hard requirement for the approx module. Thus, lazy_static is an indirect requirement of approx anyway.
I keep forgetting this :P I'll leave the explicit requirement for `lazy_static` on those variants in case we end up changing things to make `with-bigint` optional for `approx` in the future.
I think the PR is nearly done. I'll write some more tests so that we have all the special cases covered, and then write up the documentation for all of the methods. Please do say if there's anything that the docs leave unclear – I'm used to the inner workings of the module now so there's a chance I'll miss out bits that I forget might not be obvious.
Not directly related to this PR as such, but still potentially important: there are a couple of issues I noticed with `GenericFraction<T>` allowing signed `T`. I think the best summary of this is given by the implementation of `sqrt_with_accuracy_raw` as of yesterday, but in short: stuff like `GenericFraction::<T>::signum`, `::is_sign_positive`, `::is_sign_negative`, etc. cannot be trusted for signed `T`.

(On looking a second time, it would appear that this is not an issue for fractions made using `new_generic` etc., because there's code that explicitly removes the signs from the num/denom. Since `GenericFraction` can be created directly from a `Ratio`, this is still an issue, but it's not as bad as I thought.)
That's everything I can think of for now. I've made `sqrt` its own module under `approx` so that it has module-level documentation where all of the concepts (the whole accuracy thing, `raw`, `with_accuracy`, `abs`) can be discussed in one place. This makes documenting the individual methods much easier.

I plan to open at least one more PR with other approximations, so it would make sense for those to then be their own modules within `approx`. `Accuracy` is a child of `approx`, so it would be available to all of the approximation modules.
> Not directly related to this PR as such but still potentially important is that there are a couple of issues I noticed with GenericFraction allowing signed T. I think the best summary of this is given by the implementation of sqrt_with_accuracy_raw as of yesterday, but in short: stuff like GenericFraction::<T>::signum, ::is_sign_positive, ::is_sign_negative, etc. cannot be trusted for signed T.

This is true: negative numbers within the `Ratio` data (either num or denom) would lead to destructive consequences in many scenarios. This is partially mitigated because the `GenericFraction` implementation keeps the sign out as a separate attribute: `enum ... Rational(Sign, Ratio<T>)`. Because of that, keeping signs within the ratio becomes redundant and unnecessary. However, in some edge cases we can't statically validate that the code doesn't put negative numbers in when `T` is signed.
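A simplified model of that sign-outside-the-ratio design (the types and the constructor below are illustrative stand-ins, not the crate's actual code): the sign lives in its own field, and a `new`-style constructor normalises signed inputs on the way in, so the num/den stay non-negative through that path.

```rust
// Sketch: sign kept separately from an always-non-negative ratio.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Sign {
    Plus,
    Minus,
}

struct Fraction {
    sign: Sign,
    num: u64, // non-negative by construction
    den: u64,
}

impl Fraction {
    // Normalising constructor: the invariant can't be broken through this path.
    fn new(num: i64, den: i64) -> Self {
        let sign = if (num < 0) != (den < 0) { Sign::Minus } else { Sign::Plus };
        Fraction {
            sign,
            num: num.unsigned_abs(),
            den: den.unsigned_abs(),
        }
    }

    // With the invariant held, the sign query is trivially trustworthy.
    fn is_sign_negative(&self) -> bool {
        self.sign == Sign::Minus
    }
}

fn main() {
    assert!(Fraction::new(-3, 4).is_sign_negative());
    assert!(!Fraction::new(-3, -4).is_sign_negative()); // signs cancel
    println!("sign stored separately from the ratio");
}
```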
That's unfortunate, but removing support for signed `T` feels like overkill; it would only tackle hypothetical problems where developers control the object construction manually (hence the `new` + `new_neg` and `new_raw` + `new_raw_signed` methods allowing sign control).
Perhaps, we could improve some APIs there to make it transparently safer with negative numbers.
- unsigned `T` is considered to be the main scenario, and this is not a problem
- signed `T` is only supported for flexibility, but is also safe if users control the sign on construction (choosing the correct `new`/`new_neg` method)

That running time had me confused for a minute :D I forgot to say at any point that the algorithm is only remotely "fast" when compiler optimisations are turned on, so I've been testing with `--release`. Indeed, it takes around 220s to run `cargo test fraction::approx --features="with-approx"` on my machine; compare this to ~12s to run the same with `--release` enabled[^1].
[^1]: Out of interest, I also tried testing with `RUSTFLAGS="-C target-cpu=native"`, unrealistic as that may be. It reduces the time further, down to just under 9s, which I think is just incredible. We're calculating five different square roots to 10, 100, 1k, 10k and 100k decimal places, using numbers represented as `Vec`s, and it's capable of executing in that time. `rustc` is a fantastic thing...
This difference is absolutely not ideal, because it's likely to catch users out. RFC 2412 should help with this, because we would be able to force speed optimisation specifically for the `sqrt` module when the user doesn't supply any other optimisation parameters. For now, I guess the only solution for this aspect of the problem is to add a note to the documentation.
In terms of the testing time, would we be able to test with optimisations enabled? All we'd need is this:

```toml
[profile.test]
opt-level = 3
```

Alternatively, we could test using `--release`, but that seems unnecessary and also has the issue of disabling debug assertions etc.
For now I'll add a note to the documentation and change the build config to optimise tests, but I'm open to reversing those changes and discussing other options.
Oops, forgot to check that. I've left the benches in, but they'll now only run when `with-approx` is enabled.
Released with 0.14.0
Great, thanks!
I recently found myself needing `sqrt` while using this crate, and after reading #60 I decided to have a go at implementing it myself. The implementation is quite long because it's the result of quite a lot of benchmarking to try and get decent performance, so I've added a new `approx` module for it. The API isn't very polished yet, but the actual algorithms work fairly well and are able to produce very accurate rational approximations for square roots in relatively little time. I thought I'd just get an extra opinion on it before I spend time perfecting the API, which is why I've only submitted a draft PR.

Let me know if you think this is worth finishing off. Cheers :)