metosin / malli

High-performance data-driven data specification library for Clojure/Script.
Eclipse Public License 2.0
1.43k stars 204 forks source link

Add `:float` support #1055

Closed seanstrom closed 3 days ago

seanstrom commented 1 month ago

Summary

Notes

frenchy64 commented 1 month ago

Oh I see, I misread the docs. Do we even need to specify the bounds on cljs since it looks like the same as double? -------- Original Message -------- On May 17, 2024, 05:44, Sean Hagstrom wrote:

@seanstrom commented on this pull request.


In src/malli/generator.cljc:

@@ -454,6 +454,20 @@ {:infinite? (get props :gen/infinite? false) :NaN? (get props :gen/NaN? false)}) (-min-max schema options)))) +(defmethod -schema-generator :float [schema options]

  • (let [max-float #?(:clj Float/MAX_VALUE :cljs (.-MAX_VALUE js/Number))
  • min-float #?(:clj Float/MIN_VALUE :cljs (.-MIN_VALUE js/Number))

Do we want the min-float to be a negative number? Because I think (.-MIN_VALUE js/Number) would be the smallest positive number.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>

seanstrom commented 1 month ago

Do we even need to specify the bounds on cljs since it looks like the same as double?

Yeah good point, we probably don't need to do it, so I've changed it to only set the bounds for the clj and allow cljs to behave as doubles. I've also made sure to allow for infinity to be generated for cljs too.

ikitommi commented 3 weeks ago

Would like to merge this, but not 100% sure if everything is resolved. What do you think @frenchy64 ?

frenchy64 commented 3 weeks ago

IMO it's a bit inconsistent with the rest of malli that the bounds are clamped on behalf of the user in the generator. Otherwise the rest looks good.

seanstrom commented 3 weeks ago

Happy to make more changes to this PR, though I'm not sure if I understand the suggestions. I mentioned in an earlier comment that there was a reason to clamp the bounds by default when generating floats. The main reason is that calling (fmap float) after using gen/double* can accidentally create exceptions when coercing the 64-bit double into a 32-bit float.

Here's an example error message:

Exception: java.lang.IllegalArgumentException: Value out of range for float: -2.631375492124477E72

To avoid these exceptions I've included the default clamping of the bounds, because it would seem unstable to accidentally generate exceptions in test suites. There may be another way of avoiding exceptions, if so let's discuss some alternatives.

cc @ikitommi @frenchy64

seanstrom commented 3 weeks ago

One thing to note, I've recently seen this documentation on MDN about creating 32-bit floats with Math.fround in JavaScript. Some interesting things came to mind that could help:

seanstrom commented 3 weeks ago

Based on some Java documentation, it seems we can use [the .floatValue method on a double in Java](https://developer.android.com/reference/java/lang/Double#floatValue()). This seems somewhat similar to Math.fround because it reduces the data of the 64-bit float into a 32-bit float. It also seems to return Infinity and -Infinity for numbers that are out of the minimum and maximum range.

Based on that these two functions behave similarly and support 32-bit floats on both CLJ and CLJS, I'll try moving forward with using both of these techniques when generating :floats in Malli.

frenchy64 commented 3 weeks ago

Are there other examples of truncation in malli that you are following? To my taste this should mirror bounding a double with a bigint, what happens there? Or a hypothetical bytes generator.

I think it would be nice for the distribution of the generator to correspond to the bounds provided by the user and for the generator to always return a float. If the distribution is not possible like lower bound being too small or even nonsensical like between Double/MIN_VALUE and Double/MIN_VALUE+1, throw.

-------- Original Message -------- On 6/11/24 10:18, Sean Hagstrom wrote:

Based on some Java documentation, it seems we can use [the .floatValue method on a double in Java](https://developer.android.com/reference/java/lang/Double#floatValue()). This seems to something similar to Math.fround where it reduces the data of the 64-bit float into a 32-bit float. It also seems to return Infinity and -Infinity for numbers that are out of the minimum and maximum range.

Based on that these two functions behave similarly and support 32-bit floats on both CLJ and CLJS, I'll try moving forward with using both of these techniques when generating :floats in Malli.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

seanstrom commented 3 weeks ago

Are there other examples of truncation in malli that you are following?

I suppose my thought process is around: "what does it mean to generate a float"? I can understand that we don't want to clamp boundaries and override the users intent, but by default I think we should always generate valid 32-bit floats when the user doesn't specify a boundary. Otherwise the generator, built upon double*, could generate an invalid 32-bit float and throw an exception.

I would like to avoid throwing exceptions by default, which is what happens now if we don't include the clamped boundaries. The test suite for running the malli float generator will create errors because of the thrown exceptions.

I could specify the boundaries in the tests when using the float generator, but that would mean each time we go to use the float generator we would need to specify the default range of 32-bit floats. And that would seem tedious since we know the default range of a 32-bit float.

To my taste this should mirror bounding a double with a bigint, what happens there? Or a hypothetical bytes generator.

Are you asking what would happen if we try to coerce a BigInt into a 64-bit double? If so, the behaviour would be to create a double that has as much precision as it can allow, or otherwise return Infinity or -Infinity. Here's some related documentation: https://www.javatpoint.com/java-biginteger-doublevalue-method

I think it would be nice for the distribution of the generator to correspond to the bounds provided by the user and for the generator to always return a float. If the distribution is not possible like lower bound being too small or even nonsensical like between Double/MIN_VALUE and Double/MIN_VALUE+1, throw.

If someone inputs min/max boundaries that are outside the range/magnitude of the 32-bit float then sure we can decide to throw. However, I don't think we should throw by default when the boundaries are not specified.


Based on your notes, I think we can agree that we should throw when the user provides boundaries that are unsupported by 32-bit floats. Can we also agree that we don't want to throw exceptions when no boundaries are provided to the float generator?

frenchy64 commented 3 weeks ago

Based on your notes, I think we can agree that we should throw when the user provides boundaries that are unsupported by 32-bit floats. Can we also agree that we don't want to throw exceptions when no boundaries are provided to the float generator?

Exactly.

seanstrom commented 3 weeks ago

Okay great! I've pushed up a commit with that change, when you have a moment can you confirm that it's okay please 🙏

ikitommi commented 3 days ago

Thanks @seanstrom for doing and @frenchy64 for the review 🙇