facebook / starlark-rust

A Rust implementation of the Starlark language
Apache License 2.0
709 stars 59 forks source link

Integers are bounded to 32bit #6

Closed ndmitchell closed 2 years ago

ndmitchell commented 3 years ago

The spec says:

Integers may be positive or negative, and arbitrarily large.

Currently, integers are limited to 32bit. We should make them arbitrarily large. Currently 32bit integers are packed into the Value pointer, which is something we want to continue doing, so they need to "overflow" into a Rust BigInt stored on the heap when necessary.

pierd commented 3 years ago

I had a quick look at crates.io and we could use num-bigint crate for this. It looks like the most mature BigInt implementation in Rust.

Also it's probably a good idea to go back to regular i32 when arithmetic operation's result fits in it.

ndmitchell commented 3 years ago

That all sounds very sensible to me, and looks like a suitable crate to base it off.

Yatekii commented 2 years ago

Could this be made configurable? E.g. feature flags. I would be interested specifically in 64bit and would quite dislike if it was dynamic like BigInt (application embedded devices).

Is there a big roadblock, other than work having to be put into it?

stepancheg commented 2 years ago

Could this be made configurable?

I think that would be right.

I also think we should have a config option to disable floats.

ndmitchell commented 2 years ago

Could this be made configurable?

Before making things like int/float configurable, I'd like to see the need. Given things like lists are already allocated on the tuple, does using BigInt for large integers (of which there are likely to be few) actually cause a problem for emebedded devices? Are there some targets which are both powerful enough to run Starlark, yet not powerful enough to have float?

Is there a big roadblock, other than work having to be put into it?

As to the whole BigInt, there is a bit of work around Rust-level API's to handle more types of number (but that has got easier since @pierd's work to add float) and then a bunch of work writing the numerics that overflow only when required. Nothing huge, but not trivial either.

Yatekii commented 2 years ago

The problem is not the MCU power. The problem is dynamic changes of the bit representation when working with registers. I do not want to employ this on a MCU but in https://probe.rs which flashes MCUs ;)

ndmitchell commented 2 years ago

@Yatekii - I'd be curious to understand more how the limitation to 32 bit helps there. Is it that you have a poke call or similar to set memory addresses, and they want to be bound? If so, wouldn't it be better if internally the values were all unbounded and you then asserted they were within a range at the call to poke? Or are you thinking of something else?

Yatekii commented 2 years ago

No, we care about single bits. Not about the entire number necessarily. If there is dynamic numbers we have no guarantee the bits do not get moved as is pleased by dynamic numbers. Actually I would prefer 64bit over 32bit even tho we currently only have 32bit support :)

ndmitchell commented 2 years ago

It sounds like you might be better off with a register or bits type that is exactly the right size and semantics (I imagine you want unsigned too) - there's no real requirement that you use the builtin int type. The only thing builtin int has is that literals written in the code are int, but that could be fixed (Haskell has overloaded numeric literals, for instance, Starlark could do similar) and might not even be necessary.

pierd commented 2 years ago

I was actually going to start the implementation of bigint support. I can put it behind a feature flag (especially since it'll add num-bigint crate increasing the size while not everyone will need bigint).

Given how we handle numbers currently (boxed i32, heap allocated f64) adding yet another type/size (say i64) is possible but not trivial - I would estimate it as similar amount of work as adding bigint support.

ndmitchell commented 2 years ago

We currently always depend on various libraries for LSP/DAP support, that are way bigger than num-bigint, so I wouldn't bother putting that dependency behind a flag. The approach was always to be fairly lax with dependencies until it causes people real problems, and num-bigint seems fairly safe as a dependency.

ndmitchell commented 2 years ago

This is now done.

Yatekii commented 2 years ago

Awesome, thank you!