rust-num / num-bigint

Big integer types for Rust
Apache License 2.0

Add method for getting memory consumption of BigInt #98

Open Askaholic opened 5 years ago

Askaholic commented 5 years ago

It would be nice to have a method for getting the memory consumption of BigInts and BigUints. In particular, this crate is being used by RustPython, and the Python spec requires builtins to implement a __sizeof__ method that returns the amount of memory used by an object in bytes. This is quite easy to calculate using std::mem::size_of_val. Unfortunately, it requires a reference to the underlying data Vec, which of course is not exposed by the BigInt API.

My proposal is to either:

  1. Add a method for getting a reference to data or
  2. Add a method for getting the size of BigInt/BigUInt

I would prefer the first option because it will support a lot more unforeseen use cases.

See the discussion at https://github.com/RustPython/RustPython/pull/1172

cuviper commented 5 years ago

> 1. Add a method for getting a reference to data

I explicitly do not want to expose this. It would mean we could never make internal changes like optimizing the digit size for different architectures. Or at least we would have to abstract BigDigit as a truly opaque wrapper, rather than just a type alias as now, and make no guarantees about size_of.

> 2. Add a method for getting the size of BigInt/BigUInt

We have bits() already, but for memory usage I suppose you would like something based on the allocated Vec::capacity(), right? We could add something like this, whether in bits, bytes, or both. I would still not expose the capacity in terms of the internal digit size though.
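
As a rough illustration of that idea, here is a minimal sketch of a method that reports allocated memory in bytes while keeping the digit width private. Everything in it is an assumption made for illustration: `Big`, `Digit`, `with_capacity_bits`, and `allocated_bytes` are hypothetical names and are not part of num-bigint's API.

```rust
use std::mem::size_of;

// Hypothetical digit type. Which width is chosen stays an internal detail,
// so it could differ per architecture without affecting callers.
#[cfg(target_pointer_width = "64")]
type Digit = u64;
#[cfg(not(target_pointer_width = "64"))]
type Digit = u32;

/// Stand-in for a big unsigned integer (not num-bigint's actual type).
pub struct Big {
    data: Vec<Digit>,
}

impl Big {
    /// Hypothetical constructor that reserves room for at least `bits` bits.
    pub fn with_capacity_bits(bits: usize) -> Self {
        let digit_bits = 8 * size_of::<Digit>();
        let digits = (bits + digit_bits - 1) / digit_bits; // round up
        Big { data: Vec::with_capacity(digits) }
    }

    /// Heap bytes allocated for the digits, including unused capacity.
    pub fn allocated_bytes(&self) -> usize {
        self.data.capacity() * size_of::<Digit>()
    }
}

fn main() {
    let n = Big::with_capacity_bits(1000);
    // The caller only ever sees a byte count, never a digit count.
    println!("allocated: {} bytes", n.allocated_bytes());
}
```

Because the result is expressed in bytes, the digit type could change later without breaking anyone who calls the method.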

Note that std::mem::size_of_val would not answer capacity if you're passing it a slice from a Vec. There is likely uninitialized data at the end that counts toward memory consumption too...

Askaholic commented 5 years ago

> We have bits() already, but for memory usage I suppose you would like something based on the allocated Vec::capacity(), right?

Yes that's correct. Ultimately we want to know how many bytes of memory are being used by the object. We discussed using bits as an estimation, but it would only be a lower bound.

> Note that std::mem::size_of_val would not answer capacity if you're passing it a slice from a Vec. There is likely uninitialized data at the end that counts toward memory consumption too...

Good point, because as_slice will only return a slice of the valid part of the data. So the calculation would need to be done with std::mem::size_of::<T>() * data.capacity().
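
The difference between the two calculations can be seen on a plain Vec using only the standard library. This is a small sketch; the helper names `used_bytes` and `allocated_bytes` are made up for the example.

```rust
use std::mem::{size_of, size_of_val};

/// Bytes occupied by the initialized elements only (length-based).
fn used_bytes<T>(v: &Vec<T>) -> usize {
    size_of_val(v.as_slice())
}

/// Bytes of the whole heap buffer, including unused capacity.
fn allocated_bytes<T>(v: &Vec<T>) -> usize {
    size_of::<T>() * v.capacity()
}

fn main() {
    // Reserve room for 16 elements but initialize only 3 of them.
    let mut data: Vec<u64> = Vec::with_capacity(16);
    data.extend_from_slice(&[1, 2, 3]);

    println!(
        "used = {} bytes, allocated = {} bytes",
        used_bytes(&data),
        allocated_bytes(&data)
    );
}
```

Neither figure includes the Vec header itself (pointer, length, capacity), so a __sizeof__-style total would presumably add std::mem::size_of::<Vec<T>>() on top.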

windelbouwman commented 5 years ago

It would be nice to find some common trait for this, and implement this trait for the bigint type.

Example of something that might be of use: https://doc.servo.org/malloc_size_of/trait.MallocSizeOf.html

Example:

use num_bigint::BigInt;

trait DynamicSizeOf {
    fn sizeof(&self) -> usize;
}

impl DynamicSizeOf for BigInt {
    fn sizeof(&self) -> usize {
        // Arbitrarily deep query on the memory usage of substructures.
        todo!()
    }
}
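
To make the "arbitrarily deep" part concrete, here is a sketch of such a trait implemented for standard types rather than for BigInt, whose internals are private. The method name `heap_size` and the choice to count only heap memory (not the value's own inline size) are assumptions of this example, not an established convention.

```rust
use std::mem::size_of;

trait DynamicSizeOf {
    /// Heap bytes owned by this value, excluding its own inline size.
    fn heap_size(&self) -> usize;
}

// Plain integers own no heap memory.
impl DynamicSizeOf for u64 {
    fn heap_size(&self) -> usize {
        0
    }
}

// A Vec owns its buffer (sized by capacity) plus whatever its elements own.
impl<T: DynamicSizeOf> DynamicSizeOf for Vec<T> {
    fn heap_size(&self) -> usize {
        let buffer = self.capacity() * size_of::<T>();
        let nested: usize = self.iter().map(|x| x.heap_size()).sum();
        buffer + nested
    }
}

fn main() {
    let mut inner: Vec<u64> = Vec::with_capacity(4);
    inner.push(7);
    // The outer Vec's size includes the inner Vec's buffer.
    let outer = vec![inner];
    println!("{} bytes on the heap", outer.heap_size());
}
```

A type like BigInt could then implement the trait inside its own crate, where the digit buffer is accessible.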

tczajka commented 4 years ago

#171 makes sure that memory use is never much larger than required. I think that could be important, especially since memory sometimes gets reused from inputs to outputs in various operations.
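
For readers unfamiliar with the problem, the sketch below shows how a reused buffer can hold far more capacity than its contents need, and one way to bound the waste. It is not the change made in the referenced pull request: `shrink_if_wasteful` and its factor-of-four threshold are hypothetical, chosen only to illustrate the idea with a plain Vec.

```rust
/// Hypothetical policy: release excess capacity once it exceeds
/// four times the number of elements actually in use.
fn shrink_if_wasteful(v: &mut Vec<u64>) {
    if v.capacity() > 4 * v.len().max(1) {
        v.shrink_to_fit();
    }
}

fn main() {
    // A buffer that once held a large value...
    let mut digits: Vec<u64> = vec![0; 1024];
    // ...and now holds a small result in the same allocation.
    digits.truncate(2);
    let before = digits.capacity();

    shrink_if_wasteful(&mut digits);
    let after = digits.capacity();

    println!("capacity before = {}, after = {}", before, after);
}
```

Without some policy of this kind, a capacity-based size report could stay large long after the value itself has become small.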