KarlTjensvoll closed this issue 6 years ago
@abjer Thank you, I had a look at that site earlier, but I could not really figure it out. Reading it again, I see that it does say "Those with numbers in their name indicate the bitsize of the type", so int8 requires 8 bits and int16 requires 16 bits?
@KarlTjensvoll In theory yes, but in practice no. The reason is that Python/NumPy has to add some overhead when storing data, to keep track of what it is storing. (This is exactly why low-memory applications are written in C or Fortran and not Python.)
NumPy data has an .nbytes attribute, which gives you the number you expect: int8().nbytes is 1 and int64().nbytes is 8. The benefit of NumPy is that this overhead doesn't explode with the size of your array. Using the getsizeof function, which gives a better picture of the actual memory usage, we can see:
from sys import getsizeof
from numpy import int8, array
# No big difference for a single integer
getsizeof(int8()) # 25 bytes (numpy)
getsizeof(int()) # 24 bytes (base)
# A clear difference for a million integers
getsizeof(array(range(10**6))) # 8000096 bytes (numpy)
getsizeof(list(range(10**6))) # 9000112 bytes (base)
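To check the claim that the overhead stays fixed as the array grows, here is a minimal sketch. The helper name overhead is just something I made up for illustration, and the exact byte counts will depend on your NumPy version and platform, so I have not written them in.

```python
import sys
import numpy as np

def overhead(n, dtype=np.int64):
    """Bytes reported by getsizeof beyond the raw data buffer (hypothetical helper)."""
    arr = np.zeros(n, dtype=dtype)  # np.zeros returns an array that owns its data
    return sys.getsizeof(arr) - arr.nbytes

small = overhead(10)
large = overhead(10**6)
print(small, large)  # the same fixed header size for both 1-D arrays

# The data buffer itself scales with the dtype: 1 byte per int8, 8 per int64
print(np.zeros(10**6, dtype=np.int8).nbytes)
print(np.zeros(10**6, dtype=np.int64).nbytes)
```

So for large arrays, .nbytes is a good approximation of the real memory use, and picking a smaller dtype shrinks it proportionally.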
I'm not really an expert in this, but perhaps this gives more info: https://jakevdp.github.io/PythonDataScienceHandbook/02.01-understanding-data-types.html
I understand that the default for an int is int64, but you can specify int8, which is less precise but, I would imagine, takes less memory and storage space.
I was unable to find a list of the exact storage sizes for the different types. Does someone have a URL or a list?
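One way to get such a list is to ask NumPy directly rather than look it up. This is only a rough sketch: the selection of types is my own and not exhaustive. It uses np.dtype(...).itemsize for the size in bytes and np.iinfo for the range of the integer types.

```python
import numpy as np

int_types = [np.int8, np.int16, np.int32, np.int64,
             np.uint8, np.uint16, np.uint32, np.uint64]
float_types = [np.float16, np.float32, np.float64]

sizes = {}  # type name -> bytes per element

for t in int_types:
    info = np.iinfo(t)  # smallest and largest representable integer
    sizes[t.__name__] = np.dtype(t).itemsize
    print(f"{t.__name__:>8}: {sizes[t.__name__]} byte(s), range {info.min} to {info.max}")

for t in float_types:
    sizes[t.__name__] = np.dtype(t).itemsize
    print(f"{t.__name__:>8}: {sizes[t.__name__]} byte(s)")
```

Note that for the integer types a smaller size means a smaller range of representable values (int8 only covers -128 to 127), so values outside that range will not fit.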