amusecode / amuse

Astrophysical Multipurpose Software Environment. This is the main repository for AMUSE
http://www.amusecode.org
Apache License 2.0

Writing an array of quantities? #188

Open rieder opened 6 years ago

rieder commented 6 years ago

If I have an array of quantities y:

print(y)
quantity<[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] kms>

and want to save it to disk, how do I do this while preserving units?

If I save it as a numpy array, it saves the individual elements as quantities, which is not what I want:

np.save('y.npy', y)
y = np.load('y.npy')
print(y)
array([quantity<0.0 kms>, quantity<0.0 kms>, quantity<0.0 kms>,
       quantity<0.0 kms>, quantity<0.0 kms>, quantity<0.0 kms>,
       quantity<0.0 kms>, quantity<0.0 kms>], dtype=object)

This way, the file also becomes much larger than it needs to be...

Is there an AMUSE way of saving arrays so that they can be retrieved correctly, without taking much more space than a 'regular' numpy array of scalars?

arjenve commented 6 years ago

Hi Steven,

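Quantities can be pickled, so if you are working with strings:

import pickle

S = pickle.dumps(y)  # serialize the quantity, units included
print(S)
x = pickle.loads(S)  # restore the quantity from the pickled string
print(x)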

You could also use some of the internal functions in AMUSE to make a custom format. If you want, I can also make an example for that...

rieder commented 6 years ago

Hi Arjen, that would be very helpful. Saving pickled data helps restore the array correctly, but the file is unfortunately still much larger than the unit-less file...

rieder commented 6 years ago

The most efficient way to store seems to be to save the data unit-less, and then manually re-add the unit. This is exactly what I want to prevent, since it introduces the risk of not using correct units after loading the data...

ipelupessy commented 6 years ago

The problem with pickle seems to be that a pickled numpy array is quite big compared to raw binary (the unit overhead is small).

ipelupessy commented 6 years ago

On the other hand, you can use pickle with a binary protocol (1 or 2): S = pickle.dumps(y, 1). This is quite compact: 8726 bytes for a 1000-element double array with units.kms.
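A minimal sketch of the size comparison, runnable without AMUSE installed. Two assumptions: the tuple `(values, "kms")` is only a stand-in for a real AMUSE quantity, and `pickle.HIGHEST_PROTOCOL` is swapped in for the protocol 1 or 2 mentioned above, since under Python 3 the newer protocols store raw bytes more directly. The exact sizes depend on the Python and numpy versions, so none are quoted here.

```python
import pickle

import numpy as np

# Stand-in for an AMUSE quantity: plain numbers plus a unit label.
# (Assumption: a real quantity pickles with a similarly small unit overhead.)
values = np.linspace(0.0, 1.0, 1000)
payload = (values, "kms")

# Protocol 0 is the text-based protocol; HIGHEST_PROTOCOL is binary.
text_blob = pickle.dumps(payload, protocol=0)
binary_blob = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

# Round-trip through the binary pickle.
restored_values, restored_unit = pickle.loads(binary_blob)

print("raw array bytes:     ", values.nbytes)
print("protocol 0 pickle:   ", len(text_blob))
print("binary pickle:       ", len(binary_blob))
print("restored unit label: ", restored_unit)
```

With a binary protocol the pickle should stay close to the raw size of the array, so the remaining overhead comes from the pickle framing rather than from the unit.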

ipelupessy commented 6 years ago

Is the solution (pickle with a binary protocol) OK, so we can close this?

rieder commented 6 years ago

I don't think this is solved. The binary protocol makes the file more compact and properly restorable, but it is still

Maybe we should write an AMUSE-aware version of these functions...
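One possible shape for such functions, as a rough sketch only: store the bare numbers and the unit name side by side in a single `.npz` file, so the unit cannot get separated from the data. The names `save_with_unit` and `load_with_unit` are hypothetical and not part of AMUSE. The sketch takes plain numbers and a unit name so that it runs without AMUSE; with a real quantity one would presumably pass `y.value_in(units.kms)` and rebuild it after loading with `values | units.kms`, and mapping the stored name back to a unit object is left to the caller here.

```python
import os
import tempfile

import numpy as np


def save_with_unit(path, values, unit_name):
    """Save plain numbers and their unit name together in one .npz file.

    Hypothetical helper, not an AMUSE function. `path` should end in
    '.npz', otherwise numpy appends that suffix itself.
    """
    np.savez(path, values=np.asarray(values), unit=np.array(unit_name))


def load_with_unit(path):
    """Return (values, unit_name) as written by save_with_unit."""
    with np.load(path) as data:
        values = data["values"]      # numeric array, stored as raw binary
        unit_name = data["unit"].item()  # 0-d string array back to str
    return values, unit_name


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "y.npz")
    save_with_unit(demo_path, np.zeros(8), "kms")
    demo_values, demo_unit = load_with_unit(demo_path)
    print(demo_values, demo_unit)
```

Because the values are written as an ordinary numeric array rather than an object array, the file should stay close to the size of the unit-less `.npy` file, with the unit name adding only a small extra entry.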