Closed Helveg closed 3 years ago
I tried to use `np.savetxt`, but I had problems saving arrays with different lengths in a single data frame, so I temporarily removed it from the script.
Use `pickle` please :) You can store an arbitrary data structure, e.g.:
```python
import pickle

with open("save.pkl", "wb") as f:
    _100_results = [run_simulation() for i in range(100)]
    pickle.dump(_100_results, f)
```
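A minimal round-trip sketch of the same idea, with a stand-in for `run_simulation()` (the placeholder data is an assumption; any picklable structure works, including the ragged lists that `np.savetxt` struggles with):

```python
import pickle

# Hypothetical stand-in for run_simulation() output: a ragged list of lists,
# exactly the kind of data np.savetxt cannot store in one flat text file.
results = [[1.0, 2.0], [3.0, 4.0, 5.0]]

# Save: serialize the whole structure in one call.
with open("save.pkl", "wb") as f:
    pickle.dump(results, f)

# Later, e.g. in a separate analysis script: load it back unchanged.
with open("save.pkl", "rb") as f:
    loaded = pickle.load(f)
```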
Unless things have changed recently, your code runs the simulation, analyzes it, and then discards the data. This is called "online analysis" and is good for high throughput in mature pipelines, but when simulations are expensive to run and you are still experimenting with the analysis, you are better off saving the simulation data and then iteratively improving your analysis on that dataset. This is called "offline analysis": you run the simulations once (say, 7 hours), store all the data, and then run the analysis scripts as often as you want, taking only a few seconds each time. If the analysis fails, you improve the scripts and try again!
So, could you try to improve your codebase to factor out simulation and analysis? You can store data using `pickle` or `np.savetxt`! I think pickling will give you more freedom to experiment. Try to set up two experimental scripts: one that runs the simulations and saves the results, and one that loads the saved results and analyzes them. Then take a look at how modules work in Python, and try to refactor your simulation code into an importable module so that both scripts can reuse it.
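A sketch of that split, condensed into one runnable file for illustration (the file names `simulation.py`, `run_experiments.py`, and `analyze.py`, the toy `run_simulation()`, and the pickle path are all assumptions, not your actual code):

```python
import pickle
import random

# --- simulation.py (hypothetical importable module) ---
def run_simulation():
    # Stand-in for the real, expensive simulation; returns a ragged result.
    return [random.random() for _ in range(random.randint(2, 5))]

# --- run_experiments.py: the slow part, run once and save everything ---
def main_simulate(path="results.pkl", n=100):
    # In a real split this file would do: from simulation import run_simulation
    results = [run_simulation() for _ in range(n)]
    with open(path, "wb") as f:
        pickle.dump(results, f)

# --- analyze.py: the fast part, rerun as often as the analysis needs ---
def main_analyze(path="results.pkl"):
    with open(path, "rb") as f:
        results = pickle.load(f)
    # Placeholder analysis: total number of samples across all runs.
    return sum(len(r) for r in results)

main_simulate()
print(main_analyze())
```

The point of the split is that improving `main_analyze` never forces you to rerun `main_simulate`.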