sirmarcel / cmlkit

tools for machine learning in condensed matter physics and quantum chemistry
MIT License

Drop yaml dumper for son, replace with json #3

Open sirmarcel opened 5 years ago

sirmarcel commented 5 years ago

JSON is comically faster, roughly 28x on the mean. Here is a benchmark for dumping/loading 2000 small dicts (3 repeats):

JSON:
{'times': array([0.38685818, 0.34857985, 0.34886317]), 'mean': 0.36143373133333334, 'min': 0.3485798470000001, 'max': 0.3868581750000001}
YAML:
{'times': array([ 9.26613551, 10.8498244 ,  9.88484843]), 'mean': 10.000269449333334, 'min': 9.266135509, 'max': 10.849824405}

This really hurts when doing Run.checkout().
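
The script behind the numbers above isn't included here, so this is only a rough sketch of how such a comparison could be timed. The `benchmark` helper and the dict contents are made up for illustration; only the 2000 dicts and 3 repeats come from the benchmark description.

```python
import json
import timeit


def benchmark(dump, load, n_dicts=2000, repeats=3):
    """Time a dump/load round trip over n_dicts small dicts, `repeats` times."""
    # Hypothetical payload: the dicts used in the original benchmark are not shown.
    dicts = [
        {"id": i, "name": f"item-{i}", "values": [i, i + 1, i + 2]}
        for i in range(n_dicts)
    ]

    def round_trip():
        # Serialize each dict to a string and parse it back again.
        for d in dicts:
            load(dump(d))

    # One full pass over all dicts per measurement, `repeats` measurements.
    times = timeit.repeat(round_trip, number=1, repeat=repeats)
    return {
        "times": times,
        "mean": sum(times) / len(times),
        "min": min(times),
        "max": max(times),
    }


json_result = benchmark(json.dumps, json.loads)
print("JSON:", json_result)
```

Assuming PyYAML is installed, the same helper should work for the YAML side via `benchmark(yaml.safe_dump, yaml.safe_load)`. Absolute times will depend on the machine and on what the dicts contain.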

sirmarcel commented 5 years ago

To dump numpy objects with json, use the converters from hilde: https://gitlab.com/flokno/hilde/blob/cli/hilde/helpers/converters.py.
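
For context, the general idea can be sketched with the `default` hook of the standard `json` module. This is not the hilde converter (its code isn't reproduced here); the names `to_jsonable` and `dumps` are made up for illustration, and the sketch relies on numpy arrays and numpy scalars both providing `.tolist()`.

```python
import json


def to_jsonable(obj):
    """Fallback for json.dumps: convert array-like objects to plain Python types."""
    # numpy arrays and numpy scalars both expose .tolist(); duck typing keeps
    # this sketch free of a hard numpy import.
    if hasattr(obj, "tolist"):
        return obj.tolist()
    # Anything else is still rejected, matching json's default behaviour.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def dumps(data):
    """Serialize data to a JSON string, converting array-likes via to_jsonable."""
    return json.dumps(data, default=to_jsonable)
```

With numpy available, `dumps({"x": np.arange(3)})` would go through `to_jsonable`. Note that loading the result gives back plain lists, not arrays, so anything that needs arrays on load has to convert explicitly.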