williamleif / histwords

Collection of tools for building diachronic/historical word vectors
http://nlp.stanford.edu/projects/histwords/
Apache License 2.0

Add mmap_mode="c" to np.load() of .npy files in embedding.py #5

Closed by okayzed 7 years ago

okayzed commented 7 years ago

mmap_mode="c" means copy-on-write: the .npy files are memory-mapped from disk instead of being read fully into memory, and any changes made to the arrays stay in memory and are not written back to the files.

(for testing, i clear the fs cache using the vmtouch tool: https://hoytech.com/vmtouch/)
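A minimal sketch of the copy-on-write behavior described above, not the actual diff to embedding.py. The helper name `load_copy_on_write` and the file name `vectors.npy` are made up for illustration; only `np.load(..., mmap_mode="c")` comes from this pull request.

```python
import os
import tempfile

import numpy as np


def load_copy_on_write(path):
    """Memory-map a .npy file in copy-on-write mode (hypothetical helper)."""
    # mmap_mode="c": pages are read lazily from disk; writes go to private
    # in-memory copies and are never flushed back to the file.
    return np.load(path, mmap_mode="c")


with tempfile.TemporaryDirectory() as tmp_dir:
    # Illustrative file, standing in for an embedding matrix on disk.
    path = os.path.join(tmp_dir, "vectors.npy")
    np.save(path, np.arange(6, dtype=np.float64).reshape(2, 3))

    vectors = load_copy_on_write(path)
    vectors[0, 0] = 99.0  # modifies the in-memory copy only

    in_memory_value = float(vectors[0, 0])
    on_disk_value = float(np.load(path)[0, 0])  # re-read the file from disk

    # Release the mapping before the temporary directory is cleaned up.
    del vectors

print(in_memory_value, on_disk_value)
```

The array can be modified like a normal in-memory array, while the file on disk keeps its original contents, so repeated runs start from the same data.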

okayzed commented 7 years ago

this may be a premature optimization, but since i've been running the program over and over, it's useful to me (and not changing results yet). thought i'd share it

okayzed commented 7 years ago

please disregard that last commit (with tsne viz) - i didn't realize new commits would automatically be added to this pull request. this pull request is only meant to contain the one diff related to mmap

williamleif commented 7 years ago

Seems like a perfectly reasonable optimization! Thanks!