djsutherland / pummeler

Utilities to analyze ACS PUMS files, especially for distribution regression / ecological inference
MIT License

faster featurizer #5

Open djsutherland opened 8 years ago

djsutherland commented 8 years ago

The old Cython featurizer took only two minutes on low1 once the dummies had been created; the new one takes two hours. I don't know how long creating the dummies took, but it wasn't two hours.

What accounts for the difference? Find it and speed that up.

djsutherland commented 8 years ago

One thing that might help is calling MKL directly instead of going through numpy: gemm would fold the dot-then-scale into a single call, and sincos is much faster than np.sin plus np.cos (https://github.com/numpy/numpy/issues/2626#issuecomment-235785553). There are probably other savings to be had in rff_embedding (or we could use Fastfood).
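A rough sketch of the gemm idea, not the actual rff_embedding code: the function names, the `scale` argument, and the array shapes here are all assumptions for illustration. It uses SciPy's BLAS wrapper `scipy.linalg.blas.dgemm`, whose `alpha` argument applies the scaling inside the matrix product, and writes sin and cos straight into a preallocated output. A fused sincos isn't shown, since NumPy doesn't expose one; that part would need a direct call into MKL.

```python
import numpy as np
from scipy.linalg.blas import dgemm


def rff_numpy(X, W, scale):
    """Baseline: dot, then scale, then separate sin and cos passes."""
    Z = X.dot(W)   # (n, D) projection onto the random frequencies
    Z *= scale     # extra pass over the whole (n, D) array
    return np.hstack([np.sin(Z), np.cos(Z)])


def rff_gemm(X, W, scale):
    """Same result, with the scaling folded into a single gemm call."""
    D = W.shape[1]
    # dgemm computes alpha * (a @ b), so no separate scaling pass is needed.
    Z = dgemm(alpha=scale, a=X, b=W)
    # Preallocate the output and fill both halves in place,
    # avoiding the temporaries and the copy that hstack makes.
    out = np.empty((X.shape[0], 2 * D))
    np.sin(Z, out=out[:, :D])
    np.cos(Z, out=out[:, D:])
    return out
```

Whether this helps in practice depends on how much of the two hours is actually spent in the embedding, so it's worth profiling before committing to it.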