nathanmarz / cascalog

Data processing on Hadoop without the hassle.

Overhead of push/pop thread bindings is high #284

Open ipostelnik opened 9 years ago

ipostelnik commented 9 years ago

My profiling showed that managing thread bindings for the *op-call* and *flow-process* vars carries a lot of overhead. In my test, these calls accounted for about 16% of flow execution time, excluding I/O (scheme input/task output). This was after I had applied the change from #283 to optimize null checks.

sritchie commented 9 years ago

@ipostelnik, yeah, this is a good catch. I wonder if we should set a flag in the config that would conditionally enable/disable this feature. Checking the config on each call would be far cheaper than pushing and popping thread bindings.
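
For illustration only, the flag idea might look roughly like the Java sketch below. Everything in it is an assumption rather than Cascalog code: the class `OpInvoker`, the key `example.bind-op-vars`, and the counter that stands in for the real push/pop of thread bindings are all hypothetical. The sketch reads the flag once at construction, so each call pays only for a boolean check.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical wrapper around operation calls; not part of Cascalog.
final class OpInvoker {
    // Assumed config key, invented for this sketch.
    static final String BIND_VARS_KEY = "example.bind-op-vars";

    private final boolean bindVars;
    // Counts simulated binding frames so the two paths can be told apart.
    int bindingFrames = 0;

    OpInvoker(Map<String, String> config) {
        // Read the flag once; default to "true" to keep the existing behavior.
        this.bindVars = Boolean.parseBoolean(config.getOrDefault(BIND_VARS_KEY, "true"));
    }

    <T> T invoke(Supplier<T> op) {
        if (!bindVars) {
            // Fast path: no binding frame is created at all.
            return op.get();
        }
        // Slow path: this increment stands in for pushing thread bindings.
        bindingFrames++;
        try {
            return op.get();
        } finally {
            // A real implementation would pop the thread bindings here.
        }
    }
}
```

With the flag set to `false`, `invoke` runs the operation directly and `bindingFrames` stays at zero. Operations that read the vars would then need the flag left on.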

ipostelnik commented 9 years ago

#285 also relies on this functionality.

I guess we could add some sort of metadata annotation to operations that want to use these vars. Alternatively, maybe we could change them from proper Clojure vars to something like a ThreadLocal?
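
As a rough sketch of the ThreadLocal alternative, the Java below keeps a per-thread value in a `java.lang.ThreadLocal` and restores the previous value on exit, which approximates the nesting behavior of a dynamic binding. The class `FlowProcessHolder` and its methods are hypothetical names chosen for illustration; this is not how Cascalog is implemented.

```java
import java.util.function.Supplier;

// Hypothetical holder; the name and shape are illustrative, not Cascalog's API.
final class FlowProcessHolder {
    // One slot per thread; no binding frame is allocated on each call.
    private static final ThreadLocal<Object> FLOW_PROCESS = new ThreadLocal<>();

    private FlowProcessHolder() {}

    // Returns the value installed for the current thread, or null if none.
    static Object current() {
        return FLOW_PROCESS.get();
    }

    // Runs body with value installed, then restores whatever was there before,
    // so nested calls behave like nested dynamic bindings.
    static <T> T withFlowProcess(Object value, Supplier<T> body) {
        Object previous = FLOW_PROCESS.get();
        FLOW_PROCESS.set(value);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                FLOW_PROCESS.remove();
            } else {
                FLOW_PROCESS.set(previous);
            }
        }
    }
}
```

One trade-off to keep in mind: a plain ThreadLocal value is visible only on the thread that set it, whereas Clojure conveys dynamic bindings to other threads in some cases (for example through `bound-fn` or `future`). Code that relies on that conveyance would behave differently.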