jmrichardson / tuneta

Intelligently optimizes technical indicators and optionally selects the least intercorrelated for use in machine learning models
MIT License

n_jobs ignored #31

Open academe-01 opened 1 year ago

academe-01 commented 1 year ago

Following any example from this repo, even when I set n_jobs=1 I still get 100% load on all my cores: each process occupies every core (128 in my case). Any ideas why?

vojta-kubin commented 1 year ago

Hi, I bet you mean the prune_df method. We have the same issue. IMO the problem is in how the dcor package is used. It didn't use to support any parallelization, but now it should, and the method is not using it that way.

There is a discussion of this topic on Stack Overflow, where the creator of the dcor package explains the solution: https://stackoverflow.com/questions/49925718/parallel-calculation-of-distance-correlation-dcor-from-dataframe
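
In the meantime, a possible workaround sketch. It assumes the extra load comes from the thread pools of the numerical libraries underneath (OpenMP, OpenBLAS, MKL, Numba) rather than from tuneta's own n_jobs handling. That is a guess and has not been confirmed for this issue. The helper name `limit_native_threads` is made up for illustration; the environment variables themselves are the standard ones those libraries read at startup.

```python
import os

# Environment variables that common numerical back ends read when they are
# first imported. Setting them later usually has no effect.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMBA_NUM_THREADS",
)


def limit_native_threads(n_threads: int = 1) -> dict:
    """Hypothetical helper: cap native thread pools via environment variables.

    Returns the mapping of variable names to the values that were set.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    applied = {}
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)
        applied[name] = os.environ[name]
    return applied


# Call this at the very top of the script, before importing numpy, dcor,
# tuneta or anything else that might start a thread pool.
limit_native_threads(1)
```

Worker processes started afterwards inherit these variables, so each one should stay on a single native thread if the assumption above holds. Whether this actually fixes the load seen in prune_df is untested here.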