
AdrianAntico / AutoQuant

R package for automation of machine learning, forecasting, model evaluation, and model interpretation

GNU Affero General Public License v3.0


NumCores parameter #36

Closed lejarx closed 4 years ago

lejarx commented 4 years ago

Hi @AdrianAntico,

I'm running a machine with 36 cores and 64 GB of RAM.

However, I notice that the runtime doesn't seem to be any faster than on my laptop with 8 cores and 8 GB of RAM.

I've made sure to update the NumCores parameter though.

Is this a known issue? Thanks

AdrianAntico commented 4 years ago

@lejarx Which functions are you testing? How big is your data (row count / column count)? Does the machine have a GPU? If so, which type and how many?

lejarx commented 4 years ago

@AdrianAntico The machine spec is: Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz, 36 cores, 64 GB RAM.

I'm running the Walmart dataset (380,380 rows) based on this blog post: https://www.remixinstitute.com/blog/why-machine-learning-is-more-practical-than-time-series-in-the-real-world/#.Xa6ex-gzZPY

But for some reason, I don't see the expected increase in speed when running on the bigger machine.

lejarx commented 4 years ago

@AdrianAntico Please ignore this. I think the issue is with the instance itself. Thanks

AdrianAntico commented 4 years ago

@lejarx No worries. I would love to hear about the speedups. If you are using the CARMA functions, you should see a nice speedup on CPU, as CatBoost, XGBoost, and H2O all benefit from more cores. If you have a machine with a GPU available, you can see an even greater speedup with CatBoost and XGBoost (if you installed them with GPU support enabled). The AutoTS function should see a bit of a speedup due to the parallel execution with Arima (if stepwise is FALSE) and TBATS. If you want to parallelize the AutoTS builds (that is, build multiple models at the same time), you'll have to set that up yourself, as I didn't write the script in the blog to do that.
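
For anyone who wants to try building multiple models at the same time, here is a minimal sketch using only base R's parallel package. It is not the script from the blog and it does not call AutoTS: fit_one_series() is a hypothetical stand-in for a single model build, and the simulated series and their "group_" names are made up for illustration.

```r
library(parallel)

# Hypothetical stand-in for one model build. A real script would call its
# own forecasting function here instead of a plain AR(1) fit.
fit_one_series <- function(y) {
  fit <- stats::arima(y, order = c(1, 0, 0))
  as.numeric(stats::predict(fit, n.ahead = 1)$pred)
}

# Toy data (an assumption for this sketch): one simulated series per group.
set.seed(42)
series_list <- lapply(1:8, function(i) {
  as.numeric(stats::arima.sim(list(ar = 0.5), n = 100))
})
names(series_list) <- paste0("group_", 1:8)

# Leave one core free. detectCores() can return NA on some platforms,
# so fall back to a single worker in that case.
n_cores <- detectCores()
n_workers <- if (is.na(n_cores)) {
  1L
} else {
  max(1L, min(length(series_list), n_cores - 1L))
}

# Build the models in parallel, one series per task, and always shut the
# cluster down, even if a worker fails.
cl <- makeCluster(n_workers)
forecasts <- tryCatch(
  parLapply(cl, series_list, fit_one_series),
  finally = stopCluster(cl)
)
```

Each element of forecasts holds the one-step-ahead forecast for one series. If the function run on each worker is itself multi-threaded, the total thread count is roughly workers times threads per model, so it may be worth lowering one of the two to avoid oversubscribing the machine.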