Closed Gitii closed 6 years ago
With the Featuretools on Dask notebook implementation, it took about 3 hours to generate the feature matrix using a MacBook laptop with 16 GB of RAM. I'd recommend taking the Dask route instead of doing the entire computation at once.
The Dask implementation should run within a few hours on a personal laptop. It's a great example of the benefits of parallel computation!
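To make the "don't do it all at once" idea concrete, here is a minimal sketch of the partition-then-combine pattern. It is not the notebook's code: it uses the standard library's `concurrent.futures` as a stand-in for Dask, and `make_partitions`, `compute_features`, and `run` are hypothetical names with a placeholder calculation where the real per-partition feature computation would go.

```python
from concurrent.futures import ProcessPoolExecutor


def make_partitions(ids, n_partitions):
    """Split ids into n_partitions non-overlapping, roughly equal lists."""
    return [ids[i::n_partitions] for i in range(n_partitions)]


def compute_features(partition):
    """Hypothetical stand-in for the per-partition feature calculation."""
    return {client_id: client_id * 2 for client_id in partition}


def run(ids, n_partitions, max_workers=1):
    """Compute features one partition at a time and merge the results."""
    partitions = make_partitions(ids, n_partitions)
    if max_workers == 1:
        # Serial path: still keeps only one partition's work in flight.
        results = [compute_features(p) for p in partitions]
    else:
        # Parallel path: each partition is handled by a separate process.
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(compute_features, partitions))
    combined = {}
    for partial in results:
        combined.update(partial)
    return combined


if __name__ == "__main__":
    features = run(list(range(10)), n_partitions=4, max_workers=1)
    print(len(features))
```

Because each partition is independent, the pieces can be computed in any order, on any number of workers, and each piece only needs its own slice of the data in memory.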
Hi,
thanks for your article. Automated Feature Engineering is very promising. I am running the Loan Repayment script right now to compare it with my own engineered features. I am very curious about the results.
What hardware do you recommend to compute the result within one day (as mentioned in the article)?
Elapsed: 18:50:30 | Remaining: 22358:53:57 | Progress: 0%| | Calculated: 3/3563 chunks
ft.py uses one job by default, and any value other than 1 crashes the script. I am using an r4.2xlarge AWS EC2 instance, but with one job it cannot utilize more than one core. Even with all eight cores, it would still take weeks.
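For reference, a rough sketch of the arithmetic behind that estimate, using only the numbers from the progress line above. It assumes every chunk takes as long as the first three and that extra workers scale perfectly, neither of which is guaranteed; the function names are just for illustration.

```python
def hms_to_seconds(text):
    """Convert an 'H:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def estimate_remaining_hours(elapsed, done_chunks, total_chunks, workers=1):
    """Linear extrapolation of remaining time, assuming ideal scaling."""
    seconds_per_chunk = hms_to_seconds(elapsed) / done_chunks
    remaining_seconds = (total_chunks - done_chunks) * seconds_per_chunk
    return remaining_seconds / workers / 3600


# Values taken from the progress line: 3 of 3563 chunks after 18:50:30.
one_core = estimate_remaining_hours("18:50:30", 3, 3563)
eight_cores = estimate_remaining_hours("18:50:30", 3, 3563, workers=8)

print(f"1 worker:  about {one_core:,.0f} hours ({one_core / 24:,.0f} days)")
print(f"8 workers: about {eight_cores:,.0f} hours ({eight_cores / 24:,.0f} days)")
```

The single-worker figure lands close to the remaining time in the log, and dividing by eight still leaves well over a hundred days.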
Can you recommend some specs to speed this up?
Best regards