This seems to be an issue with Spark running out of memory. There are some helpful resources on configuring Spark here:
- https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-2/
- http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
One particular configuration parameter that can be useful to adjust is `yarn_scheduler_maximum_allocation_mb`.
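To make that concrete, here is a minimal sketch of where the memory-related settings live when building a PySpark session directly (the values below are placeholders, not recommendations for this workload):

```python
from pyspark.sql import SparkSession

# Minimal sketch: common memory-related Spark settings (placeholder values).
spark = (
    SparkSession.builder
    .appName("ALS Deep Dive")
    .master("local[*]")
    # In local mode executors run inside the driver JVM, so driver memory is
    # what counts; this only takes effect when the session is first created.
    .config("spark.driver.memory", "16g")
    # Raise this if collecting large results back to the driver fails.
    .config("spark.driver.maxResultSize", "4g")
    # On YARN, executor memory must fit under yarn.scheduler.maximum-allocation-mb.
    .config("spark.executor.memory", "16g")
    .getOrCreate()
)
```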
Hope that helps. Tuning Spark can be difficult; I sometimes find it useful to spin up a large cluster in Azure Databricks to figure out whether the issue is a memory constraint or something else.
@gramhagen Thank you so much for your help with this. I will look at the links you provided.
Description
I am using the als_deep_dive notebook and experimenting with different sizes of the Spark instance and different numbers of iterations for the ALS algorithm. Everything works until I increase the iterations from 15 to 20, or change the memory from 16 GB to 32 GB in this line:
```python
spark = start_or_get_spark("ALS Deep Dive", memory="32g")
```
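For reference, the ALS training step that follows looks roughly like the sketch below (paraphrased, not the exact notebook code; I believe the column names are the MovieLens defaults, but the exact values don't matter here):

```python
from pyspark.ml.recommendation import ALS

# Paraphrased sketch of the training step, not the exact notebook code.
# Raising maxIter from 15 to 20 is the change that triggers the failure for me.
als = ALS(
    maxIter=20,
    rank=10,
    regParam=0.05,
    userCol="UserId",      # assumed MovieLens column names
    itemCol="MovieId",
    ratingCol="Rating",
    coldStartStrategy="drop",
)
model = als.fit(train)    # `train` is the training split of the ratings DataFrame
```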
As an example, with max iterations = 20 and memory = 32 GB, I get the following error after running the evaluation cell (SparkRatingEvaluation). The error is super long, so I am only pasting the parts that don't repeat. This is my environment:
Can you help me figure out how I can increase the number of iterations used to train the ALS algorithm? Thanks!