cerndb / dist-keras

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
http://joerihermans.com/work/distributed-keras/
GNU General Public License v3.0
624 stars · 169 forks

Transformers cause out of memory errors on executors #27

Closed — bensums closed this issue 7 years ago

bensums commented 7 years ago

I think this is due to `utils.new_dataframe_row` copying whole rows. Why not use a UDF in the transform method? I observe this when doing a `count()` on a dataset of only 6 million rows after transforming it with `OneHotTransformer`.
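A minimal sketch of the UDF-based approach being suggested: instead of rebuilding each row, register a UDF that produces only the new encoded column. The column names and dimensionality below are hypothetical, not taken from dist-keras.

```python
def one_hot(index, dim):
    # Dense one-hot encoding of a single categorical index.
    vec = [0.0] * dim
    vec[int(index)] = 1.0
    return vec

def one_hot_udf(dim):
    # Returns a Spark UDF wrapping one_hot(); Spark then only
    # materializes the new column instead of copying whole rows.
    # Imports are kept local so one_hot() works without Spark installed.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import ArrayType, DoubleType
    return udf(lambda i: one_hot(i, dim), ArrayType(DoubleType()))

# Hypothetical usage, assuming a DataFrame `df` with an integer
# column "label" taking values 0..9:
# df = df.withColumn("label_encoded", one_hot_udf(10)(df["label"]))
```

The key point is that `withColumn` with a UDF appends a column lazily, rather than constructing a full new row object per record in Python.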

JoeriHermans commented 7 years ago

Did you set the dataframe persistence to `MEMORY_AND_DISK`? By default Spark tries to fit everything into memory (which includes the JVM overhead), so allowing it to spill to disk should resolve the issue. Note that it doesn't copy the dataframe row as a whole, only the references plus the new column and value. Also, what is the dimensionality of your one-hot encoded vector?
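For reference, a sketch of the persistence call in PySpark, plus a rough back-of-envelope estimate of why the vector dimensionality matters (the helper and its 8-bytes-per-element figure are illustrative assumptions, not part of dist-keras):

```python
def persist_for_training(df):
    # Persist with spill-to-disk so partitions that do not fit in
    # memory go to disk instead of triggering OOM on executors.
    # Import kept local so this file loads without Spark installed.
    from pyspark import StorageLevel
    return df.persist(StorageLevel.MEMORY_AND_DISK)

def one_hot_memory_mb(n_rows, dim, bytes_per_element=8.0):
    # Rough size of a dense one-hot column: dim doubles per row.
    return n_rows * dim * bytes_per_element / 1e6
```

For example, 6 million rows with a 100-dimensional dense one-hot vector is already on the order of 4.8 GB for that single column, which makes the storage level and the encoding dimensionality very relevant here.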