quinngroup / dr1dl-pyspark

Dictionary Learning in PySpark
Apache License 2.0

Broadcast random seeds, rather than random vectors #63

Closed magsol closed 8 years ago

magsol commented 8 years ago

Broadcasting a random seed to all the workers to initialize the u vector, instead of broadcasting the vector itself, will save a lot of network traffic each iteration. Each worker will independently generate the same vector from a seed that is broadcast once at the very beginning.
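
A minimal sketch of the idea, not the project's actual code. The helper name `make_u`, its parameters, and the trick of mixing an iteration index into the seed are all assumptions made for illustration. It uses only the standard-library `random` module so it runs without Spark; a real implementation would more likely use NumPy, and it assumes every worker runs the same Python version so the generators agree.

```python
import random


def make_u(seed, length, iteration=0):
    """Rebuild the random u vector locally from a shared seed (hypothetical helper)."""
    # A private generator keeps the global random state untouched.
    # Mixing in the iteration index (an assumption, not from the issue)
    # yields a different vector per iteration from one broadcast seed.
    rng = random.Random(f"{seed}-{iteration}")
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


# Simulate two workers that each received only the integer seed.
worker_a = make_u(seed=42, length=5, iteration=3)
worker_b = make_u(seed=42, length=5, iteration=3)
print(worker_a == worker_b)

# With Spark, the seed would be shipped once from the driver, roughly:
#   seed_bc = sc.broadcast(42)
#   u = make_u(seed_bc.value, length, iteration)   # inside a worker task
```

The payload sent over the network is then a single integer rather than a vector whose length grows with the data, at the cost of each worker regenerating the vector itself.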