Open pferrel opened 7 years ago
Here is the old implementation. Should I try putting this back in?
Yes, it would be good to compare the total time and the stage time of the previous code. This looks weird. Maybe it is because of laziness, and some other calculations were attributed to this line?
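To make the laziness point concrete, here is a minimal plain-Python analogy (not Spark code; `expensive_parse`, `rank`, and the sample data are hypothetical names made up for illustration). With lazy evaluation, upstream work only runs when a later step pulls on it, so a profiler charges that work to the later step:

```python
def expensive_parse(records, log):
    # Generator: nothing in this body runs until something iterates over it.
    for r in records:
        log.append(("parse", r))
        yield r * 2


def rank(parsed, log):
    # The step that "looks slow": iterating here also triggers the parsing above.
    total = 0
    for p in parsed:
        log.append(("rank", p))
        total += p
    return total


# Lazy pipeline: building it does no work at all.
log = []
parsed = expensive_parse([1, 2, 3], log)
work_before_rank = len(log)  # still 0, parsing has not started

# All parse work happens inside rank(), so it is attributed to that call.
total = rank(parsed, log)
work_inside_rank = len(log)

# Forcing the upstream step first separates the two costs.
log2 = []
materialized = list(expensive_parse([1, 2, 3], log2))
work_after_materialize = len(log2)  # parse work is now visible on its own
total2 = rank(materialized, log2)
```

In Spark the analogous experiment would presumably be to persist the upstream data and force it with an action (for example `cache()` followed by `count()`) before this stage, then compare the stage times. That is only a suggested way to test the hypothesis, not something confirmed in this thread.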
On Nov 19, 2016, at 23:51 , Pat Ferrel notifications@github.com wrote:
Running on a large cluster with medium-sized data (100 MB), this stage takes 9.2 hours, by far the longest phase. Any ideas @laser13 @alexice? This is not very large data, and it is running on 4 r3.4xlarge AWS instances. We are only using popularity, no random or user-defined ranking.
Best regards, Alexey Pan'kov