Open ooyanglinoo opened 6 years ago
Fix the solver prototxt file, I suppose.
You could run the solver file on the single node version first, i.e. BVLC Caffe. Of course, you need to change the network prototxt file accordingly (switch out the data layer, etc.). If the single node version works, then you can try the grid version (switch back in the data layer, etc.).
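Before even launching the single node run, a cheap local sanity check on the solver file can catch the most obvious mistakes. Below is a minimal sketch, not part of Caffe or CaffeOnSpark: `check_solver` and `parse_flat_prototxt` are hypothetical helper names, the parser only understands flat `key: value` lines (it is not a real protobuf text-format parser), and treating `base_lr`, `max_iter` and `lr_policy` as required is my assumption about what a usable solver needs.

```python
import os

# Assumption: these are the solver fields we insist on. The names are real
# SolverParameter fields, but the choice of "required" ones is ours.
REQUIRED_FIELDS = ("base_lr", "max_iter", "lr_policy")
NET_FIELDS = ("net", "train_net")


def parse_flat_prototxt(text):
    """Collect top-level `key: value` pairs; ignores nested blocks' structure.

    Simplification: everything after '#' is treated as a comment, even
    inside a quoted string.
    """
    fields = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip().strip('"')
    return fields


def check_solver(path):
    """Return a list of human-readable problems; an empty list means no
    obvious mistakes were found (it does not prove the solver is valid)."""
    if not os.path.isfile(path):
        return ["solver file not found: " + path]
    with open(path) as f:
        fields = parse_flat_prototxt(f.read())

    problems = []
    for name in REQUIRED_FIELDS:
        if name not in fields:
            problems.append("missing field: " + name)

    net_paths = [fields[k] for k in NET_FIELDS if k in fields]
    if not net_paths:
        problems.append("no net or train_net field")
    for p in net_paths:
        # Skip URI-style paths (e.g. hdfs://...), which cannot be checked locally.
        if "://" not in p and not os.path.isfile(p):
            problems.append("network file not found: " + p)
    return problems
```

Calling `check_solver("your_solver.prototxt")` on the driver machine and refusing to submit the job when the returned list is non-empty would at least turn some of the slow core dumps into an immediate error message. It will not catch semantic errors inside the network definition; for those, the single node run is still the better test.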
Is it possible to solve the core dump problem by changing the code of CaffeProcessor in CaffeOnSpark?
If the solver config file contains mistakes, the cluster does not fail quickly; only after a long time does it return core dumps. How can I solve this problem?