Hi, I can successfully train GloVe vectors on a RHEL desktop, but when I run the same training on a RHEL-based Hadoop cluster, it fails in this loop in glove.c:
for (b = 0; b < num_iter; b++) {
    fprintf(stderr, "Entry Point\n");
    total_cost = 0;
    for (a = 0; a < num_threads - 1; a++) lines_per_thread[a] = num_lines / num_threads;
    fprintf(stderr, "First\n");
    lines_per_thread[a] = num_lines / num_threads + num_lines % num_threads;
    fprintf(stderr, "Second\n");
    for (a = 0; a < num_threads; a++) pthread_create(&pt[a], NULL, glove_thread, (void *)a);
    fprintf(stderr, "Third\n"); // Nothing is printed after this
    for (a = 0; a < num_threads; a++) pthread_join(pt[a], NULL);
    fprintf(stderr, "Fourth\n");
    for (a = 0; a < num_threads; a++) total_cost += cost[a];
    fprintf(stderr, "Fifth\n");
    fprintf(stderr, "iter: %03d, cost: %lf\n", b+1, total_cost/num_lines);
}
Training gets as far as printing Third (just before the calls to pthread_join), and nothing is printed after that.
I have also tried setting num_threads = 1, but that doesn't help. The exact same code works on the desktop. Can you help?
I am calling the various steps of demo.sh from Python using os.system().
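One thing I could still check, as a rough sketch: the loop above ignores the return values of pthread_create and pthread_join. If pthread_create fails on the cluster (I am only assuming it might, for example because of a resource limit), then calling pthread_join on a pthread_t that was never initialized is undefined behavior and could explain why nothing is printed after Third. The code below is not from glove.c; demo_thread, demo_cost and run_workers are hypothetical names standing in for glove_thread, cost and the create/join part of the loop.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEMO_THREADS 64

/* Hypothetical stand-in for the per-thread cost array in glove.c. */
static double demo_cost[MAX_DEMO_THREADS];

/* Hypothetical stand-in for glove_thread: records a dummy cost for its slot. */
static void *demo_thread(void *arg) {
    long id = (long)arg;
    demo_cost[id] = 1.0;
    return NULL;
}

/* Starts and joins num_threads workers, reporting any pthread error.
   Returns 0 on success, -1 on a bad argument or allocation failure,
   or the first nonzero pthread error code otherwise. */
int run_workers(long num_threads) {
    pthread_t *pt;
    long a, started = 0;
    int rc, first_err = 0;

    if (num_threads < 1 || num_threads > MAX_DEMO_THREADS) return -1;
    pt = malloc((size_t)num_threads * sizeof(pthread_t));
    if (pt == NULL) return -1;

    for (a = 0; a < num_threads; a++) {
        rc = pthread_create(&pt[a], NULL, demo_thread, (void *)a);
        if (rc != 0) {
            /* pthread functions return the error code instead of setting errno. */
            fprintf(stderr, "pthread_create(%ld) failed: %s\n", a, strerror(rc));
            first_err = rc;
            break;
        }
        started++;
    }

    /* Join only the threads that were actually created. */
    for (a = 0; a < started; a++) {
        rc = pthread_join(pt[a], NULL);
        if (rc != 0) {
            fprintf(stderr, "pthread_join(%ld) failed: %s\n", a, strerror(rc));
            if (first_err == 0) first_err = rc;
        }
    }

    free(pt);
    return first_err;
}
```

Calling run_workers(4) from a small main on both the desktop and the cluster (compiled with -pthread) would at least show whether thread creation itself behaves differently there. If it returns 0 on both machines, the difference is more likely inside the worker function or in how the process is launched.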