ym-han closed this issue 3 years ago
Workers not connecting does not sound like a FileTrees issue.
Have you tried running something on the workers without using FileTrees or Dagger, for example pmap(i -> (sleep(1); myid()), 1:20),
just to see that the workers are really set up correctly?
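A slightly fuller version of that check, as a minimal sketch using only the Distributed standard library. The worker count of 2 and the shorter sleep are assumptions made for illustration; on a cluster you would add workers however your setup normally does.

```julia
using Distributed

# Assumption: start 2 local workers, but only if none have been added yet.
nprocs() == 1 && addprocs(2)

# Each task sleeps briefly and returns the id of the process that ran it.
ids = pmap(i -> (sleep(0.1); myid()), 1:20)

# If the workers are connected, every id should be one of workers(),
# and the master process (id 1) should not appear.
println("workers:  ", workers())
println("ids seen: ", sort(unique(ids)))
```

If this hangs or errors, the problem is in the worker setup rather than in FileTrees or Dagger.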
Thanks for pointing that out. I hadn't done that (I'm not very familiar with the Julia distributed computing stack). I'll close this for now.
I got the following error while trying to run FileTrees with multiple workers on an embarrassingly parallel problem. The code I was running is in https://github.com/ym-han/gigaword_64k/blob/main/src/gigaword_64k.jl and https://github.com/ym-han/gigaword_64k/blob/main/src/afp_mapper.jl ; it basically just filters a tree, loads it, processes it, and then saves it.
I did not experience any issues when running it on just one node (i.e., without the parallelism). One thing I haven't tried yet is running this with fewer workers (I had 5-6). The amount of data I was processing wasn't actually that large, so maybe that was the problem.