Open awearytraveler opened 6 years ago
Unfortunately, we were never able to get Feudal Networks to converge successfully. I've talked to some other researchers who have had difficulty reproducing the paper as well. It's likely that there are a few implementation differences, as the paper doesn't fully describe the architecture.
I ran into the same problem: the loss doesn't decrease.
Hello! Sorry to trouble you. I'd like to ask whether the mode (tmux, child) has a big influence on training. I'm having trouble with tmux and I don't know why.
Does this code achieve the benchmarks given in the paper? I modified this to work on my system, but it doesn't converge after running it for a few days.