Closed. normanheckscher closed this 7 years ago
Well, I like it, but I just discovered that something about the latest TensorFlow (which I have to use in order to use "FileWriter" in your patch) slows down my compute speed by a factor of 5. What used to take about 0.6 seconds per count now takes 3+ seconds per count. Does anyone know what the cause of this is? I verified it's not your patch causing this, so no worries there. I'm going to try to find a version that still has the FileWriter attribute but doesn't slow my machine down like that. It was already going to take 3 weeks to train this set, so 15+ weeks isn't gonna work.
@TheOncomingStorm I just tried moving the FileWriter call outside the batch loop so that it was called on each epoch. No noticeable difference from my perspective. How did you ascertain that this is the cause of slowing down your learning?
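For readers following along, the change described above can be sketched roughly as below. This is only an illustration of the loop structure, not the actual patch: `SummaryWriterStub` is a hypothetical stand-in for TensorFlow's `FileWriter`, and `train`, its parameters, and the placeholder cost are assumptions made up for the example.

```python
class SummaryWriterStub:
    """Hypothetical stand-in for a TensorBoard FileWriter (not the real API)."""

    def __init__(self, logdir):
        self.logdir = logdir
        self.events = []

    def add_summary(self, value, step):
        # The real writer would serialize an event to disk; here we just record it.
        self.events.append((step, value))


def train(num_epochs, batches_per_epoch, logdir="logs"):
    """Create the writer once per epoch, outside the batch loop."""
    writers = []
    step = 0
    for epoch in range(num_epochs):
        # Writer construction sits here, not inside the inner loop.
        writer = SummaryWriterStub(logdir)
        writers.append(writer)
        for batch in range(batches_per_epoch):
            cost = 1.0 / (step + 1)  # placeholder for the real training cost
            writer.add_summary(cost, step)
            step += 1
    events_logged = sum(len(w.events) for w in writers)
    return len(writers), events_logged
```

With 2 epochs of 3 batches each, this constructs 2 writers and logs 6 cost values, rather than constructing a writer for each of the 6 batches.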
I discovered it because the original word-rnn code, run under TF 0.12 without your changes, runs slow as well. I've no idea why this started, but a 5x slowdown seems extreme. I just recompiled TF 0.11, and the code (again, minus your changes) is 5x faster than under 0.12.
I am using the pre-compiled version of 0.12, so I'm gonna compile it really quick and see if that is the issue. But I'm pretty sure the 0.11 version on that machine was pre-compiled, so I have no idea what the cause is yet. I'll report back in a bit.
Well, thanks to a couple of changes with zlib (they updated last night) and a couple of other things, it won't even configure right now, so I'll be waiting till everyone gets the code stabilized.
Never mind. I thought I had upgraded to the GPU version but apparently it went back to the CPU-only version. I'm back to my normal speed now, and I can restart per your changes. Thanks a ton.
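A quick way to rule out this kind of mix-up is to ask TensorFlow which devices it can see. The sketch below assumes the `tensorflow.python.client.device_lib` module is importable in the installed build; `gpu_device_names` is a hypothetical helper name introduced only for this example.

```python
def gpu_device_names(devices):
    """Return the names of devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        # Assumption: this module path exists in the installed TensorFlow build.
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        gpus = gpu_device_names(device_lib.list_local_devices())
        print("GPUs visible to TensorFlow:", gpus or "none (CPU-only build or no GPU found)")
```

If the list comes back empty, the install is most likely the CPU-only package, or the GPU is not visible to it.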
No worries. Glad it works for you. The code isn't perfectly clean and is a bit of a spaghetti hack. I'm looking at a few further improvements when I get some time away from reading papers.
@normanheckscher @TheOncomingStorm Thanks for the discussion and the wonderful code. May I add you as developers of this repo?
Sure, I'll be glad to test and report what I can.
Begin logging with TensorBoard. Starting with cost.