Closed zygmuntz closed 10 years ago
The Learner reports the fraction of free GPU memory (the GPUmem field) at the end of each line. It's showing just 3% left after the first pass, so it's definitely going to run out of memory on the second pass. The GPU has no native GC, and allocation is so expensive that it's probably not practical to use one. Instead we use a cache, which you can clear manually.
Type:
resetGPU; Mat.clearCaches
to clear the cache and the GPU's allocator.
We may automate this inside the Learner in the next release. The downside is that it will clear any other arrays already residing in the GPU's memory.
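To make the idea of a manually cleared cache concrete, here is a minimal toy sketch in plain Scala. It is not BIDMat's implementation: `ToyDevice`, `alloc`, and `clearCaches` are hypothetical names chosen for illustration, and the "device" is just a byte counter. It only shows why a second run can fail when buffers from the first run are still cached, and why an explicit clear fixes it.

```scala
import scala.collection.mutable

// Toy model only: ToyDevice, alloc and clearCaches are hypothetical names,
// not BIDMat's API. Buffers stay cached until clearCaches() is called.
final class ToyDevice(val capacity: Long) {
  private val cache = mutable.Map.empty[String, Long] // buffer key -> bytes held
  private var used = 0L

  // Fraction of the device that is still free, like the GPUmem field in the log.
  def freeFraction: Double = (capacity - used).toDouble / capacity

  // Reuse the cached buffer for a known key; otherwise claim fresh memory.
  def alloc(key: String, bytes: Long): Unit =
    if (!cache.contains(key)) {
      if (used + bytes > capacity)
        throw new RuntimeException(s"toy alloc failed for $key")
      cache(key) = bytes
      used += bytes
    }

  // Manual stand-in for a garbage collector: drop every cached buffer.
  def clearCaches(): Unit = {
    cache.clear()
    used = 0L
  }
}
```

With `new ToyDevice(100)`, allocating 97 units under one key leaves 3% free. A further 97-unit request under a new key then throws, and succeeds again once `clearCaches()` has been called. As in the real case, clearing also discards every other buffer that was held.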
I am going through the quickstart example on Windows 7. When I try to call
mm.train
a second time, I get the following error. I have to exit bidmach and start it again to be able to train again.

scala> mm.train
corpus perplexity=5582,125391
pass= 0
2,00%, ll=-0,693, gf=0,116, secs=6,7, GB=0,02, MB/s= 2,86, GPUmem=0,03
16,00%, ll=-0,134, gf=0,630, secs=15,0, GB=0,12, MB/s= 8,10, GPUmem=0,03
30,00%, ll=-0,123, gf=0,825, secs=21,9, GB=0,22, MB/s=10,16, GPUmem=0,02
44,00%, ll=-0,102, gf=0,930, secs=28,7, GB=0,33, MB/s=11,31, GPUmem=0,02
58,00%, ll=-0,094, gf=0,995, secs=35,6, GB=0,43, MB/s=12,04, GPUmem=0,02
72,00%, ll=-0,074, gf=1,040, secs=42,4, GB=0,53, MB/s=12,49, GPUmem=0,02
87,00%, ll=-0,085, gf=1,075, secs=49,1, GB=0,63, MB/s=12,89, GPUmem=0,02
100,00%, ll=-0,069, gf=1,097, secs=55,8, GB=0,73, MB/s=13,02, GPUmem=0,02
Time=55,8000 secs, gflops=1,10
scala> mm.train
corpus perplexity=5582,125391
java.lang.RuntimeException: CUDA alloc failed initialization error
at BIDMat.GMat$.apply(GMat.scala:1094)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:1780)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:1814)
at BIDMat.GMat$.apply(GMat.scala:1100)
at BIDMach.models.RegressionModel.init(Regression.scala:29)
at BIDMach.models.GLM.init(GLM.scala:25)
at BIDMach.Learner.init(Learner.scala:37)
at BIDMach.Learner.train(Learner.scala:45)
at .(:26)
at .()
at .(:7)
at .()
at $print()
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
05)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:83)
at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:96)
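Since the progress lines in the log carry a GPUmem field, one way to check free GPU memory before calling train again is to read that field from the last progress line. The helper below is a small sketch in plain Scala, not part of BIDMach: `GpuMemField` and `parse` are hypothetical names, and the sketch assumes the field always appears as `GPUmem=<number>` with either a comma or a dot as the decimal mark, as in the output above.

```scala
// Hypothetical helper, not a BIDMach API: reads the GPUmem field of a progress line.
object GpuMemField {
  // Matches e.g. "GPUmem=0,02" or "GPUmem=0.02" and captures the number.
  private val Pattern = """GPUmem=(\d+(?:[.,]\d+)?)""".r

  // Returns the free-memory fraction, or None when the line has no GPUmem field.
  def parse(line: String): Option[Double] =
    Pattern.findFirstMatchIn(line).map(_.group(1).replace(',', '.').toDouble)
}
```

For the last progress line of the first run, `GpuMemField.parse` yields `Some(0.02)`, meaning about 2% of GPU memory is still free. Lines without the field, such as the `Time=` summary, yield `None`.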