Open brayanhenao opened 7 years ago
I'm not quite sure about the @mpgen problem you mentioned, but it does take several days to run train.py with CPU TensorFlow. A GPU will definitely speed training up, down to several hours.
You need to give the details of the platform you are using: Python 2 or Python 3, Windows or Fedora? I ran into this problem on Windows with Python 3. After reading the official Python documentation, I found that on Windows the functions used with multiprocessing must be defined at global (module) level, which is different from the Linux environment.
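For reference, here is a minimal sketch of the pattern the Python docs describe for Windows, where each child process starts by re-importing the main module. The names `make_batch` and `run_pool` are made up for illustration and are not taken from train.py.

```python
import multiprocessing


def make_batch(seed):
    # Hypothetical stand-in for expensive per-batch work.
    return [seed * 10 + i for i in range(3)]


def run_pool(seeds):
    # The worker is a module-level function, so child processes can
    # find it by name when the task is sent to them.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(make_batch, seeds)


if __name__ == "__main__":
    # On Windows the child re-imports this module; without this guard
    # the process-starting code would run again in every child.
    multiprocessing.freeze_support()
    print(run_pool([1, 2, 3]))
```

Whether this alone fixes @mpgen depends on how that decorator is written, so treat it as a starting point rather than a confirmed fix.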
@frischzenger Hi! I'm using Windows 10 with Python 3.
@Weiguo2000 This is the problem that I'm talking about:
D:\number_plate_recog\deep-anpr-master>train.py
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "BestSplits" device_type: "CPU"') for unknown op: BestSplits
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "CountExtremelyRandomStats" device_type: "CPU"') for unknown op: CountExtremelyRandomStats
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "FinishedNodes" device_type: "CPU"') for unknown op: FinishedNodes
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "GrowTree" device_type: "CPU"') for unknown op: GrowTree
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "ReinterpretStringToFloat" device_type: "CPU"') for unknown op: ReinterpretStringToFloat
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "SampleInputs" device_type: "CPU"') for unknown op: SampleInputs
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "ScatterAddNdim" device_type: "CPU"') for unknown op: ScatterAddNdim
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TopNInsert" device_type: "CPU"') for unknown op: TopNInsert
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TopNRemove" device_type: "CPU"') for unknown op: TopNRemove
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TreePredictions" device_type: "CPU"') for unknown op: TreePredictions
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "UpdateFertileSlots" device_type: "CPU"') for unknown op: UpdateFertileSlots
Traceback (most recent call last):
File "D:\number_plate_recog\deep-anpr-master\train.py", line 267, in
D:\number_plate_recog\deep-anpr-master>Traceback (most recent call last):
File "
It looks like the problem you mentioned only occurs on Windows. I didn't have it on either Ubuntu or Mac.
Yes, I can back this up. On Windows the multiprocessing fails, and commenting out @mpgen is the only way out, but on Linux it works right out of the box.
Would commenting out @mpgen make the training sequential, as if I used a CPU? It does run for me like that, though I'm not sure whether it is slow or fast in comparison: time for 60 batches 301.733... and on later batches 64.23...
Yup, it makes it use only one core instead of all the cores available. So even on a CPU this is slower, since it uses 1 core instead of the 4 (or n) available.
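To illustrate what is lost, here is a rough sketch of generating items in a separate process and handing them over through a bounded queue, written so that it should also start on Windows. It is not the repo's @mpgen; `produce_squares` and `background_batches` are hypothetical names, and squaring a number stands in for building a training batch.

```python
import multiprocessing


def produce_squares(queue, count):
    # Runs in the child process: generate items and push them to the queue.
    for i in range(count):
        queue.put(i * i)  # hypothetical stand-in for building a batch
    queue.put(None)  # sentinel: no more items


def background_batches(count, max_queued=3):
    # Yield items produced by a separate process, so generation can
    # overlap with whatever the consumer does with each item.
    queue = multiprocessing.Queue(max_queued)
    proc = multiprocessing.Process(target=produce_squares, args=(queue, count))
    proc.start()
    try:
        while True:
            item = queue.get()
            if item is None:
                break
            yield item
    finally:
        # Stop the producer if the consumer quit early, then reap it.
        if proc.is_alive():
            proc.terminate()
        proc.join()


if __name__ == "__main__":
    print(list(background_batches(5)))
```

The process target is a module-level function and the entry point sits under the `__main__` guard, which are the two things Windows needs here.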
Hi, as you know, the @mpgen in the train.py file is generating some errors, so I commented it out. The problem is that training then took a really long time. I only use the CPU version of TensorFlow (I don't have an NVIDIA GPU, only AMD). For 54k batches it took around 4 days. Does anyone know how to fix the mpgen error? PS: Excuse my English.