tensorflow / fold

Deep learning with dynamic computation graphs in TensorFlow
Apache License 2.0

How can I run fold codes on GPU? #61

Open · jueliangguke opened this issue 7 years ago

jueliangguke commented 7 years ago

I have run the TreeLSTM code in sentiment.ipynb on the CPU successfully, but when I try to run the same code on the GPU I get the following error:

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'Adagrad/update_word_embedding/weights/Unique': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices:
  SparseApplyAdagrad: CPU
  UnsortedSegmentSum: GPU CPU
  StridedSlice: GPU CPU
  Shape: GPU CPU
  Unique: GPU CPU
  VariableV2: GPU CPU
  Const: GPU CPU
[[Node: Adagrad/update_word_embedding/weights/Unique = Unique[T=DT_INT32, _class=["loc:@word_embedding/weights"], out_idx=DT_INT32]]]
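For this kind of colocation error, a commonly suggested first step (general TF 1.x advice, not something mentioned in the original report) is to enable soft placement when creating the session, so that ops without a GPU kernel, such as the SparseApplyAdagrad op above, fall back to the CPU while the rest of the graph stays on the GPU. A minimal sketch:

import tensorflow as tf

# allow_soft_placement lets TF place CPU-only ops on the CPU instead of
# failing the colocation constraint; log_device_placement prints where
# each op actually ended up.
config = tf.ConfigProto(allow_soft_placement=True,
                        log_device_placement=True)
sess = tf.Session(config=config)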

pklfz commented 7 years ago

check #10
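I haven't reproduced the exact patch from #10, but given the error above, the usual shape of that kind of fix is to create the embedding weights under a CPU device scope, so the CPU-only SparseApplyAdagrad update can be colocated with the variable. A hedged sketch (variable name and sizes are illustrative only, not taken from layers.py):

import tensorflow as tf

vocab_size, embedding_dim = 10000, 300  # example sizes, not from the issue

with tf.device('/cpu:0'):
    # Pinning the variable to the CPU keeps the CPU-only
    # SparseApplyAdagrad op in a CPU-compatible colocation group.
    weights = tf.get_variable(
        'word_embedding_weights',
        shape=[vocab_size, embedding_dim],
        initializer=tf.random_uniform_initializer(-0.05, 0.05))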

delesley commented 7 years ago

Make sure TensorFlow is compiled with GPU support and that it is actually using the GPU -- you can find instructions in the TensorFlow docs. If TensorFlow can find the GPU, then Fold will use it automatically.
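A quick way to confirm that TensorFlow actually sees the GPU (plain TF 1.x, nothing Fold-specific) is to list the local devices; if no GPU entry shows up, Fold will silently run on the CPU:

import tensorflow as tf
from tensorflow.python.client import device_lib

# Should include a /gpu:0 (or /device:GPU:0) entry if the GPU build
# is installed and the driver/CUDA setup is working.
print(device_lib.list_local_devices())
print(tf.test.gpu_device_name())  # '' if no GPU is visible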

On Thu, May 18, 2017 at 12:31 AM, jueliangguke notifications@github.com wrote:

I have run the treelstm code in sentiment.ipynb, but I'm not sure whether the model was trained on the GPU. My GPU is a K40, and it currently takes about 70 seconds to train the TreeLSTM on SST for one epoch. How can I make Fold code run on the GPU?



jueliangguke commented 7 years ago

Thanks, pklfz! The solution from acelove in #10, which revises tensorflow_fold/blocks/layers.py, works. Before that, I had found another solution on this page https://www.tensorflow.org/versions/r1.0/api_docs/python/tf/Graph, using the following code:

with g.device('/gpu:0'):
    with g.device(None):
        # graph definition lines

def matmul_on_gpu(n):
    if n.type == "MatMul":
        return "/gpu:0"
    else:
        return "/cpu:0"

with g.device(matmul_on_gpu):
    # training lines
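A slightly fuller, runnable sketch of the second approach (the tiny model, shapes, and learning rate below are made up for illustration and are not from the notebook):

import tensorflow as tf

def matmul_on_gpu(n):
    # Route only MatMul ops to the GPU; everything else stays on the CPU.
    return "/gpu:0" if n.type == "MatMul" else "/cpu:0"

g = tf.Graph()
with g.as_default(), g.device(matmul_on_gpu):
    x = tf.placeholder(tf.float32, [None, 4])
    w = tf.get_variable('w', shape=[4, 1])
    y = tf.matmul(x, w)                      # placed on the GPU
    loss = tf.reduce_mean(tf.square(y))
    train_op = tf.train.AdagradOptimizer(0.05).minimize(loss)

with tf.Session(graph=g,
                config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]})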

It looks like the model runs slightly faster with the first solution (23 s vs. 28 s per epoch, and 50-60 s per epoch on the CPU).