hughperkins opened this issue 7 years ago
Hi Hugh,
I was looking to extend the functionality of my hardware by using an OpenCL backend for TensorFlow. I haven't used DeepCL extensively, even though I did manage to install it.
I found the wheel file for OSX very useful.
I think a big hurdle for now is Keras 2.0 compatibility; alternatively, a pointer to the Keras 1.1.1 documentation would be good.
I have yet to test more functionality though - it seems the TensorFlow examples fail for me, but the Keras examples work with a few tweaks.
Dexter
I am using a Mac and I wanted to try the TensorFlow library. As I have an AMD graphics card, I could not use the GPU capabilities of stock TensorFlow. With this project I am able to run the TensorFlow examples and try the capabilities of the library. The working version is 0.11; however, the current version is 1.2. Many of the published examples assume the newer version of the library and are not backwards compatible. I would like to ask: how do you see the possibility of adapting and compiling the latest version of the TensorFlow library so that it works with AMD graphics cards as well? Many thanks for the info.
I'm looking for the best performance from speech recognition libraries. Initially I used NervanaSystems's neon. It is optimized, but I need more speed. I have a GeForce GT 710 and 2 PCs with an Intel HD 4600 (Core i3-4350) and an Intel HD 530 (Core i5-6600). Which looks most promising? Today, after a week of investigation, I tried tf-coriander with a downgraded version of Mozilla's DeepSpeech repo (before the commit "Upgrade to Tensorflow 1.0.0"). It worked, but the results are partly strange. Using the 0.12 version of TensorFlow from pip (CPU version) and reducing the number of hidden units of the DeepSpeech network from 2048 to 768, I get 7 seconds per epoch (Core i3-4350). With the Intel HD 4600 (same processor, same network params as for the CPU) I get 50 seconds per epoch. As far as I know, the Core i3-4350 peaks at 10-20 GFLOPS per core, so 20-40 GFLOPS with 2 cores. The HD 4600 has 25-50 GFLOPS, so I don't know why the CPU version is 7 times faster than the OpenCL version.
I'll try the GPU version on the Intel HD 530 ASAP. EDIT: I've run DeepSpeech with the same network params on the Intel HD 530 (Core i5-6600). Epoch time decreased from 50 seconds to 30 seconds, but it's still higher than when training on the CPU (7 seconds).
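The comparison above can be sanity-checked with some quick arithmetic; the peak-GFLOPS figures and epoch times below are just the rough numbers quoted in the post, not independently measured values:

```python
# Rough sanity check of the CPU-vs-iGPU comparison quoted above.
# All peak-GFLOPS figures are the poster's estimates, not benchmarks.

cpu_peak_gflops = (20, 40)    # Core i3-4350: 2 cores at ~10-20 GFLOPS each
igpu_peak_gflops = (25, 50)   # Intel HD 4600

cpu_epoch_s = 7               # observed epoch time, CPU build (768 hidden units)
igpu_epoch_s = 50             # observed epoch time, HD 4600, same network

# If runtime were purely compute-bound, the two should be roughly
# comparable at these peak rates, yet the iGPU is about 7x slower:
slowdown = igpu_epoch_s / cpu_epoch_s
print(f"observed iGPU slowdown: {slowdown:.1f}x")  # ~7.1x
```

So, taking the quoted peak numbers at face value, the gap can't be explained by raw FLOPS alone; something else (kernel efficiency, memory traffic, or launch overhead) would have to account for it.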
I think the memory of the Intel GPUs is on the low side, so the transfer time from RAM to video memory will take a hit?
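For what it's worth, Intel integrated GPUs share system RAM rather than having dedicated video memory, so the cost is driver-side copies rather than a PCIe hop. The transfer hypothesis can still be put on the back of an envelope; the batch shape and bandwidth below are illustrative guesses, not numbers from this thread:

```python
# Back-of-envelope estimate of per-batch host<->device copy cost.
# The batch shape and bandwidth figures are illustrative assumptions,
# not measurements from the thread.

batch = 64                       # hypothetical batch size
timesteps, features = 500, 26    # hypothetical input shape (e.g. MFCC frames)
bytes_per_float = 4

batch_bytes = batch * timesteps * features * bytes_per_float
bandwidth_gbs = 10               # assumed effective copy bandwidth, GB/s

copy_ms = batch_bytes / (bandwidth_gbs * 1e9) * 1e3
print(f"batch size: {batch_bytes / 1e6:.1f} MB, copy ~{copy_ms:.2f} ms")
# -> batch size: 3.3 MB, copy ~0.33 ms
```

Under these assumed numbers the per-batch copies are sub-millisecond, so if the assumptions are anywhere near right, transfer time alone would not explain a 23-second-per-epoch gap; kernel efficiency seems the more likely suspect.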
I am looking to utilize many dozens of iMacs with AMD GPUs that could be used during non-office hours for training via Horovod.
I found a way to use the GPUs! The pip package was awesome!!!! But it doesn't quite work for everything I need.
I'm missing Keras support and Horovod support. I'm hoping that just upgrading to the latest TF version would fix it? Not sure.
Can't use it yet, because of the features missing above.
Please provide a single post stating:
This is kind of an experimental approach to getting feedback :-) . Being starred or not doesn't give me much information about what people are looking for, or whether they are finding it useful, so I'm going to try this approach :-)
Edit: note that I seem to have started adding :+1: to items to indicate I've read them. I probably won't reply in this thread. If you do want a reply, please consider raising a new issue, which I still might not reply to, but I might...