tensorflow / minigo

An open-source implementation of the AlphaGoZero algorithm
Apache License 2.0

Decouple the conv data format from the input feature layout #974

Open tommadams opened 4 years ago

tommadams commented 4 years ago

On the one hand, it's more efficient for the CPU to generate NCHW input features. On the other, TensorFlow supports NHWC convolutions on a wider variety of platforms.

When I added support for NHWC, I coupled the conv data format to the input feature layout.

We should add another option to support different tensor layouts for input features and convolutions, inserting transpose operations as necessary.
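A minimal sketch of what this decoupling could look like, using NumPy for illustration rather than the actual Minigo code. The function names (`transpose_perm`, `to_conv_layout`), the two layout parameters, and the 17-plane 19x19 feature shape are assumptions made for the example. The idea is that the input feature layout and the conv data format are chosen independently, and a transpose is inserted only when they differ.

```python
import numpy as np


def transpose_perm(src, dst):
    """Return the axis permutation that converts layout `src` into `dst`.

    Layouts are strings such as "NCHW" or "NHWC" (hypothetical helper).
    """
    if sorted(src) != sorted(dst):
        raise ValueError(f"incompatible layouts: {src!r} vs {dst!r}")
    # Output axis i takes its data from the input axis with the same name.
    return [src.index(axis) for axis in dst]


def to_conv_layout(features, input_layout="NCHW", conv_layout="NHWC"):
    """Transpose `features` only if the two layouts differ."""
    if input_layout == conv_layout:
        return features  # no transpose needed
    return np.transpose(features, transpose_perm(input_layout, conv_layout))


# Illustrative shape: a batch of 8 positions, 17 feature planes, 19x19 board.
batch = np.zeros((8, 17, 19, 19), dtype=np.float32)  # generated as NCHW
nhwc = to_conv_layout(batch, "NCHW", "NHWC")
print(nhwc.shape)  # (8, 19, 19, 17)
```

In a TensorFlow graph the same permutation could be passed to `tf.transpose(features, perm=[0, 2, 3, 1])` before the first convolution, so the CPU can keep generating NCHW features while the convolutions run in NHWC.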

brilee commented 4 years ago

Layout also has a performance impact on TPU, so you probably want to consider that as well.

tommadams commented 4 years ago

TPU apparently doesn't support NCHW conv at all...

SHKD13 commented 4 years ago

Is this a purely theoretical topic, or is it part of the MiniGo v18 building process?

tommadams commented 4 years ago

This is more for MLPerf, which uses smaller models and has a much higher ratio of CPU to GPU compute load than a regular Minigo run.

SHKD13 commented 4 years ago

So MLPerf is a kind of "lab" where you can try algorithm changes and improvements on smaller networks before implementing them in a regular full-sized run? Sorry if I've got something wrong :)

tommadams commented 4 years ago

MLPerf is a suite of machine learning benchmarks that many major tech companies collaborate on. Minigo is the reinforcement learning benchmark for MLPerf.

tommadams commented 4 years ago

See https://github.com/tensorflow/minigo/tree/master/ml_perf for the implementation. It's basically a smaller, simplified version of the full pipeline.

SHKD13 commented 4 years ago

Thanks for the clarification and links! Now I can see that I didn't know about the MLPerf background of MiniGo. I thought MG was an enthusiast-driven, independent Go project like LZ or KataGo, but I missed that there is something bigger behind it that drives the MiniGo team.