Merge ttnn with tt_eager

eyonland commented 1 month ago

Goal

Streamline further development of our OP library.

Background

Operations are split between tt_eager and ttnn. Many ops in ttnn call their tt_eager implementation. Some ops, such as unary and binary, duplicate a lot of code, which creates two different pathways in C++ and Python. This confuses new contributors and increases the time required to make changes. For example, as we add support for queue_id and output_tensor, we have to add them to ops in both libraries.

We want to merge the two by moving code from tt_eager to ttnn, consolidating ops in one place.
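A minimal sketch of why a single pathway helps, under stated assumptions: `Tensor`, `kDefaultQueueId`, and `add` below are hypothetical stand-ins written for illustration, not the real ttnn or tt_eager API. With one implementation per op, optional arguments such as queue_id and output_tensor are added once rather than in two libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device tensor; not the real ttnn tensor type.
struct Tensor {
    std::vector<float> data;
};

// Hypothetical default queue id, used when the caller does not pick one.
constexpr std::uint8_t kDefaultQueueId = 0;

// Single entry point for an element-wise add. Because there is only one
// implementation, queue_id and output_tensor are introduced here once.
inline Tensor add(const Tensor& a,
                  const Tensor& b,
                  std::uint8_t queue_id = kDefaultQueueId,
                  std::optional<Tensor> output_tensor = std::nullopt) {
    assert(a.data.size() == b.data.size());
    (void)queue_id;  // A real op would dispatch on this queue; the sketch ignores it.

    // Reuse the caller's buffer when one is supplied, otherwise start empty.
    Tensor out = std::move(output_tensor).value_or(Tensor{});
    out.data.resize(a.data.size());
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        out.data[i] = a.data[i] + b.data[i];
    }
    return out;
}
```

Calling `add(a, b)` uses the defaults, while `add(a, b, 1, std::move(preallocated))` passes both optional arguments through the same function.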

Guiding principles

Keep everything related to a single op at a single place
Code organization reflects ownership
User documentation and code organization should be aligned

Milestones

[x] Phase 1 - Smoke out blockers and Align - June 12
[x] Phase 2 - Pave the way - June 12 - June 19
[ ] Phase 3 - Scale migration of ops - June 19 - Aug 17 (in progress now)
[ ] Phase 4 - Remove tt_lib - July
[ ] Phase 5 - Remove tt_eager - mid Aug

Example instead of 1000 words

See how files are consolidated in ttnn, instead of being split between tt_eager and ttnn. This is the core of the change this effort drives.

From 😫

tt_eager
└── tt_dnn
    └── op_library
        ├── ...
        └── eltwise_binary
            ├── kernels
            │   ├── compute
            │   │   └── eltwise_binary.cpp
            │   └── dataflow
            │       └── reader_binary_interleaved_start_id.cpp
            ├── multi_core
            │   └── eltwise_binary_op_multi_core.cpp
            ├── eltwise_binary_op.cpp
            └── eltwise_binary_op.hpp
ttnn
└── cpp
    ├── pybind11
    │   └── operations
    │       ├── ...
    │       └── binary.hpp
    └── ttnn
        ├── op_library
        │   ├── ...
        │   └── binary
        │       ├── binary_op.hpp
        │       └── binary_op.cpp
        └── operations
            ├── ...
            └── binary.hpp

to 😎

ttnn
└── cpp
    └── ttnn
        └── operations
            ├── ...
            └── eltwise
                └── binary
                    ├── device
                    │   ├── kernels
                    │   │   ├── compute
                    │   │   │   └── eltwise_binary.cpp
                    │   │   └── dataflow
                    │   │       └── reader_binary_interleaved_start_id.cpp
                    │   ├── binary_op.cpp
                    │   ├── binary_op.hpp
                    │   ├── binary_program_factory.cpp
                    │   └── binary_program_factory.hpp
                    ├── binary.cpp
                    ├── binary.hpp
                    ├── binary_pybind.cpp
                    └── binary_pybind.hpp

All files related to an operation live in that operation's folder. The binary op gets its own folder and its own owner.

binary.hpp - Op interface and registrations
binary_pybind.hpp - bindings
binary_program_factory.hpp - kernel setup
binary_op.hpp/.cpp - the operation itself, including validation, program selection, etc.
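The division of roles listed above can be sketched in a single file. This is an illustration only: every type and function name here (`Tensor`, `Program`, `BinaryOp`, `plan_add`, the two `make_*_program` helpers, and the size threshold) is hypothetical and does not reflect the actual ttnn code. The Python bindings layer is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in types for this sketch.
struct Tensor {
    std::vector<float> data;
};
struct Program {
    std::string name;
};

// Role of device/binary_program_factory.hpp: kernel setup.
inline Program make_single_core_program() { return Program{"single_core"}; }
inline Program make_multi_core_program() { return Program{"multi_core"}; }

// Role of device/binary_op.hpp/.cpp: validation and program selection.
struct BinaryOp {
    // Arbitrary threshold chosen for this sketch only.
    static constexpr std::size_t kMultiCoreThreshold = 1024;

    void validate(const Tensor& a, const Tensor& b) const {
        if (a.data.size() != b.data.size()) {
            throw std::invalid_argument("binary op: input sizes differ");
        }
    }

    Program select_program(const Tensor& a) const {
        return a.data.size() >= kMultiCoreThreshold ? make_multi_core_program()
                                                    : make_single_core_program();
    }
};

// Role of binary.hpp: the interface that callers (and bindings) see.
inline Program plan_add(const Tensor& a, const Tensor& b) {
    BinaryOp op;
    op.validate(a, b);
    return op.select_program(a);
}
```

Keeping these layers in one folder means a change to the interface, the validation rules, or the kernel setup touches neighboring files with a single owner.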

Host and device "programs" are split. The host program can serve as a golden reference ("gold") for comparison in tests.
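One way to read the "gold" idea, as a rough sketch: a plain CPU implementation produces the expected values, and a test compares the device output against them within a tolerance. The names `host_add` and `matches_golden` are hypothetical, and the device result is assumed to have been copied back into a `std::vector<float>`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Host "program": a plain CPU reference for element-wise add.
inline std::vector<float> host_add(const std::vector<float>& a,
                                   const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}

// Returns true when every element of `actual` is within `tolerance` of
// the matching element of `golden`. A NaN difference counts as a mismatch.
inline bool matches_golden(const std::vector<float>& actual,
                           const std::vector<float>& golden,
                           float tolerance) {
    if (actual.size() != golden.size()) {
        return false;
    }
    for (std::size_t i = 0; i < actual.size(); ++i) {
        const float diff = std::fabs(actual[i] - golden[i]);
        if (!(diff <= tolerance)) {
            return false;
        }
    }
    return true;
}
```

In a test, the vector read back from the device would be passed as `actual` and the result of `host_add` as `golden`.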

Scope (tbd)

[x] #9609
[x] #10467
[ ] #10532
[x] #10531
[x] #9377
[x] #9490
[x] #9492
[ ] #9493
[x] #9486
[ ] #9489
[ ] #9487
[ ] #9488
[ ] #9527
[x] #10137
[x] #9911
[x] #9491
[x] #9733
[ ] #9785
[x] #10378
[ ] #10163
[x] #9628
[x] #9874
[x] #9871
[ ] Move Moreh ops to ttnn
[x] Merge tt_dnn into ttnn
[x] Merge tt_lib
[x] Merge tt_eager

We might want to prioritize migrating these:

[ ] Move bcast to ttnn with output_tensor and queue_id
[x] Move permute to ttnn with output_tensor and queue_id
[x] Move recip to ttnn with output_tensor and queue_id
[ ] Move unpad to ttnn with output_tensor and queue_id

ttnn/ttnn/api.rst

ayerofieiev-tt commented 1 month ago

Considerations

During the discussion about the structure we considered:

1. `A) Group bindings` vs `B) Group everything related to a given operation`

When an operation's interface changes, it is convenient to have its binding nearby
Changes to the interface are frequent at this point
We want the owner to maintain the op end to end, and we do not plan to have a separate owner just for bindings

2. `A) Related operations split into own files` vs `B) Related operations in the same file`

Splitting makes it easier to locate and navigate to a specific operation
Splitting highlights dependencies
Splitting leads to more files and deeper structures :(
There are plenty of cases where a split does not make sense, and we need to accommodate those adequately

SeanNijjar commented 3 weeks ago

A question about pipeline setup.

We currently have tests invoked in run_tt_eager.py. I'm assuming we'll keep the test invocations there, update the test file paths as the tests are migrated, and finish the migration by simply renaming run_tt_eager.py.

yan-zaretskiy commented 2 weeks ago

As I am porting my first op, I realize that I take issue with the <op>_program_factory.hpp naming convention. We don't write a factory here. "Factory" has a very specific meaning: an object that creates other objects. Here we create a file that defines the programs themselves, so it should be renamed to <op>_programs.hpp to reflect what is actually inside.
