tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0
443 stars 64 forks source link

[Blackhole bringup] Wider addresses in NOC cmd buffer programming sequence #8704

Closed abhullar-tt closed 4 months ago

abhullar-tt commented 5 months ago

TODO:

abhullar-tt commented 5 months ago

FYI @pgkeller, It doesn't look like any drastic changes introduced for BH noc_nonblocking_api. A lot of the changes to BH version needed because BBE and Metal versions of WH noc_nonblocking_api.h have diverged with BH version being copied directly from BBE.

I am assuming that the other tensix files have also diverged. I can go through those as well and make sure our BH versions are incremental steps from WH versions rather than copied of BBE variants. Should we consider going through a commonization exercise (between Metal and BBE) and add these files to UMD?

API additions/removals in BH noc_nonblocking_api to match diverged Metal WH noc_nonblocking_api

API changes

ncrisc_noc_fast_read

ncrisc_noc_fast_write

Add ncrisc_noc_fast_write_loopback_src where dest_addr split between NOC_RET_ADDR_LO/MID/HI

ncrisc_noc_blitz_write_setup

Remove transaction_id from:

noc_init (rename from ncrisc_noc_init and port Metal WH version)

Port WH version of ncrisc_noc_full_sync

noc_fast_posted_write_dw_inline

noc_fast_atomic_increment

abhullar-tt commented 5 months ago

@davorchap do you anticipate needing to address over 4 GB? If not then we don't need the higher bits of src_addr/dest_addr and can ignore writing to *_MID/*_HI registers.

@pgkeller on WH we do write to *_LO and *_MID on WH, eg:

  NOC_CMD_BUF_WRITE_REG(noc, cmd_buf, NOC_TARG_ADDR_LO, (uint32_t)src_addr);
  NOC_CMD_BUF_WRITE_REG(noc, cmd_buf, NOC_TARG_ADDR_MID, src_addr >> 32);
abhullar-tt commented 4 months ago

Related to noc_nonblocking_api: seeing a hang in ncrisc_noc_reads_flushed

abhullar-tt commented 4 months ago

Related to noc_nonblocking_api: seeing a hang in ncrisc_noc_reads_flushed

this was due to bug in setting a HI register (value needed to be right shifted by an additional 4 bits)