analogdevicesinc / ai8x-synthesis

Quantization and Synthesis (Device Specific Code Generation) for ADI's MAX78000 and MAX78002 Edge AI Devices
Apache License 2.0

processors #344

Open fzh-adham opened 2 weeks ago

fzh-adham commented 2 weeks ago

Hi, I'm having trouble understanding the logic behind the processors setting in the YAML files. For example, take the HWC (little data) configuration for the CIFAR-100 Simple Model:

arch: ai85ressimplenet
dataset: CIFAR100

layers:
  # Layer 0
  - out_offset: 0x2000
    processors: 0x7000000000000000
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
    data_format: HWC

  # Layer 1
  - out_offset: 0x0000
    processors: 0x0ffff00000000000
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU

  # Layer 2 - re-form data with gap
  - out_offset: 0x2000
    processors: 0x00000000000fffff
    output_processors: 0x00000000000fffff
    operation: passthrough
    write_gap: 1

  # Layer 3
  - in_offset: 0x0000
    in_sequences: 1
    out_offset: 0x2004
    processors: 0x00000000000fffff
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
    write_gap: 1

  # Layer 4 - Residual-1
  - in_sequences: [2, 3]
    in_offset: 0x2000
    out_offset: 0x0000
    processors: 0x00000000000fffff
    eltwise: add
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU

  # Layer 5
  - out_offset: 0x2000
    processors: 0xfffff00000000000
    output_processors: 0x000000fffff00000
    max_pool: 2
    pool_stride: 2
    pad: 1
    operation: conv2d
    kernel_size: 3x3
    activate: ReLU

Why doesn't the sample start enabling processors from the first one (the rightmost bit)? What is the logic behind this?

rotx-maxim commented 2 weeks ago

Why doesn't the sample start enabling processors from the first one (the rightmost bit)? What is the logic behind this?

The processors are distributed because of memory allocation. This isn't needed for every network, but if you look at the "Kernel map" at the top of the synthesis log.txt for your example, you can see how kernel memory is associated with processors. Data memory is also associated with processors. When several layers use fewer than all of the processors, it can help resource allocation to spread the processors used across the available range.
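To see how the masks in the question are spread out, the following minimal sketch decodes a `processors` value into per-group counts. It is not part of ai8x-synthesis: the helper names are hypothetical, and the group size of 16 (four groups across the 64-bit mask) is an assumption made for illustration. The "Kernel map" in log.txt remains the authoritative view of what each processor's memory holds.

```python
# Hypothetical helpers for inspecting a 64-bit `processors` mask.
# Assumption: bit i of the mask enables processor i, and processors
# are viewed in groups of 16 (the group size is illustrative only).

def enabled_processors(mask: int) -> list[int]:
    """Return the indices of all set bits in a 64-bit mask."""
    return [i for i in range(64) if (mask >> i) & 1]


def per_group_counts(mask: int, group_size: int = 16) -> list[int]:
    """Count enabled processors in each consecutive group."""
    counts = [0] * (64 // group_size)
    for i in enabled_processors(mask):
        counts[i // group_size] += 1
    return counts


# Masks copied from layers 0, 1, 2 and 5 of the YAML in the question.
layer_masks = {
    0: 0x7000000000000000,
    1: 0x0FFFF00000000000,
    2: 0x00000000000FFFFF,
    5: 0xFFFFF00000000000,
}

for layer, mask in layer_masks.items():
    print(f"layer {layer}: {mask:#018x} -> {per_group_counts(mask)}")
```

Layers 0 and 1 sit at the high end of the mask, while layer 2 uses the low end, so under this view consecutive layers draw on different parts of the processor range.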

fzh-adham commented 2 weeks ago

Thanks. How does a developer know how to set the processor mapping for each layer of the model?


rotx-maxim commented 2 weeks ago

The number of processors required is set by the number of channels in the layer. The exact selection of which processors to use for a layer is not fixed and there may be several valid answers for any given layer. Ultimately, you may want to modify the processor assignment to optimize resource allocation.
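As a rough illustration of "several valid answers", the sketch below builds a contiguous mask for a given channel count at different starting positions. The function name `contiguous_mask` is hypothetical and not part of ai8x-synthesis. It only covers layers with 1 to 64 channels, and it does not model any additional constraints the synthesis tool may apply when checking a placement.

```python
# Hypothetical helper: build a `processors` mask with `channels`
# consecutive bits set, starting at processor index `first`.
# Assumption: one processor per channel, at most 64 channels.

def contiguous_mask(channels: int, first: int = 0) -> int:
    if not 1 <= channels <= 64:
        raise ValueError("this sketch only handles 1..64 channels")
    if first < 0 or first + channels > 64:
        raise ValueError("mask would not fit in 64 bits")
    return ((1 << channels) - 1) << first


# Three channels can be placed at the low end or at the high end;
# both masks enable the same number of processors.
low = contiguous_mask(3, first=0)
high = contiguous_mask(3, first=60)
print(f"{low:#018x}")
print(f"{high:#018x}")
```

With `first=60` the result matches the layer 0 mask from the CIFAR-100 example, and `contiguous_mask(16, first=44)` reproduces its layer 1 mask.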

fzh-adham commented 6 days ago

Hi, thanks for the reply. Yes, there are several valid answers, but in the samples I have seen for some models, such as ai85net5, which has 5 layers in its YAML file, most of the processors are enabled even though only 3 channels are used. I don't understand why some unused processors are on. For example:

arch: ai85net5
dataset: MNIST

Define layer parameters in order of the layer sequence

layers:

ermanok commented 14 hours ago

The number of processors is set according to the number of input channels of the layer. The AI85Net5 model is defined here and, as you can see, it has 4 convolutional layers and 1 linear layer. The input channels of the convolutional layers are 1, 60, 60 and 56, which is why the YAML file sets 1 (0x0000000000000001), 60 (0xfffffffffffffff0), 60 (0xfffffffffffffff0) and 56 (0x0ffffffffffffff0) processors. The linear layer is run after flattening the input, and it has features of length 12, so the linear layer uses 12 (0x0000000000000fff) processors.
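The counts quoted above can be checked by counting the set bits in each mask. This is a small sketch in plain Python and does not use any ai8x-synthesis code; the layer labels are only for readability.

```python
# Masks as quoted in the reply for the AI85Net5 example.
masks = {
    "conv 1": 0x0000000000000001,
    "conv 2": 0xFFFFFFFFFFFFFFF0,
    "conv 3": 0xFFFFFFFFFFFFFFF0,
    "conv 4": 0x0FFFFFFFFFFFFFF0,
    "linear": 0x0000000000000FFF,
}

# The number of enabled processors is the number of 1 bits in the mask.
counts = {name: bin(mask).count("1") for name, mask in masks.items()}

for name, n in counts.items():
    print(f"{name}: {n} processors")
```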