analogdevicesinc / ai8x-synthesis

Quantization and Synthesis (Device Specific Code Generation) for ADI's MAX78000 and MAX78002 Edge AI Devices
Apache License 2.0

Synthesis error in the Camvid example #296

Closed kirilllzaitsev closed 1 year ago

kirilllzaitsev commented 1 year ago

Hi, while running the train-quantize-eval-synthesize pipeline for the Camvid example, I encountered an error at the synthesis stage:

Arranging weights... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100%
Storing weights...   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100%
Creating network...  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0%
ERROR: Processor 0: Layer 0 output for CHW=0,0,64 is overwriting input at offset 0x00400700 that was created by the input loader.

Network-related files (ai85net-unet.py, camvid-unet-large.yaml, camvid-unet-large-fakept.yaml) and the Camvid dataset are unchanged. These are the commands I use:

# train
python train.py --lr 0.001 --optimizer adam --epochs 5 --batch-size 4 --gpus 0 \
--deterministic --compress policies/schedule.yaml --qat-policy \
policies/qat_policy_camvid.yaml --model ai85unetlarge --dataset CamVid_s352_c3 \
--use-bias --wd 0 --out-fold-ratio 4 --truncate-test  \
--device MAX78000 \
--out-dir ai85unetlarge_artifacts/train

# quantize
python quantize.py ai85unetlarge_artifacts/train/last/best.pth.tar ai85unetlarge_artifacts/train/last/best-q.pth.tar \
 --device MAX78000 -v
# add a fake passthrough layer (only in the *-fakept.yaml case)
python izer/add_fake_passthrough.py --input-checkpoint-path ai85unetlarge_artifacts/train/last/best-q.pth.tar --output-checkpoint-path ai85unetlarge_artifacts/train/last/best-q-pt.pth.tar --layer-name pt --layer-depth 56 --layer-name-after-pt upconv3

# eval
python train.py --model ai85unetlarge --dataset CamVid_s352_c3 --truncate-test --out-fold-ratio 4 --evaluate \
--save-sample 1 \
--exp-load-weights-from ai85unetlarge_artifacts/train/last/best-q-pt.pth.tar -8 \
--device MAX78000 \
--use-bias \
--batch-size 2 \
--out-dir ai85unetlarge_artifacts/eval

# synthesize
python ai8xize.py --test-dir synthed_net --prefix ai85unetlarge --checkpoint-file \
 ai85unetlarge_artifacts/train/last/best-q.pth.tar --config-file networks/camvid-unet-large.yaml \
--sample-input ai85unetlarge_artifacts/eval/sample_CamVid_s352_c3.npy \
--device MAX78000  \
--compact-data --mexpress --timer 0 --display-checkpoint --verbose --overwrite --board-name FTHR_RevA

MaximGorkem commented 1 year ago

Hi,

For this model, you should use --overlap-data, as shown in the corresponding line of "gen-demos-max78000.sh":

python ai8xize.py --test-dir $TARGET --prefix camvid_unet --checkpoint-file trained/ai85-camvid-unet-large-fakept-q.pth.tar --config-file networks/camvid-unet-large-fakept.yaml $COMMON_ARGS --overlap-data --mlator --no-unload --max-checklines 8192 --new-kernel-loader "$@"

In this case, the input resolution (88x88) consumes nearly the entire data memory, so the output cannot be written without some overlap. However, by specifying an offset before the input, we make sure that the input data has already been consumed before it is overwritten.
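
As a rough illustration of the memory argument above, the arithmetic can be sketched in Python. This is not taken from the ai8x-synthesis code. It assumes 8 KiB of data memory per processor (512 KiB shared by 64 processors), 8-bit data, and a layer 0 output with the same 88x88 size as the input; the helper names are hypothetical, and the exact memory layout should be checked against the MAX78000 documentation.

```python
# Back-of-the-envelope data-memory budget for a single 88x88 channel.
# ASSUMPTION: 512 KiB of data memory split evenly across 64 processors,
# i.e. 8 KiB per processor. Verify against the MAX78000 documentation.
BYTES_PER_PROCESSOR = 512 * 1024 // 64


def channel_bytes(height: int, width: int, bytes_per_pixel: int = 1) -> int:
    """Bytes needed to store one channel (8-bit data by default)."""
    return height * width * bytes_per_pixel


def fits_without_overlap(input_bytes: int, output_bytes: int,
                         capacity: int = BYTES_PER_PROCESSOR) -> bool:
    """True if input and output can sit side by side in the same memory."""
    return input_bytes + output_bytes <= capacity


input_bytes = channel_bytes(88, 88)
# ASSUMPTION: layer 0 keeps the 88x88 spatial size for its output.
output_bytes = channel_bytes(88, 88)
free_bytes = BYTES_PER_PROCESSOR - input_bytes

print(f"input: {input_bytes} B, free: {free_bytes} B, "
      f"fits without overlap: {fits_without_overlap(input_bytes, output_bytes)}")
```

Under these assumptions the input alone takes 7744 of 8192 bytes, so an output of the same size cannot be placed without reusing memory that still holds input data. That is the situation --overlap-data is meant to permit.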
