pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

Qualcomm AI Engine Direct - Optimize memory usage at runtime #7003

Closed shewu-quic closed 10 hours ago

shewu-quic commented 1 day ago

The QNN backend doesn't need the processed data after qnn_context_create_from_binary, so that data can be released at that point to reduce runtime memory usage.
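The idea can be sketched as follows. This is a minimal, self-contained illustration and not the code in this PR: `ProcessedBuffer`, `Context`, `create_context_from_binary`, and `init_backend` are hypothetical stand-ins, and the sketch assumes the backend receives the processed data in a buffer that it is allowed to release early.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the buffer holding the backend's processed data.
// It is not the ExecuTorch type; it only models "memory that can be freed early".
class ProcessedBuffer {
 public:
  explicit ProcessedBuffer(std::size_t size)
      : data_(static_cast<std::uint8_t*>(std::calloc(size, 1))), size_(size) {}
  ~ProcessedBuffer() { Free(); }
  ProcessedBuffer(const ProcessedBuffer&) = delete;
  ProcessedBuffer& operator=(const ProcessedBuffer&) = delete;

  const std::uint8_t* data() const { return data_; }
  std::size_t size() const { return size_; }

  // Release the memory now instead of at destruction; safe to call twice.
  void Free() {
    std::free(data_);
    data_ = nullptr;
    size_ = 0;
  }

 private:
  std::uint8_t* data_;
  std::size_t size_;
};

// Hypothetical context: it keeps its own state and no pointer into the binary.
struct Context {
  std::size_t loaded_bytes;
};

// Assumption: once this returns, the context no longer reads from `binary`.
Context create_context_from_binary(const ProcessedBuffer& binary) {
  assert(binary.data() != nullptr);
  return Context{binary.size()};
}

// Create the context, then drop the processed data right away so it does not
// stay resident for the lifetime of the loaded program.
Context init_backend(ProcessedBuffer& processed) {
  Context ctx = create_context_from_binary(processed);
  processed.Free();
  return ctx;
}
```

After `init_backend` returns, the buffer reports a null pointer and zero size, while the context remains usable on its own.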

pytorch-bot[bot] commented 1 day ago

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7003

Note: Links to docs will display an error until the docs builds have been completed.

:white_check_mark: No Failures

As of commit f1fe34b11a6b5b79bfd0ef9bf4779760d7feb7c3 with merge base a39ea29ba0b8dabb1e6d133386bf83000743da78: :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot commented 1 day ago

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

shewu-quic commented 1 day ago

Hi @cccclai,

This PR fixes the memory usage issue. Could you please take a look?

Thanks, Hutton

shewu-quic commented 1 day ago

I think the spill-fill buffer is still necessary for 8b because of the memory usage in HTP. This change only reduces PSS, not the DMA buffer.