liuliu / ccv

C-based/Cached/Core Computer Vision Library, A Modern Computer Vision Library
http://libccv.org

Woah, can't believe this exists #238

Closed brappier closed 1 year ago

brappier commented 1 year ago

Hi Liu,

Thanks for the amazing work. I was looking into these hacker-friendly frameworks, and this seems to be the best (better than tinygrad, too).

Would you be able to give a brief overview of how it works, especially the MPS backend? You don't seem to build a static MPS graph.

liuliu commented 1 year ago

It uses the same trick as PyTorch. We build fine-grained ops with MPSGraph and cache each op by its input shapes and data type, so the next execution of the same kernel is faster.
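To make the caching idea concrete, here is a minimal sketch in plain C. It is not ccv's actual implementation: `op_key_t`, `op_cache_get`, `op_cache_clear`, the integer op and dtype ids, and the fixed-size linear cache are all hypothetical names invented for illustration, and `compile_op` is a stub standing in for building and compiling a single-op graph.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DIMS 4
#define CACHE_CAPACITY 64

/* Hypothetical cache key: op kind, element type, and input shape. */
typedef struct {
  int op;    /* made-up id, e.g. 0 = add, 1 = matmul */
  int dtype; /* made-up id, e.g. 0 = float32, 1 = float16 */
  int ndim;
  int dims[MAX_DIMS];
} op_key_t;

/* Stand-in for a compiled single-op graph. */
typedef struct {
  int id;
} compiled_op_t;

typedef struct {
  op_key_t keys[CACHE_CAPACITY];
  compiled_op_t* ops[CACHE_CAPACITY];
  int count;
  int compile_calls; /* how many times the slow path ran */
} op_cache_t;

op_key_t make_key(int op, int dtype, int ndim, const int* dims)
{
  op_key_t key;
  assert(ndim >= 0 && ndim <= MAX_DIMS);
  memset(&key, 0, sizeof(key));
  key.op = op;
  key.dtype = dtype;
  key.ndim = ndim;
  if (ndim > 0)
    memcpy(key.dims, dims, sizeof(int) * (size_t)ndim);
  return key;
}

static int key_equal(const op_key_t* a, const op_key_t* b)
{
  if (a->op != b->op || a->dtype != b->dtype || a->ndim != b->ndim)
    return 0;
  return memcmp(a->dims, b->dims, sizeof(int) * (size_t)a->ndim) == 0;
}

/* Slow path stub: pretend to build and compile a one-op graph. */
static compiled_op_t* compile_op(op_cache_t* cache)
{
  compiled_op_t* compiled = malloc(sizeof(compiled_op_t));
  assert(compiled);
  compiled->id = cache->compile_calls++;
  return compiled;
}

/* Drop every cached op (what a garbage-collect call would do in this sketch). */
void op_cache_clear(op_cache_t* cache)
{
  for (int i = 0; i < cache->count; i++)
    free(cache->ops[i]);
  cache->count = 0;
}

/* Return the cached op for this key, compiling it only on a miss. */
compiled_op_t* op_cache_get(op_cache_t* cache, op_key_t key)
{
  for (int i = 0; i < cache->count; i++)
    if (key_equal(&cache->keys[i], &key))
      return cache->ops[i]; /* hit: same op, dtype, and shape */
  if (cache->count == CACHE_CAPACITY)
    op_cache_clear(cache); /* crude eviction: start over when full */
  compiled_op_t* compiled = compile_op(cache);
  cache->keys[cache->count] = key;
  cache->ops[cache->count] = compiled;
  cache->count++;
  return compiled;
}
```

Calling `op_cache_get` twice with the same op, dtype, and shape returns the same pointer and compiles only once; changing either the shape or the dtype produces a new entry. A real cache would likely use a hash table rather than a linear scan.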

brappier commented 1 year ago

@liuliu How does it know when to clear ops from the cache?

And what is graph.workspaceSize?

liuliu commented 1 year ago

I think graph.garbageCollect() will trigger it (ccv_nnc_dynamic_graph_gc).

workspaceSize is the size you set for CUDA. Certain cuDNN ops require scratch space, and this sets the maximum scratch space for these ops.

brappier commented 1 year ago

Okay, so I manually call graph.garbageCollect() to clear the graph cache?

liuliu commented 1 year ago

Yeah, or when you are under memory pressure, it will be called automatically. That mostly works for CUDA: there you simply hit an allocation failure, and at that point we do the cleanup. On Apple platforms, the allocation failure may not happen because of overcommit, so I suggest you hook that call into memory pressure notifications.
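One way to wire this up on Apple platforms is a libdispatch memory pressure source. The sketch below is an assumption about how such a hook could look, not code from ccv: `pressure_hook_t`, `on_memory_pressure`, `install_pressure_hook`, and `fake_graph_gc` are hypothetical names. The cleanup is passed in as a callback so the example stays self-contained; in a real program the callback would be where `ccv_nnc_dynamic_graph_gc` gets called on your graph.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__APPLE__)
#include <dispatch/dispatch.h>
#endif

/* Hypothetical cleanup callback; a real one would run the graph's GC. */
typedef void (*cleanup_fn)(void* graph);

typedef struct {
  cleanup_fn cleanup;
  void* graph; /* opaque pointer handed back to the callback */
} pressure_hook_t;

/* Stand-in cleanup for illustration: counts how often it ran. */
void fake_graph_gc(void* graph)
{
  int* gc_runs = (int*)graph;
  (*gc_runs)++;
}

/* Event handler with the dispatch_function_t shape: void (*)(void*). */
void on_memory_pressure(void* context)
{
  pressure_hook_t* hook = (pressure_hook_t*)context;
  assert(hook && hook->cleanup);
  hook->cleanup(hook->graph);
}

/* Returns 1 if the hook was installed, 0 if unsupported or creation failed.
 * The hook must outlive the source, which is kept for the process lifetime. */
int install_pressure_hook(pressure_hook_t* hook)
{
#if defined(__APPLE__)
  dispatch_source_t source = dispatch_source_create(
    DISPATCH_SOURCE_TYPE_MEMORYPRESSURE, 0,
    DISPATCH_MEMORYPRESSURE_WARN | DISPATCH_MEMORYPRESSURE_CRITICAL,
    dispatch_get_main_queue());
  if (!source)
    return 0;
  dispatch_set_context(source, hook);
  dispatch_source_set_event_handler_f(source, on_memory_pressure);
  dispatch_resume(source); /* sources start suspended */
  return 1;
#else
  (void)hook; /* no memory pressure source outside Apple platforms */
  return 0;
#endif
}
```

The handler is delivered on the main queue here, which only fires if the main queue is being serviced (a run loop or `dispatch_main`). That choice assumes the graph is also used from the main thread; if it is used elsewhere, the cleanup would need to be serialized with that work.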