Open dmakoviichuk-tt opened 2 weeks ago
Hey Denys, this is something that has come up before. I haven't had the chance to look into optimizing it but there definitely is room for optimizations in trying to deduce where the block to free should be inserted back into the free list.
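For illustration only, here is a minimal sketch of one way to find the insertion point without walking the list: keep free blocks in an address-ordered map, so that locating the neighbours of a freed block is a logarithmic lookup and coalescing only touches the two adjacent entries. This is not the tt_metal implementation; `FreeBlocks`, `deallocate`, `num_blocks`, and `size_at` are hypothetical names, and the sketch assumes blocks are identified by a start address and a size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical address-ordered free list: key = block start, value = block size.
class FreeBlocks {
public:
    // Return [addr, addr + size) to the free list, merging with free neighbours.
    void deallocate(std::uint64_t addr, std::uint64_t size) {
        assert(size > 0);
        // First free block starting at or after addr: O(log n) lookup.
        auto next = blocks_.lower_bound(addr);
        // Merge with the preceding free block if it ends exactly at addr.
        if (next != blocks_.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second == addr) {
                addr = prev->first;
                size += prev->second;
                blocks_.erase(prev);  // `next` stays valid after this erase
            }
        }
        // Merge with the following free block if it starts where this one ends.
        if (next != blocks_.end() && addr + size == next->first) {
            size += next->second;
            next = blocks_.erase(next);
        }
        // Insert the (possibly merged) block just before `next`.
        blocks_.emplace_hint(next, addr, size);
    }

    std::size_t num_blocks() const { return blocks_.size(); }

    // Size of the free block starting at addr, or 0 if there is none.
    std::uint64_t size_at(std::uint64_t addr) const {
        auto it = blocks_.find(addr);
        return it == blocks_.end() ? 0 : it->second;
    }

private:
    std::map<std::uint64_t, std::uint64_t> blocks_;
};
```

Freeing `[0, 16)`, then `[32, 48)`, then `[16, 32)` leaves a single free block of 48 bytes at address 0. Whether a tree-based container actually beats the current approach would need to be checked against a profile of the nanogpt run.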
Hi @abhullar-tt, I think the current freelist allocator could be slightly optimized overall, but the algorithm will remain the same. Not sure if it will help a lot.
We previously talked about exploring something like: http://www.gii.upv.es/tlsf/
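For context, TLSF (Two-Level Segregated Fit) keeps free blocks in size classes indexed by two levels: the first level is the power of two of the block size, and the second level splits each power-of-two range linearly into equal subranges. A rough sketch of that size-to-index mapping is below. The names `tlsf_mapping`, `TlsfIndex`, and `kSecondLevelBits` are hypothetical, the value 4 is an arbitrary choice, and the sketch only handles sizes of at least `2^kSecondLevelBits`.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 2^4 = 16 second-level subranges per power-of-two range.
constexpr unsigned kSecondLevelBits = 4;

struct TlsfIndex {
    unsigned first;   // floor(log2(size))
    unsigned second;  // linear subrange within [2^first, 2^(first+1))
};

// Position of the highest set bit of x (x must be non-zero).
inline unsigned floor_log2(std::uint64_t x) {
    unsigned r = 0;
    while (x >>= 1) ++r;
    return r;
}

// Map a block size to its (first, second) size-class index.
inline TlsfIndex tlsf_mapping(std::uint64_t size) {
    assert(size >= (1ull << kSecondLevelBits));
    const unsigned fl = floor_log2(size);
    // Take the kSecondLevelBits bits just below the top bit of size.
    const unsigned sl = static_cast<unsigned>(
        (size >> (fl - kSecondLevelBits)) - (1ull << kSecondLevelBits));
    return {fl, sl};
}
```

With this mapping, a size of 1024 falls in class (10, 0) and a size of 1088 in class (10, 1). A full allocator would pair the mapping with per-level bitmaps so that a suitable non-empty class can be found in constant time.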
@abhullar-tt looks interesting. But as a first step, I think it is better to optimize the existing one so that we don't introduce new issues.
Yes, definitely agree, and I know there is room to optimize the existing implementation.
Describe the bug tt::tt_metal::allocator::FreeList::deallocate takes ~5% of the total host time during nanogpt training.
To Reproduce Run nanogpt training.
Expected behavior It should be faster.