Wrap up device memory manager to avoid unnecessary memory allocation

This PR closes #5. It implements a wrapper around the device memory manager that only calculates memory consumption when a tensor is created and defers the actual physical allocation until it is needed (i.e., for profiling and fingerprint checking).
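
To illustrate the idea, here is a minimal sketch of deferred allocation. It is not the code in this PR: `LazyMemoryManager`, `Region`, `reserve`, `materialize`, and `release` are hypothetical names, `std::malloc` stands in for the device allocator so the example runs on the host, and the 16-byte alignment is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical sketch; names and layout are assumptions, not the Mirage API.
class LazyMemoryManager {
public:
  // A tensor's reservation: only an offset and a size are recorded.
  struct Region {
    std::size_t offset;
    std::size_t size;
  };

  LazyMemoryManager() = default;
  LazyMemoryManager(const LazyMemoryManager&) = delete;
  LazyMemoryManager& operator=(const LazyMemoryManager&) = delete;
  ~LazyMemoryManager() { release(); }

  // Called at tensor creation: bookkeeping only, no physical allocation.
  Region reserve(std::size_t bytes) {
    std::size_t offset = align_up(total_bytes_);
    total_bytes_ = offset + bytes;
    return Region{offset, bytes};
  }

  // Called right before profiling or fingerprint checking.
  // Grows the backing buffer if more regions were reserved since the last call;
  // pointers obtained from data() before a growth must not be reused.
  void materialize() {
    if (allocated_bytes_ >= total_bytes_) return;
    // std::malloc/std::realloc stand in for a device allocator such as cudaMalloc.
    void* p = std::realloc(base_, total_bytes_);
    if (p == nullptr) throw std::bad_alloc();
    base_ = static_cast<char*>(p);
    allocated_bytes_ = total_bytes_;
  }

  // Address of a region inside the backing buffer; valid only after materialize().
  void* data(const Region& r) const {
    assert(base_ != nullptr && r.offset + r.size <= allocated_bytes_);
    return base_ + r.offset;
  }

  // Frees the physical memory but keeps the reservations.
  void release() {
    std::free(base_);
    base_ = nullptr;
    allocated_bytes_ = 0;
  }

  bool is_materialized() const { return base_ != nullptr; }
  std::size_t total_bytes() const { return total_bytes_; }

private:
  static std::size_t align_up(std::size_t n) { return (n + 15) / 16 * 16; }

  std::size_t total_bytes_ = 0;      // bytes reserved so far
  std::size_t allocated_bytes_ = 0;  // bytes physically allocated
  char* base_ = nullptr;             // backing buffer, null until materialized
};
```

In this sketch, creating tensors during the search only advances a counter; a caller would invoke `materialize()` once before profiling or fingerprint checking and `release()` afterwards.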

It also reduces the side effects of generating graph operators, which is important for parallelizing the search procedure.

TODO:

- [ ] Free memory after profiling and fingerprint checking.
- [ ] Pass existing examples.

Wrap up device memory manager to avoid unnecessary memory allocation #19