eyadgaran / SimpleML

Machine learning that just works, for effortless production applications
BSD 3-Clause "New" or "Revised" License

Lazy Caching Support ("Diskcache") #58

Open eyadgaran opened 3 years ago

eyadgaran commented 3 years ago

Definition: reproducible outputs that are worth caching (e.g. probabilities, confusion matrices).

Why not use artifacts? Artifacts must be created during training in order to be saved. Cached objects may not be created until much later, when they are actually needed.

Implementation: only enable for structured inputs, such as split objects or persistables, because the input must be mappable to its cached output for the cache to be useful.
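To make the mapping requirement concrete, here is a minimal sketch of deriving a deterministic cache key from a structured input. `SplitRef`, its fields, and `cache_key` are hypothetical names used only for illustration; they are not part of SimpleML's API.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SplitRef:
    """Hypothetical stand-in for a structured input such as a dataset split."""

    dataset_id: str
    split_name: str


def cache_key(fn_name: str, structured_input: SplitRef) -> str:
    """Derive a deterministic key from the function name and the input's identity."""
    # sort_keys keeps the serialized payload stable across runs
    payload = json.dumps(
        {"fn": fn_name, "input": asdict(structured_input)}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

An arbitrary in-memory array has no such stable identity, which is why unstructured inputs would be excluded under this approach.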

eyadgaran commented 2 years ago

Could be coupled with #110 to guarantee inspectable types for function inputs.
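One way inspectable input types could be checked at decoration time is sketched below. This is an assumption about how such a check might look, not what #110 specifies; `Cacheable` and `assert_cacheable_inputs` are hypothetical names.

```python
import inspect
from typing import Callable, get_type_hints


class Cacheable:
    """Hypothetical marker base class for inputs that can be mapped to a cache key."""


def assert_cacheable_inputs(fn: Callable) -> None:
    """Raise TypeError unless fn's first parameter is annotated with a Cacheable subclass."""
    params = list(inspect.signature(fn).parameters.values())
    if not params:
        raise TypeError(f"{fn.__name__} takes no inputs to key a cache on")
    # get_type_hints resolves annotations, including ones written as strings
    annotation = get_type_hints(fn).get(params[0].name)
    if not (inspect.isclass(annotation) and issubclass(annotation, Cacheable)):
        raise TypeError(
            f"{fn.__name__}: first parameter must be annotated with a Cacheable type"
        )
```

Running the check when the decorator is applied would surface an uncacheable signature at import time rather than on the first cache lookup.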

eyadgaran commented 2 years ago

Pseudocode idea:

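from functools import partial
from typing import Callable

def diskcache(save_pattern: str) -> Callable:
    def register(fn: Callable) -> Callable:
        # Record which save pattern this function's cached outputs use
        register_cache_pattern(fn=fn.__name__, save_pattern=save_pattern)
        # Bind fn positionally; fn=fn would collide with the caller's first positional argument
        return partial(cache_wrapped_method, fn)

    return register

def cache_wrapped_method(fn, *args, **kwargs):
    try:
        return retrieve_cache(fn, *args, **kwargs)  # placeholder: look up a cached output
    except CacheMiss:  # placeholder exception meaning nothing is cached yet
        result = fn(*args, **kwargs)
        save_cache(fn, result, *args, **kwargs)  # placeholder: write the output to the cache
        return result

@diskcache(default_save_pattern)
def expensive_operation(persistable, *args, **kwargs):
    ...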
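For reference, a self-contained sketch of the same idea that runs as written. It makes several assumptions not stated in the thread: pickle is the only save pattern, `SAVE_PATTERNS` and `CACHE_DIR` are hypothetical names, the cache lives in a temporary directory, and arguments are keyed by their `repr`. A real version would key on the persistable's own identifier instead.

```python
import hashlib
import pickle
import tempfile
from functools import wraps
from pathlib import Path
from typing import Any, Callable

# Hypothetical cache location; a fresh temporary directory per process
CACHE_DIR = Path(tempfile.mkdtemp(prefix="diskcache_sketch_"))


def _pickle_save(obj: Any, path: Path) -> None:
    with path.open("wb") as f:
        pickle.dump(obj, f)


def _pickle_load(path: Path) -> Any:
    with path.open("rb") as f:
        return pickle.load(f)


# Hypothetical registry mapping a save pattern name to (save, load) functions
SAVE_PATTERNS = {"pickle": (_pickle_save, _pickle_load)}


def diskcache(save_pattern: str = "pickle") -> Callable:
    save, load = SAVE_PATTERNS[save_pattern]

    def register(fn: Callable) -> Callable:
        @wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Assumes every argument has a stable, deterministic repr
            raw_key = repr((fn.__name__, args, sorted(kwargs.items())))
            digest = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
            path = CACHE_DIR / f"{digest}.{save_pattern}"
            if path.exists():
                return load(path)  # cache hit: skip the computation
            result = fn(*args, **kwargs)  # cache miss: compute, then store
            save(result, path)
            return result

        return wrapper

    return register


CALLS = []  # records real executions so cache hits are observable


@diskcache("pickle")
def expensive_operation(persistable_id: str, threshold: float = 0.5) -> dict:
    CALLS.append(persistable_id)
    return {"id": persistable_id, "threshold": threshold}
```

Calling `expensive_operation("model-1", threshold=0.7)` twice runs the function body once; the second call is served from disk. The sketch uses a plain closure with `functools.wraps` rather than `partial`, which keeps the wrapped function's name and docstring.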