richfitz / storr

:package: Object cacher for R
http://richfitz.github.io/storr

Consider not writing objects already in the cache #83

Closed wlandau closed 6 years ago

wlandau commented 6 years ago

Diving from $set() down to try_write_serialized_rds() for an RDS cache, I do not see anything to prevent the same object from being written twice.

cache <- storr::storr_rds("cache")
cache$set("really_large_data", really_large_data)
cache$set("really_large_data", really_large_data) # Can we just check the hash here and return?

richfitz commented 6 years ago

This happens at a higher level already:

> st <- storr::storr_rds(tempfile())
> st$set_value
function (value, use_cache = TRUE) 
{
    value_ser <- self$serialize_object(value)
    hash <- self$hash_raw(value_ser)
    if (!(use_cache && exists0(hash, self$envir))) {
        if (!self$driver$exists_object(hash)) {
            value_send <- if (self$traits$accept == "object") 
                value
            else value_ser
            self$driver$set_object(hash, value_send)
        }
        if (use_cache) {
            assign(hash, value, self$envir)
        }
    }
    invisible(hash)
}
<environment: 0x22d8040>

The self$driver$exists_object call here prevents the object from being written a second time: if the driver already holds that hash, set_object is skipped. This was one of the main things I wanted in storr in the first place!
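
One way to see this from the outside is the minimal sketch below. It assumes the storr package is installed, and it uses the in-memory storr_environment() driver rather than an RDS cache only so that no files are left behind. The names big, h1, h2 and h3 are made up for illustration.

```r
library(storr)

# In-memory store (an assumption for this sketch; the issue uses storr_rds)
st <- storr_environment()

big <- runif(1e5)          # stand-in for really_large_data

h1 <- st$set("a", big)     # first write: object stored under its hash
h2 <- st$set("a", big)     # same key, same value: hash already present
h3 <- st$set("b", big)     # new key, same value: only the key mapping is new

# set() returns the content hash invisibly, so all three should agree
print(identical(h1, h2) && identical(h1, h3))

# Two keys, but the store should hold a single object
print(st$list())
print(length(st$list_hashes()))
```

Because objects are keyed by the hash of their serialized content, the second and third set() calls only update the key-to-hash mapping. Note that the value is still serialized and hashed on every call, so the saving is in the write, not in the hashing.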

Are you seeing some case where this is not happening?

wlandau commented 6 years ago

Hmm... I must have missed that before, sorry. It's clear to me now.