Closed njam closed 10 years ago
I like the idea. However the cache should not be on the class object. Rather, the cache should somehow be scoped to the environment (the semiglobal context object).
A-ha, makes sense!
Should there be an environment.fetch_git_hash(repository, uri, ref), or a generic key-value store for the environment like environment.cache_get(key, lambda)?
Other ideas?
I went ahead and introduced an environment.cache(key, getter).
Let me know what you think about this solution (and the naming).
I renamed the function to environment.runtime_cache(key, getter), which might be a bit clearer?
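For readers following along, here is a minimal sketch of what an environment-scoped runtime_cache(key, getter) could look like. The Environment class below is a hypothetical stand-in, not librarian's actual class, and the key string is made up for illustration.

```ruby
# Hypothetical stand-in for the environment object; internals are assumed.
class Environment
  def initialize
    @runtime_cache = {}
  end

  # Returns the cached value for key, calling getter only on the first lookup.
  # key? is checked explicitly so that nil or false results are cached too.
  def runtime_cache(key, getter)
    return @runtime_cache[key] if @runtime_cache.key?(key)
    @runtime_cache[key] = getter.call
  end
end

env = Environment.new
calls = 0
getter = lambda { calls += 1; "abc123" }

first  = env.runtime_cache("git:origin:master", getter)
second = env.runtime_cache("git:origin:master", getter)
# The getter ran once; the second lookup was served from the cache.
```

Because the hash lives on the environment instance rather than on a class object, two environments in the same process would not share entries.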
Something like this? :)
@yfeldblum I merged in upstream, and added remote-fetching if the given sha is not found in the current repo's log. What do you think?
(somehow travis doesn't post the build status, here it is: https://travis-ci.org/njam/librarian/builds/18083075)
Sweet.
Please use the class name as keyspace, and use the keyspace method (new).
def rtc
  @rtc ||= environment.runtime_cache.keyspace(self.class.name)
end
Please use once instead of memo (new) when we don't need the return value of the block.
def fetch(remote)
  rtc.once { fetch_real(remote) } # or inline with do...end
end
(fetch_hash should still use memo because we need the return value of the block.)
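To make the memo/once distinction concrete, here is a rough sketch of a keyspaced runtime cache. The method names keyspace, memo, and once come from the comments above; everything else (the RuntimeCache and KeyspaceProxy classes, and the explicit key argument, which the snippet above omits) is an assumption for illustration.

```ruby
# Hypothetical sketch; only the names keyspace/memo/once come from the review.
class RuntimeCache
  def initialize
    @data = {}
  end

  # Returns a view of the cache whose keys are prefixed with name.
  def keyspace(name)
    KeyspaceProxy.new(self, name)
  end

  # Runs the block on the first call for (keyspace, key), stores the result,
  # and returns the stored value on every later call.
  def memo(keyspace, key)
    full_key = [keyspace, key]
    @data[full_key] = yield unless @data.key?(full_key)
    @data[full_key]
  end

  # Runs the block at most once for (keyspace, key); the result is discarded.
  def once(keyspace, key)
    memo(keyspace, key) { yield; nil }
    nil
  end

  class KeyspaceProxy
    def initialize(cache, name)
      @cache = cache
      @name = name
    end

    def memo(key, &block)
      @cache.memo(@name, key, &block)
    end

    def once(key, &block)
      @cache.once(@name, key, &block)
    end
  end
end

rtc = RuntimeCache.new.keyspace("Git::Source")
fetches = 0
2.times { rtc.once("fetch:origin") { fetches += 1 } } # side effect runs once
sha = rtc.memo("hash:origin/master") { "abc123" }     # value is kept and returned
```

Using the class name as the keyspace keeps keys from different sources from colliding in the shared cache.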
Please avoid increasing the git ops count in all cases.
Please rebase & squash.
Thanks for the feedback, I pushed a new version!
I introduced a "keyspaced" method runtime_cache, or did you actually want it to be called rtc?
Regarding the change in git ops:
I think in these two cases (sha is provided and already checked out) I can't keep the count at 3 ops; it goes up to 4.
But: these are only local operations, and in exchange the far slower fetch will be executed less often. So overall the speed improvements should be significant.
Previously, when a given sha was not checked out, it would just fetch from the remote. Now in that scenario I first want to check whether that sha is already in the repo's history, to know whether it makes sense to fetch or not.
Do you agree, or am I missing something?
Before:
/usr/local/bin/git reset --hard --quiet
/usr/local/bin/git clean -x -d --force --force
/usr/local/bin/git rev-parse HEAD --quiet
After:
/usr/local/bin/git reset --hard --quiet
/usr/local/bin/git clean -x -d --force --force
/usr/local/bin/git log -1 --no-color --format=tformat:%H ebd4a74e26c18b65cd966137dc3296c4091cbba5
/usr/local/bin/git rev-parse HEAD --quiet
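The "check local history first, fetch only if needed" idea can be sketched roughly as follows. ShaResolver and the injected run_git callable are hypothetical names invented for this illustration; the git log arguments are the ones shown in the "After" list above. A real runner would shell out to git, while the fake one here lets the sketch run without a repository.

```ruby
# Hypothetical helper. run_git is any callable that takes an argument array
# and returns [stdout, success]; it is injected so no real repo is needed.
class ShaResolver
  def initialize(run_git)
    @run_git = run_git
  end

  # True if the sha is already present in the local repository.
  def has_commit?(sha)
    out, ok = @run_git.call(["log", "-1", "--no-color", "--format=tformat:%H", sha])
    ok && out.strip == sha
  end

  # Fetches from the remote only when the sha is not available locally.
  def ensure_commit(sha, remote)
    return :local if has_commit?(sha)
    @run_git.call(["fetch", remote, "--quiet"])
    :fetched
  end
end

# Fake runner that knows a single commit and records which commands ran.
known = ["ebd4a74e26c18b65cd966137dc3296c4091cbba5"]
commands = []
fake_git = lambda do |args|
  commands << args.first
  if args.first == "log"
    sha = args.last
    known.include?(sha) ? [sha + "\n", true] : ["", false]
  else
    ["", true]
  end
end

resolver = ShaResolver.new(fake_git)
resolver.ensure_commit(known.first, "origin") # local hit, no fetch issued
resolver.ensure_commit("0" * 40, "origin")    # unknown sha, falls back to fetch
```

The extra log call is a cheap local operation, which is the trade-off described above: one more op in the common case, in return for skipping the network round trip whenever the sha is already present.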
... looks like public_send was introduced only in Ruby 1.9.1. I now included this "backports" gem, as suggested by this stackoverflow answer, for Ruby 1.8.7 compatibility.
What do you think?
(again, travis: https://travis-ci.org/njam/librarian/builds/18103140)
Fixed.
Obviously I should have had a test case for that so Travis would catch it early!
reset & clean can be done just once per process to reduce ops count.
Thank you for the great work and all the tests, much appreciated!
I noticed that librarian fetches the origin multiple times for the same repo, because both the git-"source" and the git-"repository" are new objects for every dependency. With the attached code I introduced a git-"source"-wide cache that records which "uri+remote+ref" combinations have been fetched and what the current "sha" is.
This speeds up installs and updates substantially if multiple dependencies come from the same git repo.
Please review.