Open venpopov opened 2 months ago
Thanks for this feedback @venpopov!
Let's use this issue to collect thoughts on this type of change. There isn't a great workaround right now for your particular use case because it is hard for folks to manually check the hash themselves ahead of writing; pins:::pin_hash() is both unexported and uses paths as the arg, which isn't entirely easy for a user to get at.
I was surprised by this behavior:
Created on 2024-04-22 with reprex v2.1.0
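In the third step, I am saving the same object as in the first. Normally, if an object is identical to the most recent version, it is not rewritten. Here the most recent version is the one from the second step, so the object from the first step is written again as a new version.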
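The three steps described above might look roughly like the following. This is an assumed reconstruction for illustration, not the original reprex: the temporary board, the pin name "x", and the values 1:10 and 1:20 are all made up.

```r
library(pins)

# Temporary versioned board (assumption: any versioned board shows the same pattern)
board <- board_temp(versioned = TRUE)

# Step 1: write an object
pin_write(board, 1:10, "x", type = "rds")
Sys.sleep(1)  # version names include a timestamp, so keep the writes apart

# Step 2: write a different object under the same name
pin_write(board, 1:20, "x", type = "rds")
Sys.sleep(1)

# Step 3: write the object from step 1 again
pin_write(board, 1:10, "x", type = "rds")

# List every stored version of the pin
versions <- pin_versions(board, "x")
print(versions)
```

With the behavior reported here, the object from step 1 ends up stored twice, because only the most recent version (step 2) is compared against the new object.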
I would like to use pins to track data objects produced by a research pipeline, in which I might change branches to try out different features. With the current behavior, a new object will be saved every time I rerun a pipeline after switching branches, which is unnecessary file duplication.
To fix this (possibly behind an option setting), pin_write() should check the hash not only against the last version, but against all cached versions. Then, to make sure that pin_read() still works correctly, it would need to update the "created" field (or perhaps a new "reactivated" field?) of the matching version so that it is considered the most recent.
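Until something like this exists in pins, the idea can be approximated on the user side. The sketch below is only a rough illustration under stated assumptions: pin_write_if_new() is a hypothetical helper, not part of pins, and it stores its own content hash (computed with rlang::hash()) as user metadata rather than relying on the internal pins:::pin_hash(). It also does not touch the "created" field; instead it returns the matching version so it can be passed to pin_read().

```r
library(pins)

# Hypothetical helper (not part of pins): write `x` only if no stored version
# of `name` already carries the same content hash in its user metadata.
# Returns (invisibly) the version string that holds the content.
pin_write_if_new <- function(board, x, name, type = NULL) {
  # Our own hash of the in-memory object; NOT the file hash pins computes.
  content_hash <- rlang::hash(x)

  old_versions <- character()
  if (pin_exists(board, name)) {
    old_versions <- pin_versions(board, name)$version
    # Compare against every stored version, not just the latest one
    for (v in old_versions) {
      meta <- pin_meta(board, name, version = v)
      if (identical(meta$user$content_hash, content_hash)) {
        message("Identical content already stored as version ", v)
        return(invisible(v))
      }
    }
  }

  # No match found: write a new version and record the hash alongside it
  pin_write(board, x, name, type = type,
            metadata = list(content_hash = content_hash))
  new_version <- setdiff(pin_versions(board, name)$version, old_versions)
  invisible(new_version)
}

board <- board_temp(versioned = TRUE)
v1 <- pin_write_if_new(board, 1:10, "x", type = "rds")
v2 <- pin_write_if_new(board, 1:20, "x", type = "rds")
v3 <- pin_write_if_new(board, 1:10, "x", type = "rds")  # reuses the version from v1

# Read the content back through the returned version
x <- pin_read(board, "x", version = v3)
```

The trade-off is that pin_read() without a version still returns the latest written version, so callers have to keep track of the returned version themselves. Versions written without this helper carry no content_hash metadata and are never matched.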