Open jdolitsky opened 4 years ago
I'm interested in this ticket. It's something I could do out of band, but the missing piece is I don't think there's anything in chartmuseum that tracks last used for charts. Does this data currently exist? If we were to add it, where would it be stored?
@cep21 Glad you're interested. Maybe this is what you're looking for? If you can help us with this feature, it would mean a lot :)
@scbizu I think "last read" makes more sense than "last modified", right? You would want to remove charts people are no longer downloading, for example.
@cep21 You're right, but the storage itself does not support a last read timestamp yet. It would be a huge PR to add a new mechanism to the storage structure: we would have to handle every request from helm pull and update the last read timestamp.
I think this is why this issue is still tagged help wanted XD
One idea is to store this information inside redis, for example. Another is to use the backend storage itself to store this information and "sync" some kind of ledger every 60 seconds (for example).
I prefer the second one. AutoPurge should be provided as an interface function, so its implementation can differ between the actual storage backends. The expiration duration should be configurable when users enable the auto purge feature.
> and "sync" some kind of ledger every 60 seconds
This could be tricky if people are running multiple chartmuseum instances for redundancy, since we'll have to merge ledgers
It'd be great if we could have more than one condition to decide whether to delete charts. E.g., instead of "Delete all the charts older than 2 months", it would be better to have "Delete all the charts older than 2 months matching a particular regex". You might not want to delete release charts (e.g., 2.1.0), but if you want to delete pre-release charts (e.g., 2.1.0-custom-fix or 2.1.0-pr-3245), you could use a regex to match all the pre-release charts older than the specified time period (check #383).
However, one thing that concerns me about regex is you'd want to test it out first to see which charts would be deleted with that particular regex to avoid deleting charts you didn't intend to delete.
The thread has become too long to follow, so let me summarize the key points so far:
- last read (or fall back to last modified?)
- dry-run: log the charts that would be purged before purging them.
Above looks right. Bullet (3) is probably an enhancement on top of the core request, bullet (1). Also, the difference between last read and last modified is pretty huge in both use and ease of implementation.
> Optional: should provide a flag to determine whether users need to soft delete the chart. (soft delete here means not actually purging the chart, only logging the charts that would be purged)
Nitpick: I think it'd be better to call it dry-run instead. soft delete makes me think that the chart would be archived or removed from the index while the data would still be there, but what we want to show with soft delete (as far as I've understood) is what's going to be deleted if you run a delete operation.
Everything else looks good to me @scbizu
I will assign this to myself and draft an implementation this weekend. I think it will do a lot to reduce the pressure on the index, so that we can both lower the latency of our APIs and save disk space in chart storage.
It will probably be provided as --per-chart-max-version, which keeps the latest N versions of each chart according to your configuration. However, since I do not know which charts are currently in use, we can add more features later (such as pinning some charts so that they are not removed from storage).
(The dry-run option is already implemented inside our company; maybe I can open source it later.)
(off-topic: Our CI failed again because the index refresh was too large XD)
Where was this left off? I'm willing to try picking up the remaining work. We have 2.5k charts, and would like to purge as many as possible
@jasondamour This is already implemented. You can use the version at our HEAD and start chartmuseum with the -per-chart-max-version option.
Add feature flags that enable auto-removal of old chart versions in storage based on various age / last used / version parameters