cncf / devstats.archive

📈CNCF-created tool for analyzing and graphing developer contributions
https://devstats.cncf.io/
Apache License 2.0
445 stars, 147 forks

[bug] Investigate LFS (Large File Storage) usage within cncf/devstats and cncf/gitdm #397

Closed jeefy closed 1 year ago

jeefy commented 1 year ago

Currently, Devstats consumes quite a lot of GitHub's LFS offering (both in storage and in bandwidth). This doesn't seem right and should be looked into and, if possible, remedied.

image

cc @caniszczyk

lukaszgryglicki commented 1 year ago

This was approved a while ago, and it is only because we store the full history of JSON file changes. I can try to find a way to delete the history for large files and keep only the most recent, up-to-date version. I have no other ideas for how to deal with it.
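A rough illustration of why full history is costly with LFS: every committed version of a tracked file becomes a separate LFS object, while the repository itself holds only a small pointer file recording that object's `oid` and `size`. The Python sketch below parses pointer files in the Git LFS v1 pointer format and compares the storage needed for full history against keeping only the newest version. The helper names (`make_pointer`, `storage_summary`) and the sample sizes are hypothetical, chosen for illustration, and are not taken from the devstats repositories.

```python
from typing import Dict, List


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Parse the text of a Git LFS pointer file into a key -> value mapping."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each pointer line is "<key> <value>", e.g. "size 12345".
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def make_pointer(oid_hex: str, size: int) -> str:
    """Build pointer-file text (hypothetical helper, used only for sample data)."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid_hex}\n"
        f"size {size}\n"
    )


def storage_summary(pointer_history: List[str]) -> Dict[str, int]:
    """Compare bytes stored for every version vs. only the newest one.

    pointer_history holds the pointer text of each committed version of one
    file, oldest first. Objects are keyed by oid, so identical content that
    is committed twice is counted once.
    """
    if not pointer_history:
        return {"full_history": 0, "latest_only": 0}
    sizes_by_oid: Dict[str, int] = {}
    for text in pointer_history:
        fields = parse_lfs_pointer(text)
        sizes_by_oid[fields["oid"]] = int(fields["size"])
    latest = int(parse_lfs_pointer(pointer_history[-1])["size"])
    return {"full_history": sum(sizes_by_oid.values()), "latest_only": latest}


if __name__ == "__main__":
    # Made-up sizes for three versions of one JSON file.
    history = [
        make_pointer("a" * 64, 100),
        make_pointer("b" * 64, 120),
        make_pointer("c" * 64, 150),
    ]
    print(storage_summary(history))
```

Under these assumptions, storage grows with the sum of all versions rather than with the size of the current file, which is why frequently updated large JSON files add up quickly.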

caniszczyk commented 1 year ago

I think keeping the last few versions is probably fine; you wouldn't need more for debugging, right?

On Tue, Mar 28, 2023 at 9:27 AM Łukasz Gryglicki wrote:

This was approved a while ago, and it is only because we store the full history of JSON file changes. I can try to find a way to delete the history for large files and keep only the most recent, up-to-date version. I have no other ideas for how to deal with it.

-- Cheers,

Chris Aniszczyk https://aniszczyk.org

lukaszgryglicki commented 1 year ago

Yeah, but I think we should keep some history, say up to 1 month. I'm not sure how to do this, though: it requires rewriting history each time I commit. I will investigate on Friday or later. Is this OK?
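The "keep up to 1 month" idea can be sketched as a simple retention rule: given the timestamps of all stored versions of a file, keep those inside the window and mark the rest for removal, always retaining the newest version. The Python sketch below covers only that selection step. The function name `split_by_retention`, the tuple layout, and the sample dates are hypothetical, and the actual history rewrite that would drop the old versions is not shown.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# A version is (commit time, label); both fields are illustrative.
Version = Tuple[datetime, str]


def split_by_retention(
    versions: List[Version], now: datetime, keep_days: int = 30
) -> Tuple[List[Version], List[Version]]:
    """Split versions into (keep, drop) using a keep_days retention window.

    Versions committed at or after now - keep_days are kept. If every
    version is older than the window, the newest one is still kept so the
    file never disappears entirely.
    """
    cutoff = now - timedelta(days=keep_days)
    ordered = sorted(versions, key=lambda v: v[0])
    keep = [v for v in ordered if v[0] >= cutoff]
    if not keep and ordered:
        keep = [ordered[-1]]
    drop = [v for v in ordered if v not in keep]
    return keep, drop


if __name__ == "__main__":
    now = datetime(2023, 3, 28)
    versions = [
        (datetime(2023, 1, 10), "v1"),
        (datetime(2023, 3, 1), "v2"),
        (datetime(2023, 3, 27), "v3"),
    ]
    keep, drop = split_by_retention(versions, now)
    print("keep:", [label for _, label in keep])
    print("drop:", [label for _, label in drop])
```

The selection is cheap; the difficulty raised in the comment is that acting on the `drop` list means rewriting history on every commit.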

caniszczyk commented 1 year ago

No rush to investigate - one idea is using another repo for history data?

lukaszgryglicki commented 1 year ago

Yeah, I can make an archive of this (without LFS) and start a fresh one; it will take another 2 or 3 years to fill up to this level. Does that make sense? It will also make cloning a lot faster.

caniszczyk commented 1 year ago

that would be awesome, I think this is a better design tbh

lukaszgryglicki commented 1 year ago

So this will be simple if I have access to create/rename repos in the cncf org. Please make sure I do, and then I can do it on Friday.

jeefy commented 1 year ago

You have full admin access to the CNCF Org 🎉

lukaszgryglicki commented 1 year ago

I will now archive the old ones. The new ones are not using LFS right now (I will try to avoid it), and I will try to update the JSONs less frequently to save space and transfer, but we still need to process affiliations, so updates cannot be avoided altogether.

lukaszgryglicki commented 1 year ago

I'm closing this issue and archiving this repository.