ezbz / gitlabber

Gitlabber - clones or pulls entire groups tree from gitlab
MIT License

Feature request: skip `archived` projects #47

Closed: needleshaped closed this issue 3 years ago

needleshaped commented 3 years ago

Dear @ezbz, thank you for the tool.

Currently Gitlabber processes all projects, regardless of archived status. Can we parameterize it and skip those? I believe it fits the original goal of the project perfectly.

The workaround is to actually move the archived projects to a dedicated group and then ignore that group, e.g. `-x '{.*(archived-project$).*}'`, which is suboptimal.

Please see https://docs.gitlab.com/ee/api/projects.html#list-all-projects:

| Attribute | Type | Required | Description |
|---|---|---|---|
| `archived` | boolean | No | Limit by archived status. |
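
To make the request concrete, here is a minimal sketch of how the `archived` attribute could be appended to the group projects request that shows up in the logs below. It uses only the Python standard library; the helper name `group_projects_url` is hypothetical and is not part of gitlabber or python-gitlab.

```python
from urllib.parse import quote, urlencode


def group_projects_url(base_url, group_id, archived=None):
    """Build a group projects listing URL (hypothetical helper, for illustration).

    archived=None leaves the parameter out, so the server applies no
    archived-status limit; True/False adds it to the query string.
    """
    group = quote(str(group_id), safe="")
    url = f"{base_url.rstrip('/')}/api/v4/groups/{group}/projects"
    if archived is not None:
        # Send the boolean in lowercase form ("true"/"false").
        url += "?" + urlencode({"archived": str(archived).lower()})
    return url


print(group_projects_url("https://gitlab.some.domain", 188, archived=False))
```

With `archived=False` the helper produces `.../api/v4/groups/188/projects?archived=false`; without the argument the URL matches the requests seen in the debug log.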

Logs:


$ mkdir TEMP && cd TEMP
$ time gitlabber --verbose -t sometoken -u https://gitlab.some.domain -i '/Test Group**' -p
root [https://gitlab.some.domain]
└── Test Group [/Test Group]
    ├── test-project-2-archived [/Test Group/test-project-2-archived]
    └── test-project-1 [/Test Group/test-project-1]

$ time gitlabber --verbose -t sometoken -u https://gitlab.some.domain -i '/Test Group**'
2021-02-24 11:04:43,263 - gitlabber.cli - DEBUG - verbose=[True], print=[False], log level set to [10] level
2021-02-24 11:04:43,275 - gitlabber.cli - DEBUG - Reading projects tree from gitlab at [https://gitlab.some.domain]
2021-02-24 11:04:43,275 - gitlabber.gitlab_tree - DEBUG - Loading projects tree gitlab server [https://gitlab.some.domain]
2021-02-24 11:04:43,278 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): gitlab.some.domain:443
2021-02-24 11:04:43,536 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups HTTP/1.1" 200 None
...
2021-02-24 11:04:58,264 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/67/projects HTTP/1.1" 200 None
2021-02-24 11:04:58,409 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/66/projects HTTP/1.1" 200 None
2021-02-24 11:04:58,413 - urllib3.connectionpool - DEBUG - Resetting dropped connection: gitlab.some.domain
2021-02-24 11:04:58,651 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/188/subgroups HTTP/1.1" 200 2
2021-02-24 11:04:58,834 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/188/projects HTTP/1.1" 200 None
2021-02-24 11:04:58,914 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/8/subgroups HTTP/1.1" 200 2
...
2021-02-24 11:05:20,124 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/161/projects HTTP/1.1" 200 None
2021-02-24 11:05:20,204 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/13/subgroups HTTP/1.1" 200 2
2021-02-24 11:05:20,351 - urllib3.connectionpool - DEBUG - https://gitlab.some.domain:443 "GET /api/v4/groups/13/projects HTTP/1.1" 200 None
2021-02-24 11:05:20,353 - gitlabber.gitlab_tree - DEBUG - Loading projects tree from gitlab took [-448378:55:53.28]
2021-02-24 11:05:20,355 - gitlabber.gitlab_tree - DEBUG - Fetched root node with [338] projects
2021-02-24 11:05:20,358 - gitlabber.gitlab_tree - DEBUG - Matched include path [/Test Group**] to node [/Test Group/test-project-2-archived]
2021-02-24 11:05:20,358 - gitlabber.gitlab_tree - DEBUG - Matched include path [/Test Group**] to node [/Test Group/test-project-1]
2021-02-24 11:05:20,379 - gitlabber.gitlab_tree - DEBUG - Going to clone/pull [1] groups and [2] projects
2021-02-24 11:05:20,384 - gitlabber.git - DEBUG - cloning new project None/Test Group/test-project-2-archived
2021-02-24 11:05:20,385 - git.cmd - DEBUG - Popen(['git', 'clone', '-v', 'git@gitlab.some.domain:test-group/test-project-2-archived.git', 'None/Test Group/test-project-2-archived'], cwd=/home/someuser/gitlab/TEMP, universal_newlines=True, shell=None, istream=None)
2021-02-24 11:05:21,858 - git.repo.base - DEBUG - Cmd(['git', 'clone', '-v', 'git@gitlab.some.domain:test-group/test-project-2-archived.git', 'None/Test Group/test-project-2-archived'])'s unused stdout: 
2021-02-24 11:05:21,867 - gitlabber.git - DEBUG - cloning new project None/Test Group/test-project-1
2021-02-24 11:05:21,867 - git.cmd - DEBUG - Popen(['git', 'clone', '-v', 'git@gitlab.some.domain:test-group/test-project-1.git', 'None/Test Group/test-project-1'], cwd=/home/someuser/gitlab/TEMP, universal_newlines=True, shell=None, istream=None)
2021-02-24 11:05:22,853 - git.repo.base - DEBUG - Cmd(['git', 'clone', '-v', 'git@gitlab.some.domain:test-group/test-project-1.git', 'None/Test Group/test-project-1'])'s unused stdout: 
2021-02-24 11:05:22,863 - gitlabber.git - DEBUG - Syncing projects took [None]

$ ls None/Test\ Group/
test-project-1  test-project-2-archived

ezbz commented 3 years ago

It's possible to expose the python-gitlab projects archived flag but I think it simply returns only archived projects, so it may not fit what you're asking for.

There are also other fields that might be worth exposing, such as membership and visibility.

ezbz commented 3 years ago

@needleshaped according to this, it should be doable. I need to set up a test to see whether the archived flag in the python-gitlab library works as expected.
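
For reference, the behavior such a test would presumably check can be modeled offline. The sketch below assumes (not yet verified against a real server) that omitting the flag returns everything, `archived=True` returns only archived projects, and `archived=False` returns only active ones. The function name `expected_listing` and the sample dictionaries are hypothetical; the project names are taken from the logs above.

```python
def expected_listing(projects, archived=None):
    """Model the assumed semantics of the archived filter on a project list."""
    if archived is None:
        # No flag: no limit by archived status.
        return list(projects)
    # Flag given: keep only projects whose archived status matches it.
    return [p for p in projects if bool(p.get("archived", False)) == archived]


# Sample data mirroring the two projects from the logs.
sample = [
    {"path": "/Test Group/test-project-2-archived", "archived": True},
    {"path": "/Test Group/test-project-1", "archived": False},
]

active = expected_listing(sample, archived=False)
print([p["path"] for p in active])
```

A real test would compare what python-gitlab returns for each flag value against this model.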

TheKangaroo commented 3 years ago

I was about to create the exact same issue. An exclude-archived flag would be a great addition. BTW great project @ezbz 😍

needleshaped commented 3 years ago

@ezbz great news! And I was going to propose subtracting lists (all subgroups/projects minus archived subgroups/projects) at the cost of parsing the GitLab tree twice in https://github.com/ezbz/gitlabber/blob/master/gitlabber/gitlab_tree.py... I hope passing archived=False will be possible!
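
A rough sketch of the list-subtraction idea, in case filtering on the server side does not work out. It assumes two listings are already available as lists of project paths (one with everything, one with archived projects only); the function name `subtract_archived` is hypothetical and nothing here reflects how gitlab_tree.py is actually written.

```python
def subtract_archived(all_paths, archived_paths):
    """Return all_paths minus archived_paths, preserving the original order."""
    archived = set(archived_paths)  # set lookup keeps the subtraction linear
    return [path for path in all_paths if path not in archived]


# Paths taken from the logs above.
all_paths = [
    "/Test Group/test-project-2-archived",
    "/Test Group/test-project-1",
]
archived_paths = ["/Test Group/test-project-2-archived"]

print(subtract_archived(all_paths, archived_paths))
```

The downside is the one already mentioned: the tree has to be fetched twice, whereas passing archived=False would need a single pass.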