ezbz / gitlabber

Gitlabber - clones or pulls entire groups tree from gitlab
MIT License

Tree is empty with latest version #128

Closed odin568 closed 1 week ago

odin568 commented 1 week ago

Describe the bug

I use gitlabber a lot and it worked fine with older versions (<1.2.0). Thanks in general for updating and keeping this repo alive, but there seems to be an issue. Perhaps it is related to the warning? The same command worked fine with the older version; I have it in a batch script :)

To Reproduce

  1. Include the full command line with all arguments and the output in verbose (-v) mode
gitlabber -t 'XXXXXXXXXXXXXXXX' -u 'https://gitlab.XXXXXX.com' -m http -n name -i '/Platform/Out/Area1**,/Platform/Data/Area2**' -a exclude .
2024-07-01 09:12:14,802 - gitlabber.cli - DEBUG - verbose=[True], print=[False], log level set to [10] level
2024-07-01 09:12:14,924 - gitlabber.cli - DEBUG - Reading projects tree from gitlab at [https://gitlab.XXXXXX.com]
2024-07-01 09:12:14,924 - gitlabber.gitlab_tree - DEBUG - Loading projects tree gitlab server [https://gitlab.XXXXXX.com]
2024-07-01 09:12:14,926 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): gitlab.XXXXXX.com:443
2024-07-01 09:12:15,555 - urllib3.connectionpool - DEBUG - https://gitlab.XXXXXX.com:443 "GET /api/v4/groups?as_list=False&archived=False HTTP/11" 200 23271
C:\Users\XXXXX\AppData\Local\Programs\Python\Python312\Scripts\gitlabber.exe\__main__.py:7: UserWarning: Calling a `list()` method without specifying `get_all=True` or `iterator=True` will return a maximum of 20 items. Your query returned 20 of 212 items. See https://python-gitlab.readthedocs.io/en/v4.7.0/api-usage.html#pagination for more details. If this was done intentionally, then this warning can be supressed by adding the argument `get_all=False` to the `list()` call. (python-gitlab: C:\Users\XXXXX\AppData\Local\Programs\Python\Python312\Scripts\gitlabber.exe\__main__.py:7)
  sys.exit(main())
2024-07-01 09:12:15,683 - gitlabber.gitlab_tree - DEBUG - Loading projects tree from gitlab took [00:00:00.02]
2024-07-01 09:12:15,684 - gitlabber.gitlab_tree - DEBUG - Fetched root node with [1] projects
2024-07-01 09:12:15,684 - gitlabber.cli - CRITICAL - The tree is empty, check your include/exclude patterns or run with more verbosity for debugging

ezbz commented 1 week ago

> I use gitlabber a lot and it worked fine with the older version (<1.2.0).

Hi @odin568, can you try version 1.2.1 from PyPI (https://pypi.org/p/gitlabber/) and tell me if the problem persists? That release updates urllib3 to version 2.2.2.

I checked your command with two includes and it worked:

gitlabber -u "http://gitlab.com" -i '/Group Test/Subgroup Test/gitlab-project-submodule**,/Group Test/Subgroup Test/gitlabber-sample-submodule**' -a exclude --verbose .

If the problem persists, can you check whether --print actually lists the repositories you expect to see?

odin568 commented 1 week ago

Yes, I updated to 1.2.1.
Strangely, I can no longer get it working with the old version either, although it worked for a year. Only the warning about "get_all=True" I am not sure appeared before; this is a change in the python-gitlab library. --print does not really give more output either; the list is also empty.

The include filters worked fine for a year, so I don't expect anything is wrong with them. The token is also valid; otherwise I would get an error...

ezbz commented 1 week ago

@odin568

> Only the warning about "get_all=True" I am not sure appeared before; this is a change in the python-gitlab library.

Yes, this is a result of the python-gitlab upgrade. It means that all projects will be fetched with no pagination, and it is therefore unrelated.

This line:

2024-07-01 09:12:15,684 - gitlabber.gitlab_tree - DEBUG - Fetched root node with [1] projects

indicates that you have only one project in your GitLab (before filtering). If that is the case, why do you need includes? How many projects are you expecting to see under these paths?

I think the problem lies elsewhere. Please also note that you are filtering archived projects with the -a flag.

odin568 commented 1 week ago

Are you sure? It says: Calling a `list()` method without specifying `get_all=True` or `iterator=True` will return a maximum of 20 items. Your query returned 20 of 212 items. It looks to me as if gitlabber now thinks it is working on the complete list but is actually only working on the first 20. Could it be that my root node is missing as a result? In summary, there are hundreds of repositories in our GitLab instance :)
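
To make that reading of the warning concrete, here is a minimal, self-contained sketch. It does not use gitlabber or python-gitlab: `list_groups`, `GROUPS`, and the "/Other/..." paths are made up for illustration, and `fnmatch` is only a stand-in for whatever pattern matcher gitlabber actually uses. It shows how a listing cut off at the first page of 20 can leave nothing for the include patterns to match, even when the patterns themselves are fine.

```python
from fnmatch import fnmatchcase

PER_PAGE = 20  # page size named in the warning

# Hypothetical instance with 212 groups; "/Platform/Out/Area1" is placed late
# in the listing so that it falls outside the first page.
GROUPS = [f"/Other/Group{i:03d}" for i in range(211)] + ["/Platform/Out/Area1"]


def list_groups(get_all=False):
    """Stand-in for a paginated list call: first page only unless get_all is set."""
    return list(GROUPS) if get_all else GROUPS[:PER_PAGE]


def apply_includes(paths, patterns):
    """Keep the paths that match at least one include pattern."""
    return [p for p in paths if any(fnmatchcase(p, pat) for pat in patterns)]


includes = ["/Platform/Out/Area1**", "/Platform/Data/Area2**"]

truncated = apply_includes(list_groups(), includes)
complete = apply_includes(list_groups(get_all=True), includes)

print(len(list_groups()), "of", len(GROUPS), "groups seen without get_all")
print("matches on first page only:", truncated)
print("matches on full listing:", complete)
```

Under these assumptions the first-page run matches nothing, which would look just like an "empty tree" caused by bad include patterns.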

ezbz commented 1 week ago

> Are you sure? It says:

@odin568 you are more than welcome to look at the code: https://github.com/ezbz/gitlabber/blob/490056e626f5184813038e9eefe24e28acbca6db/gitlabber/gitlab_tree.py#L118

Projects are fetched with the get_all=True parameter.

In any case, nothing is returned in your situation, so I would check the requests with curl and see what they return.
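
The same check can also be scripted in Python instead of curl. The sketch below assumes the instance reports offset pagination through the `X-Next-Page` response header and accepts a `PRIVATE-TOKEN` request header, as the GitLab REST API documents. The function names, `fake_fetch`, and the example URL are placeholders, not part of gitlabber. The `fetch` argument is injectable so the paging logic can be exercised without a server.

```python
import json
import urllib.request
from urllib.parse import parse_qs, urlsplit


def http_fetch(url, token):
    """Perform one GET; return (parsed JSON body, headers with lowercased names)."""
    req = urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(req) as resp:
        headers = {k.lower(): v for k, v in resp.headers.items()}
        return json.load(resp), headers


def fetch_all_groups(base_url, token, fetch=http_fetch, per_page=20):
    """Follow x-next-page until it is empty and return every group."""
    groups, page = [], "1"
    while page:
        url = f"{base_url}/api/v4/groups?per_page={per_page}&page={page}"
        body, headers = fetch(url, token)
        groups.extend(body)
        page = headers.get("x-next-page", "")
    return groups


def fake_fetch(url, token):
    """Offline stand-in that serves 212 made-up groups, one page at a time."""
    query = parse_qs(urlsplit(url).query)
    page, per_page = int(query["page"][0]), int(query["per_page"][0])
    start = (page - 1) * per_page
    items = [{"id": i} for i in range(212)][start:start + per_page]
    next_page = str(page + 1) if start + per_page < 212 else ""
    return items, {"x-total": "212", "x-next-page": next_page}


groups = fetch_all_groups("https://gitlab.example.com", "TOKEN", fetch=fake_fetch)
print(len(groups))  # 212 with the fake server above
```

Against a real server, drop the `fetch=fake_fetch` argument and compare the count with the total reported in the warning.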

ezbz commented 1 week ago

@odin568

I released version 1.2.2. See if get_all=True for groups fixes your issue. I've also added some debug-level statements for filtering. Try again, and if it still doesn't solve your problem, paste the verbose output here.
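
For readers hitting the same warning in their own scripts, the fix described here comes down to how the python-gitlab `list()` call is made. The sketch below is not gitlabber's code: `load_groups` and `FakeGroups` are hypothetical names, and the fake client exists only so the example runs without a server. With the real library, `gl` would be a `gitlab.Gitlab(url, private_token=token)` instance.

```python
from types import SimpleNamespace


def load_groups(gl):
    """Ask the client for every group, not just the first page of 20."""
    return gl.groups.list(get_all=True)


class FakeGroups:
    """Hypothetical stand-in that mimics the 20-item default page."""

    def __init__(self, names):
        self._names = names

    def list(self, get_all=False, **kwargs):
        return list(self._names) if get_all else self._names[:20]


fake_gl = SimpleNamespace(groups=FakeGroups([f"group-{i}" for i in range(212)]))

print(len(fake_gl.groups.list()))  # default call: first page only
print(len(load_groups(fake_gl)))   # get_all=True: the full listing
```

The warning also names `iterator=True` as an alternative, which may be preferable when the full listing is large.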

odin568 commented 1 week ago

Now it works 🎆 Thank you very much!