Some repositories fail with a memory error and cannot be parsed. The traceback shows pydriller cloning the remote repository into a temporary directory, and I think this local clone is what triggers the memory error.
Example:
[WARNING] Executing query_readme_history with arguments (github_user_cleaned_url cylammarco/ASPIRED-example
readme_path README.md
Name: 97, dtype: object, 'github_user_cleaned_url', <github.MainClass.Github object at 0x7f88e31295d0>) failed:
Traceback (most recent call last):
File "/home/eidf048/eidf048/kmoraw/rse-repo-analysis/github/utils.py", line 15, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/eidf048/eidf048/kmoraw/rse-repo-analysis/github/crawl_contents.py", line 29, in query_readme_history
for commit in repo_readme.traverse_commits():
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/pydriller/repository.py", line 213, in traverse_commits
with self._prep_repo(path_repo=path_repo) as git:
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/contextlib.py", line 137, in __enter__
return next(self.gen)
^^^^^^^^^^^^^^
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/pydriller/repository.py", line 177, in _prep_repo
local_path_repo = self._clone_remote_repo(self._clone_folder(), path_repo)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/pydriller/repository.py", line 158, in _clone_remote_repo
Repo.clone_from(url=repo, to_path=repo_folder)
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/git/repo/base.py", line 1308, in clone_from
return cls._clone(
^^^^^^^^^^^
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/git/repo/base.py", line 1219, in _clone
finalize_process(proc, stderr=stderr)
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/git/util.py", line 419, in finalize_process
proc.wait(**kwargs)
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/git/cmd.py", line 604, in wait
raise GitCommandError(remove_password_if_present(self.args), status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
cmdline: git clone -v -- https://github.com/cylammarco/ASPIRED-example /tmp/tmp3o7vzcbo/ASPIRED-example
stderr: 'Cloning into '/tmp/tmp3o7vzcbo/ASPIRED-example'...
POST git-upload-pack (227 bytes)
fatal: packfile /tmp/tmp3o7vzcbo/ASPIRED-example/.git/objects/pack/pack-daffd17f5b2a5c44a37a64bb20100e9ef7b761cc.pack cannot be mapped: Cannot allocate memory
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
'
Exception in thread Thread-3895 (pump_stream):
Traceback (most recent call last):
File "/home/eidf048/eidf048/kmoraw/tools/miniconda3/envs/sw_mentions/lib/python3.11/site-packages/git/cmd.py", line 141, in pump_stream
handler(line)
MemoryError
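A possible workaround, sketched here as an assumption rather than a tested fix: since a `Github` object is already passed to `query_readme_history`, the commits that touch the README could be listed through the GitHub API instead of cloning the repository. The helper name `readme_commits_via_api` is hypothetical; `get_repo` and `get_commits(path=...)` are PyGithub calls.

```python
def readme_commits_via_api(gh, repo_full_name, readme_path):
    """Return (sha, author date) pairs for commits that touch readme_path.

    Hypothetical helper: `gh` is assumed to behave like a PyGithub `Github`
    instance. Nothing is cloned to local disk.
    """
    repo = gh.get_repo(repo_full_name)
    history = []
    # get_commits(path=...) asks the API only for commits touching that path
    for commit in repo.get_commits(path=readme_path):
        history.append((commit.sha, commit.commit.author.date))
    return history
```

This only returns commit metadata. Fetching the README text at each commit would need extra API calls (for example `repo.get_contents(readme_path, ref=sha)`), which count against the API rate limit, so it may not be a drop-in replacement for the pydriller traversal.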