gotec / git2net

An Open Source Python package for the extraction of fine-grained and time-stamped co-editing networks from git repositories.
https://git2net.readthedocs.io
GNU Affero General Public License v3.0
53 stars, 16 forks

Jupyter notebook example does not work #11

Closed SebastianZug closed 4 years ago

SebastianZug commented 4 years ago

Dear git2net team,

many thanks for your really cool tool! I would like to apply it to analyzing student team interactions. Unfortunately, the Jupyter notebook tutorial does not work for me. When I run the following code with git2net 1.3 on Ubuntu 18.04 (Python 3.6.9):

import git
import os
import shutil

git_repo_url = 'https://github.com/gotec/git2net.git'
local_directory = '.'
git_repo_dir = 'git2net4analysis'

if os.path.exists(git_repo_dir):
    shutil.rmtree(git_repo_dir)

git.Git(local_directory).clone(git_repo_url, git_repo_dir)

import git2net

sqlite_db_file = 'git2net.db'

# Remove database if exists
if os.path.exists(sqlite_db_file):
    os.remove(sqlite_db_file)

max_modifications = 1

git2net.mine_git_repo(git_repo_dir, sqlite_db_file, max_modifications=max_modifications)

Cloning works fine; I found a copy in the specified folder. For the second code block, however, I only receive the following output:

Found no database on provided path. Starting from scratch.
Parallel (8 processes):   0%|          | 0/176 [00:00<?, ?it/s]

and then the execution freezes.

Is the tutorial up to date?

gotec commented 4 years ago

Hi Sebastian,

I am sorry to hear the tutorial is not working for you as expected.

I have just double-checked the tutorial and everything works fine for me (I am also on Ubuntu 18.04). Can you check whether git2net runs with the following options:

git2net.mine_git_repo(git_repo_dir, sqlite_db_file, max_modifications=max_modifications, no_of_processes=1)

This option switches the mining to serial execution, in case parallel execution doesn't work for you for some reason. I get exactly the same output as you, except that the execution then proceeds normally. The second cell takes 42s to execute on 32 cores, so it might take a couple of minutes in your case.

Can you also let me know which version of PyDriller is listed when you run pip list? I have the latest version (1.15.2) installed, which works well, but I have had issues with some older versions in the past (which should have been fixed, though).

Cheers, Christoph
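
The PyDriller version check suggested above can also be done from inside the notebook. The following is only a minimal sketch, not part of git2net: the helper names (version_tuple, installed_version, is_at_least) and the KNOWN_GOOD constant are made up for illustration, and 1.15.2 is simply the version reported as working in this thread. It assumes Python 3.8 or newer, since importlib.metadata is not in the standard library of Python 3.6.

```python
import re
from importlib import metadata  # standard library from Python 3.8 onwards


def version_tuple(text):
    """Convert a dotted version string such as '1.15.2' into (1, 15, 2).

    Takes the leading digits of each dot-separated component and stops
    at the first component that has no leading digits.
    """
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_at_least(installed, reference):
    """Compare numerically, so that '1.10' counts as newer than '1.9'."""
    return version_tuple(installed) >= version_tuple(reference)


# Assumption: 1.15.2 is used only because it is the version reported
# as working in this thread, not because it is an official minimum.
KNOWN_GOOD = "1.15.2"

found = installed_version("PyDriller")
if found is None:
    print("PyDriller is not installed")
elif is_at_least(found, KNOWN_GOOD):
    print(f"PyDriller {found} is at least {KNOWN_GOOD}")
else:
    print(f"PyDriller {found} is older than {KNOWN_GOOD}; consider upgrading")
```

Comparing integer tuples rather than raw strings matters here because a plain string comparison would rank "1.9" above "1.10".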

SebastianZug commented 4 years ago

Hi Christoph,

Many thanks for the fast response! Updating PyDriller from 1.10 to 1.15.2 solved my problem.

Greetings

Sebastian

gotec commented 4 years ago

Hi Sebastian,

Great to hear! I'll have a look at what the issue with 1.10 is and update the requirements. Glad we were able to resolve this so quickly.

Enjoy using the tool and feel free to contact me at any time if you have other issues or suggestions!

Best, Christoph

gotec commented 4 years ago

I have updated the requirements for PyDriller in the latest version (1.3.1) of git2net. This should prevent your issue from occurring in the future. With this I will close this issue.

Again thanks a lot for letting me know about this.

Cheers, Christoph