bssw-psip / reposcanner

A compact repository data mining toolkit

CommitInfoMiningRoutine Issue #38

Open frobnitzem opened 1 year ago

frobnitzem commented 1 year ago

The CommitInfoMiningRoutine failed on LLNL/RAJA, potentially because of missing development branches or something similar.

Reproducer:

rm -fr workspace outputs notebook reposcanner.log
mkdir -p workspace outputs notebook
mkdir -p inputs
cat >inputs/config.yml <<.
routines:
  - CommitInfoMiningRoutine
.

cat >inputs/repositories.yml <<.
proj:
  name: A project
  urls: 
  - https://github.com/LLNL/RAJA
.

reposcanner --credentials inputs/credentials.yml \
            --config inputs/config.yml \
            --repositories inputs/repositories.yml \
            --workspaceDirectory workspace \
            --outputDirectory outputs \
            --notebookOutputPath notebook

Output:

'1191365'
0it [00:00, ?it/s]❌ Routine (LLNL/RAJA --> CommitInfoMiningRoutineRequest) failed. Reason: OfflineRepositoryRoutine Encountered an unexpected exception (<class '_pygit2.GitError'>).
reference 'refs/heads/master' not found
1it [00:13, 13.86s/it]
'1191365'
% cat reposcanner.log
DEBUG:asyncio:Using selector: KqueueSelector
DEBUG:asyncio:Using selector: KqueueSelector
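
The error above suggests that the routine looked up `refs/heads/master`, which would not exist in a repository whose default branch has a different name. One way to check this from outside reposcanner is to run `git ls-remote --symref <url> HEAD` and read the symbolic-ref line. The sketch below only parses that output; the function name `default_branch_from_symref` and the sample text are hypothetical and not part of reposcanner.

```python
def default_branch_from_symref(ls_remote_output):
    """Return the branch that HEAD points to, or None if no symref line is found.

    Expects text in the format printed by `git ls-remote --symref <url> HEAD`.
    """
    prefix = "refs/heads/"
    for line in ls_remote_output.splitlines():
        # The symbolic-ref line looks like: "ref: refs/heads/<branch>\tHEAD"
        if line.startswith("ref: ") and line.rstrip().endswith("HEAD"):
            target = line[len("ref: "):].split()[0]
            if target.startswith(prefix):
                return target[len(prefix):]
    return None


# Illustrative sample only: the branch name and hash below are made up.
sample = (
    "ref: refs/heads/develop\tHEAD\n"
    "0123456789abcdef0123456789abcdef01234567\tHEAD\n"
)
print(default_branch_from_symref(sample))
```

If the function returns something other than `master` for a repository, a hard-coded `refs/heads/master` lookup would be expected to fail in the way shown in the output.
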

rmmilewi commented 10 months ago

I have a fix for this in the parallelOfflineCommitMining branch. The problem was that using remote callbacks while cloning the repo with pygit2 caused pygit2 to become confused about the proper name of the main branch. I'll close this issue once that gets merged in.
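
As a rough illustration of the general idea (not the actual fix in the parallelOfflineCommitMining branch, which is not shown in this thread), a routine can avoid assuming a fixed branch name by choosing from the branches that actually exist after cloning. The helper `resolve_main_branch` and its fallback list are assumptions made for this sketch. With pygit2, the inputs would presumably come from something like `list(repo.branches.local)` and `repo.head.shorthand`.

```python
def resolve_main_branch(local_branches, head_shorthand=None,
                        fallbacks=("main", "master", "develop")):
    """Pick a branch to mine without assuming it is called 'master'.

    local_branches: names of branches present in the clone.
    head_shorthand: the branch HEAD reports, if known (may be wrong or None).
    fallbacks: hypothetical preference order used when HEAD is not usable.
    """
    available = set(local_branches)
    # Trust HEAD only when the branch it names really exists locally.
    if head_shorthand and head_shorthand in available:
        return head_shorthand
    # Otherwise fall back to common default names, in order.
    for name in fallbacks:
        if name in available:
            return name
    # Last resort: any existing branch, chosen deterministically.
    if available:
        return sorted(available)[0]
    raise LookupError("repository has no local branches")


# HEAD claims 'master', but only 'develop' exists in this made-up clone.
print(resolve_main_branch(["develop"], head_shorthand="master"))
```

The key design choice is that a reported branch name is validated against the existing references before use, so a stale or incorrect name leads to a fallback rather than a `GitError`.
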

frobnitzem commented 10 months ago

Were you able to merge my async patch to add parallelism?

On Thu, Dec 21, 2023, 2:07 PM Reed Milewicz @.***> wrote:

I got a fix for this in the parallelOfflineCommitMining branch. There was a problem where using remote callback while cloning the repo with pygit2 would cause pygit2 to become confused about what the proper name is for the branch. I'll close this issue once that gets merged in.
