monarch-initiative / mondo-ingest

Coordinating the mondo-ingest with external sources
https://monarch-initiative.github.io/mondo-ingest/

`fatal: detected dubious ownership in repository` #633

Closed joeflack4 closed 1 month ago

joeflack4 commented 1 month ago

Overview

This is a new error that is occurring sometimes during the "tmp/mondo/ artefacts pipeline", i.e. when we are refreshing mondo.owl, mondo-edit.owl, and mondo.sssom.tsv.

Background

It first occurred, and may only be occurring in, the sync-synonyms branch.

This error doesn't happen every time.

Even though this happened after we added mondo-edit.owl to the pipeline, it does not seem to be caused by that addition, because it does not happen every time.

The error

fatal: detected dubious ownership in repository at '/work/src/ontology/tmp/mondo'

Instance 1: Failure during quick-check.pl

../utils/quick-check.pl mondo-edit.obo && robot --catalog catalog-v001.xml convert -i mondo-edit.obo -o mondo-edit.owl
make[2]: Leaving directory '/work/src/ontology/tmp/mondo/src/ontology'
fatal: detected dubious ownership in repository at '/work/src/ontology/tmp/mondo'
To add an exception for this directory, call:

    git config --global --add safe.directory /work/src/ontology/tmp/mondo
make[1]: *** [mondo-ingest.Makefile:373: tmp/mondo_repo_built] Error 128

Instance 2: Failure after not being able to cat a file

cat: tmp/mondo_repo_built: No such file or directory
fatal: detected dubious ownership in repository at '/work/src/ontology/tmp/mondo'
To add an exception for this directory, call:

    git config --global --add safe.directory /work/src/ontology/tmp/mondo
fatal: detected dubious ownership in repository at '/work/src/ontology/tmp/mondo'
To add an exception for this directory, call:

    git config --global --add safe.directory /work/src/ontology/tmp/mondo

I am pretty sure that the error occurs not as a result of the failed cat, but on one of the following 2 git lines as you can see in this snippet (I think it is the rev-parse line).

        current_hash=$$(cat tmp/mondo_repo_built); \
        cd tmp/mondo; \
        git fetch origin; \
        latest_hash=$$(git rev-parse origin/master); \

Logs

log1.txt

Log 1 shows 1 instance of "error instance 1", followed by several of "error instance 2". What's going on here is that I'm running the synonym sync for all sources. At the start of every run of the script, it checks to make sure that tmp/mondo/ is up-to-date. The first time it does that, it fails on "error instance 1". Because of that, it is not able to create a new tmp/mondo_repo_built file (which contains the latest commit hash from main), so that file is now missing. So when the script runs subsequent times for other sources, it errors on "instance 2".

I am actually confused about why instance 2 seems to happen 100% of the time after a failed cat, but not otherwise. Even though the refresh of tmp/mondo/ fails after this, it does not cause the make run to exit. Since the file isn't there, current_hash is set to an empty string. Then, because of the "dubious ownership" error, it appears that latest_hash cannot be set either, so both are empty strings. When the hashes are compared, they are equal, so it determines that "no refresh is needed", and make just continues.
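One way to keep the empty-hash case from passing silently, as a rough sketch only. The function name `needs_refresh` and its return codes are made up for illustration; this is not what mondo-ingest.Makefile currently does.

```shell
# Hypothetical helper: decide whether tmp/mondo/ needs a refresh.
# Return codes (illustrative): 0 = refresh needed, 1 = up to date,
# 2 = the latest hash could not be determined (e.g. git failed).
needs_refresh() {
    # $1 = current hash (contents of tmp/mondo_repo_built, may be empty)
    # $2 = latest hash (output of git rev-parse, may be empty if git failed)
    if [ -z "$2" ]; then
        echo "error: could not determine latest hash" >&2
        return 2
    fi
    if [ -z "$1" ]; then
        # No record of a previous build, so a refresh is needed.
        return 0
    fi
    if [ "$1" = "$2" ]; then
        return 1
    fi
    return 0
}
```

With a check like this, two empty strings would be reported as an error instead of being treated as "hashes match".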

log2.txt

Log 2 shows 1 instance of "error instance 2". What's happening here is that I was doing a test. I ran the refresh of tmp/mondo/, but then I deleted tmp/mondo_repo_built manually and ran it again to see if I could replicate that error instance. I was able to replicate it.

Additional info

I haven't figured out why this error started happening.

It appears that errors like instance 2 (erroring during the tmp/mondo/ refresh, for whatever reason) cause errors like instance 1 (errors during the steps before it runs tmp/mondo/).

At first I tried a few things that failed. But I think I may have found a solution:

matentzn commented 1 month ago

Before we go to a solution, it would be good if you could share the outcome of your research on Google etc that could explain why this happens sometimes and sometimes not?

joeflack4 commented 1 month ago

@matentzn Indeed. Sorry it was late and I didn't get to touch on everything I wanted to.

What the 'net says

I did look into this, but the answer I got was not helpful. The only thing that keeps coming up is basically:

The Git error "fatal: detected dubious ownership in repository" usually indicates an issue with file or directory permissions or ownership within a Git repository. This can happen when Git detects inconsistencies in ownership information between files or directories. It can also occur when a user attempts to run a Git command in a repository owned by a different user.

This is basically what I interpreted the error to mean, though. However, it doesn't appear to be helpful in our situation. We have this tmp/mondo/ nested within mondo-ingest, which appears to me to have something to do with it. However, if that's the case, then why did this never happen before?
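For what it's worth, the check behind this message (added in Git 2.35.2) compares the owner of the repository directory with the user running git. Below is a minimal sketch for checking that by hand. `owner_uid` and `owned_by_me` are hypothetical helper names, and whether a uid mismatch inside the container is what actually happens here is an assumption, not something I've confirmed.

```shell
# Print the numeric uid that owns a path.
# Tries GNU stat first, then falls back to BSD/macOS stat.
owner_uid() {
    stat -c '%u' "$1" 2>/dev/null || stat -f '%u' "$1"
}

# Succeeds only when the path is owned by the user running this shell.
owned_by_me() {
    [ "$(owner_uid "$1")" = "$(id -u)" ]
}

# Example use (assumes it is run from src/ontology inside the container):
#   owned_by_me tmp/mondo || echo "owner uid: $(owner_uid tmp/mondo), my uid: $(id -u)"
```

If the two uids differ, that would at least be consistent with the error, though it would not explain why it only happens some of the time.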

Possible cause

As I mentioned in the OP, this only started happening after adding mondo-edit.owl. Perhaps there is something about how mondo creates mondo-edit.owl that is significantly different from how it creates mondo.owl. Perhaps it runs things in different sub-shells, which might be what is tripping this up.

matentzn commented 1 month ago

I ran:

sh run.sh make sync-synonyms -B

on the branch sync-without-fix and get something entirely different:

Makefile:746: warning: ignoring old recipe for target 'help'
git config --global --add safe.directory /work/src/ontology/tmp/mondo
if [ false = true ]; then wget http://purl.obolibrary.org/obo/mondo.owl -O tmp/mondo_repo_built; else cd tmp &&\
        rm -rf ./mondo/ &&\
        cd mondo/src/ontology &&\
        make mondo.owl mappings mondo-edit.owl -B MIR=false IMP=false MIR=false &&\
        latest_hash=$(git rev-parse origin/master) &&\
        echo "$latest_hash" > tmp/mondo_repo_built &&\
        cp tmp/mondo_repo_built mappings/mondo.sssom.tsv mondo.owl mondo-edit.owl ../../../; fi
/bin/sh: 3: cd: can't cd to mondo/src/ontology
make[2]: *** [mondo-ingest.Makefile:377: tmp/mondo_repo_built] Error 2
make[2]: Leaving directory '/work/src/ontology'
make[1]: *** [mondo-ingest.Makefile:389: refresh-mondo-clone] Error 2
make[1]: Leaving directory '/work/src/ontology'
make: *** [mondo-ingest.Makefile:614: tmp/mondo-excluded-synonyms.tsv] Error 2
Command exited with non-zero status 2
### DEBUG STATS ###
Elapsed time: 1:20.59
Peak memory: 7832804 kb

joeflack4 commented 1 month ago

@matentzn Oh man! I'm sorry for wasting your time. Here's what happened.

After our meeting, I knew it would be a super quick change to push this branch for you and I was eager to get you started off quickly.

To do that, I needed to remove 2 lines of: git config --global --add safe.directory /work/src/ontology/tmp/mondo

However, I only removed 1 of those lines, and for the 2nd, I accidentally removed this line instead, which looks similar if one is just glancing: git clone --depth 1 https://github.com/monarch-initiative/mondo &&\

That's why you got the error; because it didn't clone, and there was nothing to cd into.

The reason I didn't mention this on Slack is because I noticed and corrected the issue quickly, and pushed an amended commit, so I did not think this would happen to you. I'm so sorry.

If you check out the current sync-without-fix, it will have what you need.

@matentzn However as an alternative, if you want, you can also look at a copy of the logs from a couple of my failures, along with a short explanation of what the log shows, which I'm uploading to the OP now.

matentzn commented 1 month ago

I pulled the changes and tried it again:

Makefile:746: warning: ignoring old recipe for target 'help'
if [ false = true ]; then wget http://purl.obolibrary.org/obo/mondo.owl -O tmp/mondo_repo_built; else cd tmp &&\
        rm -rf ./mondo/ &&\
        cd mondo/src/ontology &&\
        make mondo.owl mappings mondo-edit.owl -B MIR=false IMP=false MIR=false &&\
        latest_hash=$(git rev-parse origin/master) &&\
        echo "$latest_hash" > tmp/mondo_repo_built &&\
        cp tmp/mondo_repo_built mappings/mondo.sssom.tsv mondo.owl mondo-edit.owl ../../../; fi
/bin/sh: 3: cd: can't cd to mondo/src/ontology
make[2]: *** [mondo-ingest.Makefile:376: tmp/mondo_repo_built] Error 2
make[2]: Leaving directory '/work/src/ontology'
make[1]: *** [mondo-ingest.Makefile:387: refresh-mondo-clone] Error 2
make[1]: Leaving directory '/work/src/ontology'
make: *** [mondo-ingest.Makefile:605: tmp/mondo-synonyms-scope-type-xref.tsv] Error 2
Command exited with non-zero status 2
### DEBUG STATS ###
Elapsed time: 0:43.31
Peak memory: 3434112 kb

Not sure what this means..

joeflack4 commented 1 month ago

@matentzn The pull shouldn't have worked because it is an amended commit. If you want to run this, you'll need to scrap your branch and git fetch; git checkout origin/sync-without-fix.

Again, sorry about this. I made this amendment like 5-10 minutes after I initially pushed it, I think. But it looks like you checked it out within that window.

matentzn commented 1 month ago

an amended commit

Please never ever amend commits that are already pushed to a shared repo.. Locally it is of course fine, you can do what you want, but when you push and other people use the branch it leads to a lot of confusion..

matentzn commented 1 month ago

In any case, I don't know what I am doing wrong.

sh run.sh make sync-synonyms -B

I keep getting:

Makefile:746: warning: ignoring old recipe for target 'help'
if [ false = true ]; then wget http://purl.obolibrary.org/obo/mondo.owl -O tmp/mondo_repo_built; else cd tmp &&\
        rm -rf ./mondo/ &&\
        cd mondo/src/ontology &&\
        make mondo.owl mappings mondo-edit.owl -B MIR=false IMP=false MIR=false &&\
        latest_hash=$(git rev-parse origin/master) &&\
        echo "$latest_hash" > tmp/mondo_repo_built &&\
        cp tmp/mondo_repo_built mappings/mondo.sssom.tsv mondo.owl mondo-edit.owl ../../../; fi
/bin/sh: 3: cd: can't cd to mondo/src/ontology
make[2]: *** [mondo-ingest.Makefile:376: tmp/mondo_repo_built] Error 2
make[2]: Leaving directory '/work/src/ontology'
make[1]: *** [mondo-ingest.Makefile:387: refresh-mondo-clone] Error 2
make[1]: Leaving directory '/work/src/ontology'
make: *** [mondo-ingest.Makefile:605: tmp/mondo-synonyms-scope-type-xref.tsv] Error 2
Command exited with non-zero status 2
### DEBUG STATS ###
Elapsed time: 0:42.32
Peak memory: 2698028 kb

Maybe something around tmp/mondo does not work. When running

 sh run.sh make refresh-mondo-clone -B

I get the error above.

Check this:

rm -rf ./mondo/ &&\
cd mondo/src/ontology &&\

How do these two lines make any sense in that branch?

joeflack4 commented 1 month ago

@matentzn It looks like you did some kind of merge commit that for some reason deleted this line again: git clone --depth 1 https://github.com/monarch-initiative/mondo &&\

I added it back in a new commit: https://github.com/monarch-initiative/mondo-ingest/commit/65e01bff0564d010fd2e4cb25e15c8388aae7adc

You should be able to pull, and maybe just double check to be sure that line is there, and then run again.
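To illustrate why that line matters, here is a rough sketch of the delete, clone, cd sequence. The function name `fresh_clone` and its arguments are made up for this example, and it prints HEAD rather than origin/master so it does not depend on the branch name; the real recipe in mondo-ingest.Makefile does more than this.

```shell
# Hypothetical sketch: remove any old checkout, clone again, then enter it.
# Without the git clone step, the second cd has nothing to enter and fails.
# The body runs in a subshell so the caller's working directory is unchanged.
fresh_clone() (
    repo_url="$1"   # e.g. https://github.com/monarch-initiative/mondo
    workdir="$2"    # e.g. tmp
    mkdir -p "$workdir" &&
    cd "$workdir" &&
    rm -rf ./mondo/ &&
    git clone --quiet --depth 1 "$repo_url" mondo &&
    cd mondo &&
    git rev-parse HEAD   # prints the hash of the freshly cloned commit
)
```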

twhetzel commented 1 month ago

@matentzn any updates?

matentzn commented 1 month ago

Sorry, I can't replicate this problem. I am ok to do the following:

  1. Ignore this issue here entirely and
  2. Finalising #594

Hoping that this also fixes the issue here. But I am out of time now to work on this specific problem that I can't even replicate.

joeflack4 commented 1 month ago

This issue persisted on the sync1-synonyms branch, likely as a result of adding mondo-edit.owl to its pipeline. However, we've implemented a better way to get exclusions that does not require this artefact, and so this problem should now no longer exist. If it comes back, I'll reopen.