'versioner.py' leading to build errors on public apps

TheGreatCid commented 4 days ago

Bug Description

I've noticed that the following lines in versioner.py are causing CI documentation builds to fail in public apps

def get_yamlcontents(self, commit):
        """ load yaml file contents at time of supplied commit """
        # Load the yaml file at the given commit; the location changed
        # from module_hash.yaml -> versioner.yaml at changed_commit
        changed_commit = '2bd844dc5d4de47238eab94a3a718e9714592de1'
        if self.git_ancestor(changed_commit, commit) and commit != changed_commit:
            _file = 'versioner.yaml'
        else:
            _file = 'module_hash.yaml'
        yaml_file = os.path.abspath(os.path.join(MOOSE_DIR, "scripts", _file))

I've noted this in RACCOON and Zapdos

This is the error seen,

 Submodule 'large_media' (https://github.com/idaholab/large_media) registered for path 'large_media'
Cloning into '/home/runner/work/zapdos/zapdos/moose/large_media'...
Submodule path 'large_media': checked out '19f9bb9a092270d4a317db0c6d21a7a495af1e27'
MooseDocs.build (MainProcess): Loading configuration files
fatal: Not a valid commit name 2bd844dc5d4de47238eab94a3a718e9714592de1
Traceback (most recent call last):
  File "/home/runner/work/zapdos/zapdos/doc/./moosedocs.py", line 29, in <module>
    sys.exit(main.run())

The issue began with MOOSE commit a166589 when versioner.py was added to load_config.py(as well as other files for setting up python tests)

Steps to Reproduce

Running a build test on a public app causes the error. Here is an example .yml from RACCOON

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Setup environment
        uses: mamba-org/setup-micromamba@v1
        with:
          environment-name: moose
          condarc: |
            channels:
              - conda-forge
              - https://conda.software.inl.gov/public
          init-shell: bash
          create-args: >-
            moose-dev
            lcov

      - name: Compile RACCOON (without coverage)
        if: ${{ github.event_name == 'pull_request' }}
        shell: bash -el {0}
        run: |
          make -j 2

      - name: Build documentation 
        shell: bash -el {0}
        run: |
          mkdir gh-pages
          cd doc
          ./moosedocs.py check
          ./moosedocs.py build --destination ../gh-pages/ # ERROR OCCURS HERE

Impact

Prevents passing documentation tests in public app CI pipelines.

[Optional] Diagnostics

No response

lindsayad commented 4 days ago

FYI @loganharbour @milljm

TheGreatCid commented 3 days ago

Most likely the issue is that the script is looking for the commit in whatever public app its running on when it should be looking for the commit in the submodule(MOOSE)

TheGreatCid commented 1 day ago

@lindsayad @loganharbour @milljm I am bumping this issue as I cannot update documentation for RACCOON while this issue persists.

I've tried to implement some fixes but have yet to succeed.

milljm commented 1 day ago

I am able to build Zapdos's documentation. I am using my Mac M2. Did the standard git clone, git submodule update --init --recursive, make, cd doc/; ./moosedocs.py build.

Let me look into trying RACCOON...

TheGreatCid commented 1 day ago

I am able to build using a local repository; however, If I use a docker image then it no longer works.

Zapdos documentation is failing

https://github.com/shannon-lab/zapdos/actions/runs/11214824785

milljm commented 1 day ago

Oh, within docker? As in our moose-dev image?

TheGreatCid commented 1 day ago

This is on the ubuntu-latest image. Seems that both RACCOON and Zapdos use that image

TheGreatCid commented 1 day ago

Both started failing after merging in moose commit a166589ac19f760224c6cfe1bb7cbaa84884e776

milljm commented 1 day ago

@loganharbour would the manner in which a repo was cloned cause that phantom file symlink issue we dealt with? Like a shallow clone perhaps?

TheGreatCid commented 1 day ago

I tried using 'fetch-depth: 0' for the check-out action because that was also my thought. It didn't seem to work. I double-checked that the commit being errored on was present in the MOOSE clone as well. I'm convinced that what is happening is that it is checking for the commit in the Moose App rather than in Moose itself.

milljm commented 1 day ago

running the same commands as what the runner did, I can certainly reproduce the issue:

 1286  2024-10-11 12:59:56 = mkdir zapdos
 1287  2024-10-11 12:59:57 = cd zapdos/
 1288  2024-10-11 12:59:59 = git init .
 1289  2024-10-11 13:00:07 = git remote add origin https://github.com/shannon-lab/zapdos
 1290  2024-10-11 13:00:14 = git -c protocol.version=2 fetch --no-tags --prune --no-recurse-submodules --depth=1 origin +a395501ffba0044174f12d228d0d170170ff75cf:refs/remotes/origin/master
 1291  2024-10-11 13:00:30 = git sparse-checkout disable
 1292  2024-10-11 13:00:34 = git config --local --unset-all extensions.worktreeConfig
 1293  2024-10-11 13:00:42 = git checkout --progress --force -B master refs/remotes/origin/master
 1294  2024-10-11 13:01:00 = git submodule sync
 1295  2024-10-11 13:01:08 = git -c protocol.version=2 submodule update --init --force --depth=1
 1296  2024-10-11 13:02:12 = git submodule foreach git config --local gc.auto 0
 1297  2024-10-11 13:02:18 = git log -1 --format='%H'
 1298  2024-10-11 13:02:42 = cd moose/scripts/
 1299  2024-10-11 13:02:48 = ./versioner.py mpi 2bd844dc5d4de47238eab94a3a718e9714592de1

$ ./versioner.py mpi 2bd844dc5d4de47238eab94a3a718e9714592de1
fatal: Not a valid commit name 2bd844dc5d4de47238eab94a3a718e9714592de1

Deffinitely going to be in the way we're cloning the repo I think... I am digging into it.

Oh totally... this:

1295  2024-10-11 13:01:08 = git -c protocol.version=2 submodule update --init --force --depth=1

$ cd moose
$ git log
commit ce4d9bc7ca0fc1d120479de73ef00ec3af31b2a3 (grafted, HEAD)
Author: MOOSE maintenance <bounces@inl.gov>
Date:   Mon Oct 7 03:08:11 2024 -0600

    Merge commit '86067ab6c0db8867b38676cc5ecd2f61d00ae62c'
$

With only one commit to work with inside MOOSE, I don't think versioner.py can operate correctly.

milljm commented 1 day ago

This is necessary for our unittesting of versioner. We obviously need a way to avoid this, when strictly using the versioner to gather information on HEAD.

        # Load the yaml file at the given commit; the location changed
        # from module_hash.yaml -> versioner.yaml at changed_commit
        changed_commit = '2bd844dc5d4de47238eab94a3a718e9714592de1'
        if self.git_ancestor(changed_commit, commit) and commit != changed_commit:
            _file = 'versioner.yaml'
        else:
            _file = 'module_hash.yaml'

Edit: 😆 exactly what OP says.

TheGreatCid commented 1 day ago

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          submodules: recursive

      - name: Check if commit exists
        run: |
            if git rev-parse --verify 2bd844dc5d4de47238eab94a3a718e9714592de1 >/dev/null 2>&1; then
              echo "✅ Commit exists."
            else
              echo "❌ Commit does not exist."
              exit 1
            fi

Running this series of actions returns a positive on whether the commit exists. However later on I still run into the same error stating the commit does not exist.

So, the fetch is grabbing the commit, at least it seems to be.

Edit: This doesn't invalidate what you said, just is some extra info

milljm commented 1 day ago

I am requiring to do something vaguely similar in an upcoming change to versioner (skip things that are not yet being tracked). The following changes allows for operation of versioner.py at the tip of HEAD:

diff --git a/scripts/versioner.py b/scripts/versioner.py
index 34022c40..079bb64e 100755
--- a/scripts/versioner.py
+++ b/scripts/versioner.py
@@ -85,7 +85,9 @@ class Versioner:
         # Load the yaml file at the given commit; the location changed
         # from module_hash.yaml -> versioner.yaml at changed_commit
         changed_commit = '2bd844dc5d4de47238eab94a3a718e9714592de1'
-        if self.git_ancestor(changed_commit, commit) and commit != changed_commit:
+        if commit == 'HEAD':
+            _file = 'versioner.yaml'
+        elif self.git_ancestor(changed_commit, commit) and commit != changed_commit:
             _file = 'versioner.yaml'
         else:
             _file = 'module_hash.yaml'
@@ -209,7 +211,9 @@ class Versioner:
     def do_sortlist(self, commit):
         """ determine if we need to sort the infuential file list """
         changed_commit = '0e0785ee8a25742715b49bc26871117b788e7190'
-        if self.git_ancestor(changed_commit, commit) and commit != changed_commit:
+        if commit == 'HEAD':
+            return True
+        elif self.git_ancestor(changed_commit, commit) and commit != changed_commit:
             return True
         return False

Without much testing, anyway. Not sure if this will get the nod from @loganharbour or not. But I honestly don't see any other way to avoid the error, when operating on only one commit.

milljm commented 1 day ago

You know... I just realized that returning True on HEAD is probably what we should have implemented _at these changes being specified with changed_commit_. Since moving forwards in the repository is always True.

e.g We routinely use versioner.py without any arguments. So most of the time, it's always HEAD. And with the above changes, I see a huge decrease in execution time. Since we don't have to check. And what happens when we go back in time? Well, if commit == 'HEAD' won't be there... it's simplistically elegant.

Thank you for reporting the issue!

TheGreatCid commented 1 day ago

Thanks for the help!

idaholab / moose