verilog-to-routing / vtr-verilog-to-routing

Verilog to Routing -- Open Source CAD Flow for FPGA Research
https://verilogtorouting.org

Upgrade Self-Hosted Runners to Node20 #2573

Open AlexandreSinger opened 4 weeks ago

AlexandreSinger commented 4 weeks ago

As described in this blog post, GitHub Actions are transitioning from Node16 to Node20: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/

All of the CI tests that run on the GitHub-hosted runners have been moved to Node20 simply by bumping the versions of the Actions in PR #2568.

The self-hosted runners could not be upgraded the same way: the run gave a warning that the machine did not have Node20 available, only Node16. TODO statements were added to the CI test script as reminders to upgrade the runners before upgrading the necessary actions.

The blog post above makes it clear that the self-hosted runners need to be upgraded to v2.308.0 or later:

image

Once the self-hosted runners are upgraded, we can upgrade the actions to fully resolve the deprecations.

AlexandreSinger commented 2 weeks ago

Here are the logs from the failed CI run which arose from trying to upgrade the actions on the self-hosted runners:

image

It looks like the current version of the self-hosted runners is v2.316.1, which should have node20 installed, but for some reason it doesn't, which is quite odd.

The CI run: https://github.com/verilog-to-routing/vtr-verilog-to-routing/actions/runs/9279402994/job/25531941254
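
As a sanity check on the version question, the comparison between the reported runner version and the v2.308.0 minimum from the blog post can be scripted. The following is a minimal sketch, not part of the VTR CI: the function name version_at_least is made up for illustration, and it assumes GNU sort with the -V (version sort) option is available on the machine.

```shell
#!/usr/bin/env bash
# Return success (0) if version $1 is greater than or equal to version $2.
# A leading "v" is stripped from both arguments before comparing.
# Assumption: `sort -V` (GNU version sort) is available on this host.
version_at_least() {
    local have="${1#v}" need="${2#v}"
    # If the required version sorts first (or both are equal), "have" is new enough.
    [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}

# Versions taken from this thread: reported runner version vs. required minimum.
if version_at_least "v2.316.1" "v2.308.0"; then
    echo "runner is new enough for node20"
else
    echo "runner is too old for node20"
fi
```

This only compares version strings. It cannot tell whether the version printed in the log is the version of the runner that actually executes the job, which is the open question here.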

AlexandreSinger commented 2 weeks ago

@vaughnbetz It looks like the version of the self-hosted runners is correct... But there still seems to be something wrong, since they do not support node20 yet. I'll send an update in the email chain.

AlexandreSinger commented 1 day ago

Running into tons of issues with this...

The first thing I tried was using the setup-node action before checkout (https://github.com/actions/setup-node). This yielded the exact same error:

Screenshot from 2024-06-27 15-37-28

It looks like the job is failing BEFORE it even runs any of its tasks. This does not make sense: if we truly were using runner version 2.317, it should recognize the node20 parameter. Something is really fishy here. It's almost as if it is on an older version of the runner, but it is saying that it is on a newer one...

I then tried changing us to Ubuntu 24.04 just to see if that would work (since we plan on upgrading to that in the future), but even that had the same error:

Screenshot from 2024-06-27 15-39-26

I am not sure for a fact, but I assume Ubuntu 24.04 has Node20 installed; so this error is now super weird.

I then tried deleting the container line altogether (as was recommended in the VTR industry sync):

Screenshot from 2024-06-27 15-40-54

This caused the strangest error: the CI hung on "Setting up VM". I let it hang for around 10 minutes before killing it (it usually takes 1 minute to set up).

I then tried setting the container to container: ubuntu-24.04 so that it would match the GitHub runners. The CI really did not like this. It created an infinite loop where it would set up the VM and then immediately tear it down:

Screenshot from 2024-06-27 15-43-33

The errors all look something like this:

Screenshot from 2024-06-27 15-44-11

Clearly no container exists with this name, but this is some crazy behaviour for the case where the container cannot be found.

I tried googling around, but no one appears to be running into this exact same issue. Everyone else seems to resolve it by upgrading the runner to a more recent version. I am beginning to distrust the version reported in the log, but I am super confused.

One idea I have is we can use the container image that we generate in VTR. That way we know the image has everything we need; however, it leads to an issue where our build depends on itself. For example, if the release build is failing, we would have trouble fixing it since we would rely on the release build's container.

The current working PR on this is PR #2632

@vaughnbetz What do you think about this mess? Who originally set up the self-hosted runners, so that we can talk to them about this?

vaughnbetz commented 1 day ago

Ugh. Thanks for investigating @AlexandreSinger . Adding @kmurray and @tangxifan and @jgoeders in case they have any ideas. @kgugala may have set up the original self-hosted runners; Karol, any ideas much appreciated!

jgoeders commented 1 day ago

Based on the above, it sounds like we need to get node20 installed in the image before any other actions are run.

> I am not sure for a fact, but I assume Ubuntu 24.04 has Node20 installed; so this error is now super weird.

The docker image is typically a completely stripped down version, without most of the packaging that comes when you install Ubuntu on your own machine.

I just tested a bare ubuntu:24.04 docker image and indeed it does not include any node version.

jgoeders@jg-laptop:~$ docker run -it ubuntu:20.04
Unable to find image 'ubuntu:20.04' locally
20.04: Pulling from library/ubuntu
9ea8908f4765: Pull complete
Digest: sha256:0b897358ff6624825fb50d20ffb605ab0eaea77ced0adb8c6a4b756513dec6fc
Status: Downloaded newer image for ubuntu:20.04
root@5ddf86f379f6:/# node -v
bash: node: command not found
root@5ddf86f379f6:/#
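
The same check can be scripted so it is easy to repeat against other images. This is a rough sketch: the function names node_major and check_node20 are hypothetical, and it only looks for node on PATH in whatever environment it runs in. Whether the runner actually relies on the container's own node is an assumption in this thread, not something this check confirms.

```shell
#!/usr/bin/env bash
# Extract the major version from a string in the format printed by `node -v`,
# e.g. "v20.11.1" -> "20".
node_major() {
    local v="${1#v}"
    echo "${v%%.*}"
}

# Report whether Node.js with major version >= 20 is available on PATH.
# Returns non-zero if node is missing or too old.
check_node20() {
    if ! command -v node >/dev/null 2>&1; then
        echo "node: not installed"
        return 1
    fi
    local major
    major="$(node_major "$(node -v)")"
    if [ "$major" -ge 20 ]; then
        echo "node: major version $major is sufficient"
    else
        echo "node: major version $major is older than 20"
        return 1
    fi
}

# Run the check but do not abort the calling shell on failure.
check_node20 || true
```

Saved to a file (say check_node.sh, a made-up name), it could be fed to a container with something like `docker run -i --rm ubuntu:20.04 bash < check_node.sh`, which for the bare image above should report that node is not installed.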

I think to resolve this we could either:

  1. Figure out how to specify other options along with the container: command that will allow you to configure the container to install node20. For example, I asked ChatGPT to give me a docker configuration file that would install node20 on ubuntu:jammy:

    # Use the official Ubuntu Jammy image as a base
    FROM ubuntu:jammy
    
    # Set the environment variable to noninteractive
    ENV DEBIAN_FRONTEND=noninteractive
    
    # Install necessary packages and Node.js 20
    RUN apt-get update && \
        apt-get install -y curl gnupg && \
        curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && \
        apt-get install -y nodejs && \
        apt-get clean && \
        rm -rf /var/lib/apt/lists/*
    
    # Verify installation
    RUN node -v && npm -v
    
    # Set working directory
    WORKDIR /usr/src/app
    
    # Copy application files
    COPY . .
    
    # Specify the command to run the application
    CMD ["node", "app.js"]

    This page seems to have some documentation about how to configure the container.

  2. Create our own container that we host on Docker Hub that is ubuntu:jammy with node20 installed. You mentioned:

    > One idea I have is we can use the container image that we generate in VTR. That way we know the image has everything we need; however, it leads to an issue where our build depends on itself.

    ...but we could just create a bare-bones Docker Hub image that is different from this VTR version.
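
Option 2 could look roughly like the sketch below, which trims the Dockerfile above down to just the base image plus Node.js 20 (no application files or CMD). The image name example-org/ubuntu-node20:jammy and the DO_BUILD switch are placeholders invented for this example, not an existing Docker Hub repository or a VTR convention. By default the script only prints the Dockerfile; it calls docker build and docker push only when DO_BUILD=1.

```shell
#!/usr/bin/env bash
# Hypothetical image name; no such repository is named in this thread.
IMAGE="${IMAGE:-example-org/ubuntu-node20:jammy}"

# Print a Dockerfile for a bare image: ubuntu:jammy plus Node.js 20 only.
bare_dockerfile() {
    cat <<'EOF'
FROM ubuntu:jammy
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
    apt-get install -y curl gnupg && \
    curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && \
    apt-get install -y nodejs && \
    rm -rf /var/lib/apt/lists/*
EOF
}

if [ "${DO_BUILD:-0}" = "1" ]; then
    # Build with the Dockerfile read from stdin (no build context), then push.
    bare_dockerfile | docker build -t "$IMAGE" - && docker push "$IMAGE"
else
    # Dry run: just show the Dockerfile that would be built.
    bare_dockerfile
fi
```

Because this image would contain nothing from VTR itself, a failing VTR release build would not block rebuilding it, which avoids the self-dependency concern quoted above.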

Hope this helps.