actions / runner

The Runner for GitHub Actions :rocket:
https://github.com/features/actions
MIT License

Support for custom or shorter base workdir paths #1676

Open MichaelJJ opened 2 years ago

MichaelJJ commented 2 years ago

Currently, the base working directory for actions lives in a subfolder of the installation directory and repeats the repository name twice. This can be an issue for repos with longer names, especially with tools that have path length limits (Python venv, MSBuild).

There is a workflow yaml option to set the base working directory, but this is not very portable and the path can be runner specific.

A few things could help with this issue:

  1. Instead of using the pattern <install_dir>/_work/<repo_name>/<repo_name>/, use the path <install_dir>/_work/<guid_or_short_unique_string>/
  2. Add a setting or environment variable to change the base path of <install_dir>/_work/ to a different path on disk.
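
A rough sketch of what option 1 could look like, assuming the short identifier is derived by hashing `owner/repo`. The hashing scheme, the `short_work_dir` helper, the 8-character length, and the example paths are illustrative assumptions, not something the runner does today:

```python
import hashlib
from pathlib import PurePosixPath


def current_work_dir(install_dir: str, repo_name: str) -> PurePosixPath:
    """Path pattern described above: <install_dir>/_work/<repo_name>/<repo_name>."""
    return PurePosixPath(install_dir) / "_work" / repo_name / repo_name


def short_work_dir(install_dir: str, owner_repo: str, length: int = 8) -> PurePosixPath:
    """Hypothetical alternative: a short, stable identifier replaces the doubled name."""
    digest = hashlib.sha1(owner_repo.encode("utf-8")).hexdigest()[:length]
    return PurePosixPath(install_dir) / "_work" / digest


if __name__ == "__main__":
    install_dir = "/opt/actions-runner"  # assumed install location
    repo = "some-long-enterprise-repository-name"  # made-up repository name
    print(current_work_dir(install_dir, repo))
    print(short_work_dir(install_dir, f"my-org/{repo}"))
```

Hashing keeps the directory stable across runs for the same repository, which a random GUID would not.
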
MichaelJJ commented 2 years ago

To add, https://github.com/actions/checkout also seems to check out only to the default working directory, which does not work on systems where long path support can't be enabled.

ruvceskistefan commented 2 years ago

Hi @MichaelJJ, Thanks for the reported issue. I will label this issue as a runner feature, so we will work on that in the coming period.

MichaelJJ commented 2 years ago

Great, thanks. This is causing some pain for multiple members of our Enterprise and would be great to have resolved!

MichaelJJ commented 2 years ago

Hello, any update on this?

chrisnjohnson commented 2 years ago

I am running into this issue as well and would also appreciate an update on it.

It would be great to have the ability to control the setting of the GITHUB_WORKSPACE variable so that the github checkout action doesn't require checking out the repo into such a long base path. Currently the path is set to <runner_install_dir>/_work/<repo_name>/<repo_name>, and during the runner install I have the ability to control the <runner_install_dir>/_work location, but there is currently no way to control the <repo_name>/<repo_name> part of the path. Being able to change this to just a single <repo_name> or something else that I can control would be extremely helpful. Thanks!

kellycouch commented 2 years ago

Visual Studio NMAKE is an example of popular tooling that has a 256-character path limit. Our project repo is large and has deeply nested directories, resulting in long paths. Combined with GitHub's current work directory scheme, our build paths exceed 256 characters, which results in build errors.

Documented limitation: https://docs.microsoft.com/en-us/cpp/error-messages/tool-errors/nmake-fatal-error-u1076?view=msvc-170
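
To see how close a checkout is to such a limit, a small scan can list the offending files. This is a minimal sketch: the `find_long_paths` helper is hypothetical, and the 256 figure is just the limit quoted above, so adjust it for the tool in question.

```python
import os
from pathlib import Path
from typing import Iterator, Tuple

PATH_LIMIT = 256  # limit cited in the comment above; adjust per tool


def find_long_paths(root: Path, limit: int = PATH_LIMIT) -> Iterator[Tuple[int, str]]:
    """Yield (length, absolute_path) for every file under root that exceeds limit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) > limit:
                yield len(full), full


if __name__ == "__main__":
    # GITHUB_WORKSPACE is set by the runner; fall back to the current directory.
    workspace = Path(os.environ.get("GITHUB_WORKSPACE", "."))
    for length, path in sorted(find_long_paths(workspace), reverse=True):
        print(f"{length:4d}  {path}")
```

Running it as an early workflow step gives a list of the longest paths before the build tool fails on them.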

baparham commented 2 years ago

@ruvceskistefan any updates on this?

fhammerl commented 2 years ago

For users struggling with this on self-hosted runners:

> Add a setting or environment variable to change the base path of <install_dir>/_work/ to a different path on disk.

When you run config.sh to configure a runner, you have the option to set the working directory. Run config.sh manually or use config.sh <other_params> --work /target/dir/ to try this.

Note that _work doesn't have to be under the runner's install directory. I hope this can provide some relief.
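
For bring-up scripts, the invocation above could be assembled like this. It is only a sketch: `build_config_command` is a hypothetical helper, `/w` is an example short path, and `other_params` stands in for whatever registration arguments you already pass to config.sh.

```python
import shlex
from typing import List, Sequence


def build_config_command(work_dir: str, other_params: Sequence[str] = ()) -> List[str]:
    """Build the config.sh argument list with --work pointing at a short directory."""
    return ["./config.sh", *other_params, "--work", work_dir]


if __name__ == "__main__":
    cmd = build_config_command("/w")  # "/w" is an example short path
    print(shlex.join(cmd))
    # To actually run it from the runner directory:
    # import subprocess; subprocess.run(cmd, check=True)
```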

rentziass commented 2 years ago

As a workaround to this you could also create a symlink to $GITHUB_WORKSPACE to shorten the path. For example in an Ubuntu runner you could

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Symlink
        run: ln -s "$GITHUB_WORKSPACE" /home/runner/work/project
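
The same idea as a Python step, for workflows that already script their setup in Python. This is a sketch under assumptions: `link_workspace` is a hypothetical helper and the link location simply mirrors the example above.

```python
import os
from pathlib import Path


def link_workspace(workspace: Path, short_link: Path) -> Path:
    """Create short_link -> workspace, replacing a stale link if one exists."""
    if short_link.is_symlink():
        short_link.unlink()
    short_link.parent.mkdir(parents=True, exist_ok=True)
    short_link.symlink_to(workspace, target_is_directory=True)
    return short_link


if __name__ == "__main__":
    workspace = Path(os.environ["GITHUB_WORKSPACE"])
    link = link_workspace(workspace, Path("/home/runner/work/project"))
    print(f"{link} -> {os.readlink(link)}")
```

Tools that resolve symlinks to their real path will still see the long path, so this only helps tools that work with the path they are given.
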
baparham commented 2 years ago

Thanks for the tips! Setting the base work directory is helpful but only gets us so far. If your repo name is rather long (e.g. in an enterprise environment where repo names may have fixed hierarchical names and be quite long) the fact that we can't rename the folder the runner puts the repo in underneath the work folder is the root problem with long paths here.

kraszkow commented 2 years ago

> Hi @MichaelJJ, Thanks for the reported issue. I will label this issue as a runner feature, so we will work on that in the coming period.

@ruvceskistefan - any update on this?

> For users struggling with this on self-hosted runners:
>
> > Add a setting or environment variable to change the base path of <install_dir>/_work/ to a different path on disk.
>
> When you run config.sh to configure a runner, you have the option to set the working directory. Run config.sh manually or use config.sh <other_params> --work /target/dir/ to try this.
>
> Note that _work doesn't have to be under the runner's install directory. I hope this can provide some relief.

@fhammerl - unfortunately it doesn't help. Our biggest problem is that repo_name is used 2 times: /<repo_name>/<repo_name>/

Edit:

Actually it's not /<repo_name>/<repo_name>/ but /<PipelineDirectory>/<repo_name>/: https://github.com/actions/runner/blob/main/src/Runner.Worker/TrackingConfig.cs#L62 where PipelineDirectory is just repo_name.

@TingluoHuang I can't find anything about the "Pipeline" concept in the docs, so it's extremely hard to understand the consequences of any change in this area, which makes it hard to contribute to this part of the repo. Maybe you could just allow this variable to be overridden in the runner config?

I've tried to modify runner code and after this mod: WorkspaceDirectory = repoName;//Path.Combine(PipelineDirectory, repoName); it seems to work as expected.
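
To make the difference concrete, here is a small Python model of the two directory schemes. It only mirrors what is described above (the runner itself is C#), and the `TrackingDirs`, `default_dirs`, and `modified_dirs` names are made up for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class TrackingDirs:
    pipeline_directory: str
    workspace_directory: str


def default_dirs(repo_name: str) -> TrackingDirs:
    """Current behaviour as described: workspace = <PipelineDirectory>/<repo_name>."""
    pipeline = repo_name
    return TrackingDirs(pipeline, str(PurePosixPath(pipeline) / repo_name))


def modified_dirs(repo_name: str) -> TrackingDirs:
    """The one-line modification: workspace = <repo_name> only."""
    return TrackingDirs(repo_name, repo_name)


if __name__ == "__main__":
    print(default_dirs("my-repo"))
    print(modified_dirs("my-repo"))
```

Note that with the modification the pipeline and workspace directories become the same folder, which may matter for whatever cleanup logic relies on them being distinct.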

ReneSorensenBaader commented 2 years ago

This is a serious matter. I have 60+ repos that do not build because of this issue. Mainly this comes down to MSBuild not supporting long path names. The fix @kraszkow mentions is what I am trying to implement at the moment. Please make something happen...

MichaelJJ commented 1 year ago

Hello, any updates on this request? Would love to see some traction on this request...

StephenHodgson commented 1 year ago

I'm trying to figure out why the repo name is in there twice 🤔

kraszkow commented 1 year ago

> I'm trying to figure out why the repo name is in there twice 🤔

The answer from the source code is here: https://github.com/actions/runner/blob/main/src/Runner.Worker/TrackingConfig.cs#LL60-L62C76 But why it's needed, that is a very good question 😄

StephenHodgson commented 1 year ago

I was just looking for that! thanks

StephenHodgson commented 1 year ago

I'm struggling to figure out what the difference is between WorkspaceDirectory and PipelineDirectory.

It's mainly used in the PipelineDirectoryManager:

https://github.com/actions/runner/blob/9a228e52e9fb22857052cbb8a63e8f61f98f56a4/src/Runner.Worker/PipelineDirectoryManager.cs#L122

And when setting the runner.workspace & github.workspace contexts. https://github.com/actions/runner/blob/9a228e52e9fb22857052cbb8a63e8f61f98f56a4/src/Runner.Worker/JobExtension.cs#L175

StephenHodgson commented 1 year ago

Seems like it boils down to the PipelineConstants.WorkspaceCleanOptions when we're preparing the pipeline to run:

https://github.com/actions/runner/blob/9a228e52e9fb22857052cbb8a63e8f61f98f56a4/src/Runner.Worker/PipelineDirectoryManager.cs#L64

Key contexts here: (super confusing I know)

  1. If we clean all, then it will recreate both the pipeline and workspace directories. (Deletes Existing)
  2. If we clean only resources, it will recreate the WorkspaceDirectory for each tracked repository in trackingConfig.Repositories.
  3. If we clean outputs, then it deletes any untracked workspace directories that are not in trackingConfig.Repositories (oddly enough, it seems this collection is never updated, so effectively it deletes all of the pipeline directories by default).
  4. Otherwise if no options are passed we just create new pipeline and workspace directories if they don't exist.

There's a lot of cleanup that can happen here it seems.
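
The four cases can be modelled roughly as below. This is a Python sketch of the behaviour as read from the list above, not the runner's actual C# code. The `CleanOption` names and the `plan_cleanup` helper are hypothetical.

```python
from enum import Enum
from typing import Iterable, List, Optional, Tuple


class CleanOption(Enum):  # hypothetical names for the options described above
    ALL = "all"
    RESOURCES = "resources"
    OUTPUTS = "outputs"


def plan_cleanup(
    option: Optional[CleanOption],
    pipeline_dir: str,
    workspace_dirs_on_disk: Iterable[str],
    tracked_workspace_dirs: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (action, directory) pairs following the four cases in the list above."""
    tracked = list(tracked_workspace_dirs)
    if option is CleanOption.ALL:
        # Case 1: delete and recreate the pipeline directory and every workspace.
        return [("recreate", pipeline_dir)] + [("recreate", d) for d in tracked]
    if option is CleanOption.RESOURCES:
        # Case 2: recreate only the tracked workspace directories.
        return [("recreate", d) for d in tracked]
    if option is CleanOption.OUTPUTS:
        # Case 3: delete workspace directories that are not tracked.
        return [("delete", d) for d in workspace_dirs_on_disk if d not in tracked]
    # Case 4: no option given, just make sure the directories exist.
    return [("ensure", pipeline_dir)] + [("ensure", d) for d in tracked]
```

With an empty tracked collection, the outputs case deletes everything it finds on disk, which matches the observation in item 3.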

tbranch227 commented 1 year ago

3. Repositories

With 100s of repositories and using self-hosted runners, we run into the same issues. We actually maintain our old working folders for a few days, so that we can review tasks later for why things failed. Sometimes it takes a while for a pattern to emerge that lets us know we need to review past builds.

When you are building mobile and desktop apps, the process gets complicated quickly and you're upgrading tools every quarter to keep up with your target platforms.

Setting that folder at runtime. I thought the hook scripts might give me that opportunity to set a working folder that would be used by the action, but the environment variable is overwritten by the action or the action runner. I'm not sure, but the environment variable definitely gets reset.

We need to be able to do the following:

Thanks - for now - looks like we'll be using all custom actions.

jahales-intel commented 1 year ago

Someone from my organization pointed me to a workaround that they're using and I'm sharing here in case it helps while we wait for a robust fix.

There's a Pipeline configuration folder in the runner work directory where you can modify the pipeline directory and workspace directory for a given repository:

\PipelineMapping\\\PipelineFolder.json

I've been modifying that file to create a shorter file path and it seems to work (pipelineDirectory / workspaceDirectory). I'm not aware of any documentation for this.
skharel146 commented 1 year ago

@jahales-intel it's going to be tedious if you have a lot of repositories to create at once, isn't it?

jahales-intel commented 1 year ago

> @jahales-intel it's going to be tedious if you have a lot of repositories to create at once, isn't it?

This certainly isn't a great long term solution.

sergii-rybin-tfs commented 1 year ago

Hello, any updates on this request? We have the same problem with many repositories.

skharel146 commented 1 year ago

Same is the case for me, I wrote a PS script to read the PipelineMappings.json file and update the values.
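
For anyone who wants to script the same thing in Python, a bulk update might look roughly like this. Everything about the file layout here is an assumption drawn from other comments in this thread: the `PipelineFolder.json` file name, the `pipelineDirectory` / `workspaceDirectory` / `repositories` keys, and the use of the platform path separator. None of it is documented, and `shorten_mappings` is a hypothetical helper.

```python
import hashlib
import json
import os
from pathlib import Path
from typing import List


def shorten_mappings(mapping_root: Path, work_dir: str = "w") -> List[Path]:
    """Rewrite every PipelineFolder.json under mapping_root to use short directories."""
    updated = []
    for path in sorted(mapping_root.rglob("PipelineFolder.json")):
        # utf-8-sig tolerates a byte order mark if the file has one.
        data = json.loads(path.read_text(encoding="utf-8-sig"))
        # Stable short name derived from the folder's position under mapping_root.
        key = path.parent.relative_to(mapping_root).as_posix()
        pipeline_dir = "_" + hashlib.sha1(key.encode("utf-8")).hexdigest()[:7]
        workspace = f"{pipeline_dir}{os.sep}{work_dir}"
        data["pipelineDirectory"] = pipeline_dir
        data["workspaceDirectory"] = workspace
        # Assumed shape: {"<owner>/<repo>": {"repositoryPath": "..."}}
        for repo in data.get("repositories", {}).values():
            repo["repositoryPath"] = workspace
        path.write_text(json.dumps(data, indent=4), encoding="utf-8")
        updated.append(path)
    return updated
```

It should only be run while the runner is stopped and no jobs are in flight, since it changes where existing checkouts are expected to live.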

interifter commented 1 year ago

Anyone know how to get attention on this issue?

Piedone commented 6 months ago

Some more context: https://github.com/actions/checkout/issues/955.

Danielku15 commented 6 months ago

Same problem here. I'm migrating many repositories in our GitHub enterprise instance from TeamCity to GitHub Actions and boom: Max Path exceeded for various repositories.

From the discussion I understand it's important to have separate folders.

It might be a bit short-sighted but why not simply do something like this:

https://github.com/actions/runner/blob/04b07b6675c56da40e532ab73eeb4287bf223e34/src/Runner.Worker/TrackingConfig.cs#L60-L62

            // Set the directories.
            PipelineDirectory = repoName.ToString(CultureInfo.InvariantCulture);
-            WorkspaceDirectory = Path.Combine(PipelineDirectory, repoName);
+            WorkspaceDirectory = Path.Combine(PipelineDirectory, "w"); // short sub-path for workspace

This still eats 2 characters from the path but it's definitely better than now.
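
A quick way to put numbers on the saving, assuming the pipeline directory stays equal to the repo name as in the diff above. The `workspace_suffix_length` helper and the repo name are made up for illustration.

```python
def workspace_suffix_length(repo_name: str, workspace_leaf: str) -> int:
    """Length of "<PipelineDirectory>/<leaf>" where PipelineDirectory is the repo name."""
    return len(repo_name) + 1 + len(workspace_leaf)  # +1 for the path separator


if __name__ == "__main__":
    repo = "some-long-enterprise-repository-name"  # made-up example
    current = workspace_suffix_length(repo, repo)
    proposed = workspace_suffix_length(repo, "w")
    print(f"current: {current}, proposed: {proposed}, saved: {current - proposed}")
```

The saving is always one character less than the repo name's length, so it grows with exactly the long names that cause the problem.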

interifter commented 6 months ago

For those still running into this, and are using self-hosted runners with a custom bringup script, here is a python snippet we use as part of a runner bring-up

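import json
import logging
from dataclasses import dataclass, field
from pathlib import Path

logger = logging.getLogger(__name__)


@dataclass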
class PipelineMappingWorkarounder:
    """Windows-specific workaround for long path problems. Going hard on that class name"""

    agent_tmp_path: Path
    repo_name: str
    owner_group: str
    pipeline_dir: str = field(init=False, default="_w")
    work_dir: str = field(init=False, default="wrk")
    full_path: Path = field(init=False)

    def __post_init__(self) -> None:
        """Post init shenanigans"""
        # These _are_ indeed shenanigans. We should not need to mess with this file
        # And there is no guarantee of its existence in future actions/runner releases.
        self.full_path = self.agent_tmp_path / "_PipelineMapping" / self.owner_group / self.repo_name / "PipelineFolder.json"
        if self.full_path.is_file():
            with self.full_path.open("r", encoding="UTF8") as handle:
                content = handle.read()
            logger.debug(f"{self.full_path} exists! Content:\n{content}")

    def generate_dict(self) -> dict:
        """Generate the dictionary object for the PipelineFolder.json file"""
        logger.debug("Generating dict for PipelineFolder.json")
        return {
            "repositoryName": f"{self.owner_group}/{self.repo_name}",
            "pipelineDirectory": f"{self.pipeline_dir}",
            "workspaceDirectory": f"{self.pipeline_dir}\\{self.work_dir}",
            "repositories": {
                f"{self.owner_group}/{self.repo_name}": {
                    "repositoryPath": f"{self.pipeline_dir}\\{self.work_dir}",
                }
            },
        }

    def save_dict_as_json(self) -> None:
        """Saves the dict object as a JSON to the intended path"""
        target_path = self.full_path
        logger.debug(f"Saving JSON to {target_path}")
        target_path.parent.mkdir(parents=True, exist_ok=True)
        try:
            with target_path.open("w+", encoding="UTF8") as handle:
                json.dump(self.generate_dict(), handle, indent=4)
            logger.debug(f"Saved to {target_path}")
        except Exception as exc:
            logger.error(f"Failed to save {target_path}: {exc}")
            raise
milos-licina-3shape commented 5 months ago

We are also experiencing the same issue. The way we name the repositories and the way we structure the code coupled with the double nesting from GitHub leads to a long default root directory of our code. When we try to compile our code and run tests afterwards we are hitting limits with certain parts of the dotnet tooling not respecting long path registry settings in Windows.

We would really appreciate a supported and documented way of changing the Workspace directory, without reverse engineering GitHub logic. If you are worried about the impact of this on GitHub-hosted runners, a feature limited to self-hosted runners would be greatly appreciated.