datalad / git-annex

A non-official clone of git-annex established for DataLad purposes. No PRs will be merged, but it could be used to test prospective git-annex patches. Official git-annex repository: https://git.kitenet.net/index.cgi/git-annex.git/

Re-enable Datalad SSH tests on macOS #55

Open jwodder opened 3 years ago

jwodder commented 3 years ago

Blocker: https://github.com/datalad/datalad/pull/5417

yarikoptic commented 3 years ago

everything has been merged and released on the datalad end since then. Could you please re-trigger the CI run here @jwodder ?

jwodder commented 3 years ago

@yarikoptic CI run triggered.

yarikoptic commented 3 years ago

Mac still ain't happy:

(default) Waiting for an IP...
Error creating machine: Error in driver during machine creation: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded
Error: Process completed with exit code 1.
yarikoptic commented 3 years ago

ha -- some pass and some fail with e.g.

2021-03-16T21:55:20.5835210Z datalad.support.exceptions.CommandError: CommandError: 'ssh -o ControlPath=/Users/runner/Library/Caches/datalad/sockets/0ead11a7 datalad-test 'export "PATH=/usr/lib/git-annex.linux:$PATH"; mkdir -p /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/datalad_temp_check_target_ssh_recursivefimncw8k-False'' failed with exitcode 1 [err: 'mkdir: cannot create directory ‘/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/datalad_temp_check_target_ssh_recursivefimncw8k-False’: Permission denied']

is it the same "too long of a path" issue?

jwodder commented 3 years ago

@yarikoptic What "too long of a path" issue? The only such issue I recall on macOS affected Conda's decisions about filling in shebangs.

yarikoptic commented 3 years ago

argh, failed to find the related discussion ATM. But meanwhile you could try setting TMPDIR=~/DLTMP and see if it goes away. That is also what is done in https://github.com/datalad/datalad/blob/master/.appveyor.yml#L228 and, I believe, for that reason

yarikoptic commented 3 years ago

Underlying issue: https://unix.stackexchange.com/questions/367008/why-is-socket-path-length-limited-to-a-hundred-chars#:~:text=Mac%20OS%20X%2010.9%3A%20104%20characters -- the maximal socket path length on macOS is 104 characters. In DataLad we place HOME under TMPDIR while testing, so a long TMPDIR makes the socket path long too.
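
As a rough illustration of that limit, here is a minimal sketch (not DataLad code) that checks whether a candidate `ControlPath` would fit. The helper name `socket_path_fits` is hypothetical, and counting the trailing NUL byte against the 104-byte limit is an assumption about how the limit applies:

```python
import os

# Assumption: 104 is the macOS limit quoted in the linked answer, and the
# trailing NUL terminator counts against it.
MACOS_SUN_PATH_MAX = 104


def socket_path_fits(path: str, limit: int = MACOS_SUN_PATH_MAX) -> bool:
    """Return True if `path`, encoded for the filesystem, fits within `limit` bytes."""
    # Measure bytes rather than characters: non-ASCII names take more room.
    return len(os.fsencode(path)) + 1 <= limit
```

A socket under `/Users/runner/Library/Caches/datalad/sockets/` is well within the limit, while the same `Library/Caches/datalad/sockets/` suffix under a temporary HOME inside `/private/var/folders/...` can exceed it.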

yarikoptic commented 3 years ago

From the error messages it seems that /Users/runner/DLTMP is not bind-mounted inside the docker container, which leads to various issues? If there is /tmp on those Macs, it might be worth trying to export TMPDIR as e.g. /tmp/DLTMP, since /tmp should exist in the container and is more likely to work?

yarikoptic commented 3 years ago

#58 supersedes this one, right @jwodder ? If yes -- please close

jwodder commented 3 years ago

@yarikoptic #58 uses a third-party action for setting up Docker on macOS as an alternative to the Docker Machine approach on this branch. I'm not entirely certain how reliable the action in question is, and so I want to leave both PRs open for now.

yarikoptic commented 2 years ago

blocker was resolved, master of datalad should be green again, time to resolve this issue one way or another to gain better testing on OSX

jwodder commented 2 years ago

@yarikoptic This PR seems to work now, aside from some datalad test failures.

yarikoptic commented 2 years ago

> @yarikoptic This PR seems to work now, aside from some datalad test failures.

well, it doesn't work in the sense that the ssh-related tests fail on macOS:

(git)smaug:/mnt/datasets/datalad/ci/git-annex/builds/2022/04[master]pr-55
$> git grep datalad.support.tests.test_annexrepo.test_annex_ssh
build-macos.yaml-645-32886238-failed/1_test-datalad (master).txt:2022-04-05T17:36:03.2718890Z datalad.support.tests.test_annexrepo.test_annex_ssh ... ERROR
build-macos.yaml-645-32886238-failed/1_test-datalad (master).txt:2022-04-05T17:54:00.1381870Z ERROR: datalad.support.tests.test_annexrepo.test_annex_ssh
build-macos.yaml-645-32886238-failed/2_test-datalad (maint).txt:2022-04-05T17:43:28.9826100Z datalad.support.tests.test_annexrepo.test_annex_ssh ... ERROR
build-macos.yaml-645-32886238-failed/3_test-datalad (release).txt:2022-04-05T17:47:29.4295650Z datalad.support.tests.test_annexrepo.test_annex_ssh ... ERROR
build-macos.yaml-645-32886238-failed/test-datalad (maint)/12_Run datalad tests.txt:2022-04-05T17:43:28.9826060Z datalad.support.tests.test_annexrepo.test_annex_ssh ... ERROR
build-macos.yaml-645-32886238-failed/test-datalad (master)/12_Run datalad tests.txt:2022-04-05T17:36:03.2718850Z datalad.support.tests.test_annexrepo.test_annex_ssh ... ERROR
build-macos.yaml-645-32886238-failed/test-datalad (master)/12_Run datalad tests.txt:2022-04-05T17:54:00.1381870Z ERROR: datalad.support.tests.test_annexrepo.test_annex_ssh
build-macos.yaml-645-32886238-failed/test-datalad (release)/12_Run datalad tests.txt:2022-04-05T17:47:29.4295600Z datalad.support.tests.test_annexrepo.test_annex_ssh ... ERROR
and a sample ERROR

```shell
2022-04-05T17:54:00.1381720Z ======================================================================
2022-04-05T17:54:00.1381870Z ERROR: datalad.support.tests.test_annexrepo.test_annex_ssh
2022-04-05T17:54:00.1382130Z ----------------------------------------------------------------------
2022-04-05T17:54:00.1382240Z Traceback (most recent call last):
2022-04-05T17:54:00.1382640Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
2022-04-05T17:54:00.1382740Z     self.test(*self.arg)
2022-04-05T17:54:00.1383170Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/tests/utils.py", line 288, in _wrap_skip_ssh
2022-04-05T17:54:00.1383270Z     return func(*args, **kwargs)
2022-04-05T17:54:00.1383740Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/tests/utils.py", line 307, in _wrap_skip_nomultiplex_ssh
2022-04-05T17:54:00.1383850Z     return func(*args, **kwargs)
2022-04-05T17:54:00.1384290Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/tests/utils.py", line 874, in _wrap_with_tempfile
2022-04-05T17:54:00.1384410Z     return t(*(arg + (filename,)), **kw)
2022-04-05T17:54:00.1385030Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/support/tests/test_annexrepo.py", line 1223, in test_annex_ssh
2022-04-05T17:54:00.1385270Z     ar.copy_to(["foo"], remote="ssh-remote-1")
2022-04-05T17:54:00.1385730Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/support/gitrepo.py", line 325, in _wrap_normalize_paths
2022-04-05T17:54:00.1385860Z     result = func(self, files_new, *args, **kwargs)
2022-04-05T17:54:00.1386300Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/support/annexrepo.py", line 2902, in copy_to
2022-04-05T17:54:00.1386520Z     files, ['--in', '.', '--not', '--in', remote])
2022-04-05T17:54:00.1386980Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/support/annexrepo.py", line 1514, in _get_expected_files
2022-04-05T17:54:00.1387100Z     merge_annex_branches=merge_annex_branches
2022-04-05T17:54:00.1387930Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/support/annexrepo.py", line 1078, in _call_annex_records
2022-04-05T17:54:00.1388040Z     raise e
2022-04-05T17:54:00.1388570Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/support/annexrepo.py", line 1050, in _call_annex_records
2022-04-05T17:54:00.1388650Z     **kwargs,
2022-04-05T17:54:00.1389360Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/support/annexrepo.py", line 943, in _call_annex
2022-04-05T17:54:00.1389450Z     **kwargs)
2022-04-05T17:54:00.1390700Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/runner/gitrunner.py", line 227, in run_on_filelist_chunks
2022-04-05T17:54:00.1390830Z     **kwargs):
2022-04-05T17:54:00.1391330Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/runner/gitrunner.py", line 161, in _get_chunked_results
2022-04-05T17:54:00.1391410Z     **kwargs)
2022-04-05T17:54:00.1391820Z   File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/runner/runner.py", line 205, in run
2022-04-05T17:54:00.1391910Z     **results,
2022-04-05T17:54:00.1392750Z datalad.runner.exception.CommandError: CommandError: 'git -c diff.ignoreSubmodules=none annex find --in . --not --in ssh-remote-1 --json --json-error-messages -c annex.dotfiles=true -- foo' failed with exitcode 1 under /private/tmp/DLTMP/datalad_temp_test_annex_ssh1x6hktex/main [info keys: stdout_json] [err: 'Unable to parse git config from ssh-remote-1
2022-04-05T17:54:00.1393350Z fatal: '/private/tmp/DLTMP/datalad_temp_test_annex_ssh1x6hktex/remote1' does not appear to be a git repository
2022-04-05T17:54:00.1393930Z CommandError: 'ssh -o ControlPath=/Users/runner/Library/Caches/datalad/sockets/89d769bb -o SendEnv=GIT_PROTOCOL datalad-test 'git-upload-pack '"'"'/private/tmp/DLTMP/datalad_temp_test_annex_ssh1x6hktex/remote1'"'"''' failed with exitcode 128
2022-04-05T17:54:00.1394060Z fatal: Could not read from remote repository.
2022-04-05T17:54:00.1394070Z
2022-04-05T17:54:00.1394210Z Please make sure you have the correct access rights
2022-04-05T17:54:00.1394310Z and the repository exists.
2022-04-05T17:54:00.1394610Z git-annex: cannot determine uuid for ssh-remote-1 (perhaps you need to run "git annex sync"?)']
```
yarikoptic commented 2 years ago

having said that:

2022-04-05T18:05:52.7696010Z datalad.runner.exception.CommandError: CommandError: 'ssh -o ControlPath=/private/tmp/DLTMP/datalad_temp_1ei94vxf/Library/Caches/datalad/sockets/a658d1c0 datalad-test 'export "PATH=/usr/lib/git-annex.linux:$PATH"; mkdir -p /private/tmp/DLTMP/datalad_temp_check_exists_interactivenpqq6gao/sibling'' failed with exitcode 1 [err: 'mkdir: cannot create directory ‘/private/tmp’: Permission denied']
jwodder commented 2 years ago

@yarikoptic Regarding the TMPDIR issue, the problem seems to be that Datalad is trying to run an SSH command that runs mkdir -p /private/tmp/DLTMP/datalad_temp_check_exists_interactivenpqq6gao/sibling on the remote host, where /private/tmp is a macOS-specific path, but the SSH container is running Ubuntu.

yarikoptic commented 2 years ago

> @yarikoptic Regarding the TMPDIR issue, the problem seems to be that Datalad is trying to run an SSH command that runs mkdir -p /private/tmp/DLTMP/datalad_temp_check_exists_interactivenpqq6gao/sibling on the remote host, where /private/tmp is a macOS-specific path, but the SSH container is running Ubuntu.

hm, I wondered how it works e.g. in mac tests in appveyor of stock datalad -- oh well, https://github.com/datalad/datalad/blob/master/.appveyor.yml#L274 , that is how

  # we place the "unix" one into the user's HOME to avoid git-annex issues on MacOSX
  # gh-5291
  - sh: mkdir ~/DLTMP
  # and use that scratch space to get short paths in test repos
  # (avoiding length-limits as much as possible)
  - cmd: "set TMP=C:\\DLTMP"
  - cmd: "set TEMP=C:\\DLTMP"
  - sh: export TMPDIR=~/DLTMP

so maybe do the same here for OSX?

jwodder commented 2 years ago

@yarikoptic This PR already sets TMPDIR=/private/tmp/DLTMP. The problem is that DataLad is expecting the TMPDIR in its environment to be a valid TMPDIR in the environment that it's SSHing into.

yarikoptic commented 2 years ago

> @yarikoptic This PR already sets TMPDIR=/private/tmp/DLTMP. The problem is that DataLad is expecting the TMPDIR in its environment to be a valid TMPDIR in the environment that it's SSHing into.

right, that is why, as a workaround, the appveyor setup sets it to a path which should be present in both environments, i.e. ~/DLTMP. In the long(er) run I guess DataLad should first determine the path to use on the remote by running mktemp there. Filed a dedicated https://github.com/datalad/datalad/issues/6622 for that. But since it is unlikely to get into the imminent 0.16.0, let's do a workaround for now?
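
The remote-mktemp idea could look roughly like the sketch below. It is only an illustration, not what DataLad does: `make_remote_tempdir` and its `run` parameter are hypothetical names, and it assumes the remote has a `mktemp` that accepts `-d` with no template (true for GNU coreutils, as in an Ubuntu container).

```python
import subprocess
from typing import Callable


def make_remote_tempdir(host: str, run: Callable = subprocess.run) -> str:
    """Create a temporary directory on `host` over SSH and return its path.

    The path is chosen by the remote side, so it is valid there regardless
    of what TMPDIR is set to locally.
    """
    proc = run(
        ["ssh", host, "mktemp", "-d"],
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output to str
    )
    # mktemp prints the created path followed by a newline.
    return proc.stdout.strip()
```

Calling `make_remote_tempdir("datalad-test")` would return whatever path the container picks, instead of a `/private/tmp/...` path that only exists on the macOS host.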

jwodder commented 2 years ago

@yarikoptic If the workaround you mean is to set TMPDIR to ~/DLTMP, that was tried previously; I suspect I had to change it because the path to the local $HOME does not exist inside the SSH container.

yarikoptic commented 2 years ago

try it exactly as ~/DLTMP instead of using the env var $HOME, which possibly expands into the original path on the host machine!? maybe magic exists and it would work somehow? ;)

jwodder commented 2 years ago

@yarikoptic It appears that magic does not exist.

yarikoptic commented 2 years ago

But it is interesting how it fails right in the fixture here

  File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/__init__.py", line 265, in setup_package
    _, cfg_file = prep_tmphome()
  File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/__init__.py", line 242, in prep_tmphome
    with make_tempfile(mkdir=True) as new_home:
  File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/site-packages/datalad/utils.py", line 1874, in make_tempfile
    True: tempfile.mkdtemp}[mkdir](**tkwargs_)
  File "/Users/runner/hostedtoolcache/Python/3.7.12/x64/lib/python3.7/tempfile.py", line 366, in mkdtemp
    _os.mkdir(file, 0o700)
FileNotFoundError: [Errno 2] No such file or directory: '~/DLTMP/datalad_temp_8nl769bw'

and doesn't fail similarly in stock datalad somehow...
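
One possible explanation (an assumption, since the workflow file is not quoted in this thread): `export TMPDIR=~/DLTMP` in a shell step lets the shell expand `~` before Python ever sees it, whereas a literal `~/DLTMP` that reaches the environment unexpanded is treated by `tempfile` as a relative directory named `~`. A minimal sketch of expanding it explicitly, where `mkdtemp_under` is a hypothetical helper rather than DataLad code:

```python
import os
import tempfile


def mkdtemp_under(base: str) -> str:
    """Create a temporary directory below `base`, expanding a leading `~`."""
    # tempfile does not expand "~"; os.path.expanduser does.
    expanded = os.path.expanduser(base)
    # Make sure the base directory exists before asking for a child of it.
    os.makedirs(expanded, exist_ok=True)
    return tempfile.mkdtemp(prefix="datalad_temp_", dir=expanded)
```

With this, `mkdtemp_under("~/DLTMP")` creates the directory under the current user's home instead of raising `FileNotFoundError`.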

jwodder commented 2 years ago

Blocked by https://github.com/datalad/datalad/issues/6622