Closed dmzimmerman closed 1 year ago
same here, any news?
dune.3.7.0 shows the same crash.
log.txt I renamed _build/log
so as the upload here works.
Just an update: I will try to take a look (probably won't happen before next week -- I need to procure an M1 mac).
I tried, but could not reproduce in a M2 Macbook Air with OCaml 4.14.1 and Dune 3.7.0. Unfortunately, it is impossible to investigate with a reproducible case.
By the way, there is another bug report involving M1 macs and the file watching mechanism in #6151. In principle they do not seem related, but I thought I would mention it just in case.
If you can set up remote access to a machine where the issue can be reproduced, then I can try to investigate, otherwise I don't have any other ideas to propose, sorry!
By the way, there is another bug report involving M1 macs and the file watching mechanism in https://github.com/ocaml/dune/issues/6151. In principle they do not seem related, but I thought I would mention it just in case.
It's not related. The bug appears on every mac and I can readily reproduce it. In fact, you've pointed out the bug to me in a private slack chat before and exactly what was causing it :)
If you can set up remote access
send me a ssh key to work@mro.name. At my office I am behind a NAT, so we have to schedule a time slot and I will be in a sensible network. But it will be slow. 12MBit downstream.
If you can set up remote access
send me a ssh key to work@mro.name. At my office I am behind a NAT, so we have to schedule a time slot and I will be in a sensible network. But it will be slow. 12MBit downstream.
Done, thanks. Let's see if we can coordinate and figure out what is going on...
By the way, there is another bug report involving M1 macs and the file watching mechanism in #6151. In principle they do not seem related, but I thought I would mention it just in case.
It's not related. The bug appears on every mac and I can readily reproduce it. In fact, you've pointed out the bug to me in a private slack chat before and exactly what was causing it :)
Right, thanks for the reminder :)
@nojb quite typical the crash is shy and doesn't show in a fresh account.
Also it seems to go away if I purify the messy PATH in my everyday account from
$ echo $PATH /Users/mro/.opam/4.12.1/bin:/Users/mro/bin:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin:/Applications/Little Snitch.app/Contents/Components:/usr/local/go/bin:/usr/local/MacGPG2/bin:/opt/homebrew/opt/fzf/bin
to
$ echo $PATH /Users/mro/.opam/4.12.1/bin:/bin/:/usr/bin:/usr/local/bin/
If nothing jumps at you, give me some days of fiddling with the PATH to narrow it down or have a workaround.
Debugging via ssh access IMO doesn't make sense right now.
If nothing jumps at you, give me some days of fiddling with the PATH to narrow it down or have a workaround.
Nothing specific, but the dune watching mechanism will "watch" every directory in PATH
(since changes in these directories may affect binaries which are used in the Dune rules), so it sounds plausible that there may be a link here.
Debugging via ssh access IMO doesn't make sense right now.
OK! We can revisit that possibility in the future as needed. Thanks.
@nojb quite typical the crash is shy and doesn't show in a fresh account.
Also it seems to go away if I purify the messy PATH in my everyday account from
$ echo $PATH /Users/mro/.opam/4.12.1/bin:/Users/mro/bin:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin:/Applications/Little Snitch.app/Contents/Components:/usr/local/go/bin:/usr/local/MacGPG2/bin:/opt/homebrew/opt/fzf/bin
to
$ echo $PATH /Users/mro/.opam/4.12.1/bin:/bin/:/usr/bin:/usr/local/bin/
If nothing jumps at you, give me some days of fiddling with the PATH to narrow it down or have a workaround.
Debugging via ssh access IMO doesn't make sense right now.
What is the path of the project that you're building with dune?
fish is my default shell. $ dune runtest --watch
comes up, but the first edit crashes dune.
But if I launch a new terminal, start a bash
, cd into my project, do eval $(opam env)
the $ dune runtest --watch
runs stable.
I can confirm that simplifying my $PATH
in the same way as @mro also makes the crash go away for me. The path to the project that I'm building is outside all of the directories in my $PATH
regardless of that simplification (it's just a folder a few levels deep in my home directory).
@dmzimmerman do you have any paths in $PATH
that don't exist?
I did... I'm also embarrassed to say that .
was somehow in my $PATH
(I have no idea what startup script put that in there, but it will be fixed momentarily). Removing .
fixed it; removing the paths in $PATH
that don't exist actually had no effect (I even tried putting one back after having removed .
, and it didn't cause the crash again).
not after cleaning, but, look at what opam env
brings:
$ opam env
set -gx OPAM_SWITCH_PREFIX '/Users/mro/.opam/4.12.1';
set -gx CAML_LD_LIBRARY_PATH '/Users/mro/.opam/4.12.1/lib/stublibs:/Users/mro/.opam/4.12.1/lib/ocaml/stublibs:/Users/mro/.opam/4.12.1/lib/ocaml';
set -gx OCAML_TOPLEVEL_PATH '/Users/mro/.opam/4.12.1/lib/toplevel';
set -gx PKG_CONFIG_PATH '/Users/mro/.opam/4.12.1/lib/pkgconfig';
set -gx PATH ':/Users/mro/.opam/4.12.1/bin' '/opt/homebrew/bin' '/bin' '/sbin' '/usr/bin' '/usr/sbin' '/usr/local/bin' '/usr/local/MacGPG2/bin' '/Users/mro/bin';
the leading colon in $PATH
sets the .
, obviously.
After setting $PATH
without .
- 🎉 No Crash!
$ echo $SHELL
/opt/homebrew/bin/fish
$ dune --version
3.7.0
$ opam --version
2.0.7
$ opam switch
# switch compiler description
4.10.2 ocaml-base-compiler.4.10.2 4.10.2
→ 4.12.1 ocaml-base-compiler.4.12.1 4.12.1
4.14.0 ocaml-base-compiler.4.14.0 4.14.0
5.0.0 ocaml-base-compiler.5.0.0 5.0.0
$ uname -mprsv
Darwin 20.6.0 Darwin Kernel Version 20.6.0: Fri Dec 16 00:34:59 PST 2022; root:xnu-7195.141.49~1/RELEASE_ARM64_T8101 arm64 arm
a workaround:
$ eval (opam env | sed -e "s/':/'/g;s/'\.'//g")
It is probably a good idea to robustify Dune by eg only watching absolute directories in $PATH
.
Should we close this?
I can add a patch that checks for dirs in PATH being absolute and then close.
It is probably a good idea to robustify Dune by eg only watching absolute directories in
$PATH
.
OK, sounds good!
Well not so sure anymore. Here is what we are doing:
let parse_path ?(sep = path_sep) s =
String.split s ~on:sep
|> List.filter_map ~f:(function
| "" -> None
| p -> Some (Path.of_filename_relative_to_initial_cwd p))
The assumption is that if there is a relative path in PATH then we treat it relative to the current working directory. Is this sane behaviour or should we safely enforce absolute paths?
I would like to understand why we have a crash if the path does not exist.
Is this sane behaviour or should we safely enforce absolute paths?
Independently of the present issue, I don't think treating it relative to the current working directory is a good idea; it makes the "watched set" of directories depend on where you are when you launch Dune. So I would filter out relative paths from PATH
. This is a heuristic of course, but I think it will be correct most of the time.
I would like to understand why we have a crash if the path does not exist.
Indeed, this sounds like it should be handled in the Fsevents bindings. As a data point, in the Windows file watching backend, the existence of the path is checked, and if it doesn't exist it is ignored. (To be clear, I haven't checked if the FSEvents bindings do this check or not, I am just guessing.) cc @rgrinberg
I've made a patch to ignore relative paths but I don't think we should close this issue until we sort out the possible issue with Fsevents.
I don't think it's a good idea to ignore relative paths in PATH. If they are interpreted by the rest of the system and not dune, that makes dune inconsistent and poorly behaved.
Do you mind sending a test for this issue first? I think it's just a matter of including CWD in PATH and trying to build.
@rgrinberg I am not able to reproduce on Linux. I did something along the lines of our watching tests:
Building with relative paths in the PATH env var.
$ . ./helpers.sh
$ cat > dune-project << EOF
> (lang dune 3.8)
> EOF
$ touch foo.opam
$ cat > a.ml << EOF
> let () = print_endline "Hello, world!"
> EOF
$ cat > new-contents << EOF
>
> let () = print_endline "Goodbye, world!"
> EOF
$ cat > dune << EOF
> (library
> (name a)
> (public_name foo.a))
> EOF
$ export PATH=./doesnt-exist/:$PATH
$ start_dune
$ build foo.install
Success
$ cat new-contents > b.ml
$ cat > a.ml << EOF
> include B
>
> let () = print_endline "Hello, world!"
> EOF
$ build foo.install
Success
$ with_timeout dune shutdown
$ cat .#dune-output
Success, waiting for filesystem changes...
Success, waiting for filesystem changes...
$ dune exec -- ./a.exe
If I understood the issue correctly, the second build should fail.
@dmzimmerman If you are able to make a cram test that you can reproduce on apple silicon that would be much appreciated. It would help us understand what the real issue is here.
@Alizter in my case it was unrelated to apple silicon but rather to the leading colon in the fish $PATH
with my 3-years old opam version.
So IMO close
@mro I see. The leading colon in the PATH variable is still something we ought to be resilient against however.
I'm not exactly sure how to make a cram test for this, because it requires environment variables to be set in a specific way, so it'd be more like a shell script... but boiling it down to its essence, if I run, in some directory:
git clone https://github.com/OCamlPro/ocaml-solidity
cd ocaml-solidity
export PATH=/opt/homebrew/bin:/usr/bin:.
eval $(opam env)
dune build --watch
the crash happens every time. This is with OPAM installed via Homebrew, and my default switch (4.14.1), but it doesn't matter which switch I use, it still crashes. The choice of ocaml-solidity
isn't anything special, it just happens to be an OCaml project that I'm working with that has a dune configuration; it happens with every other project I've tried.
Expected Behavior
I expected it to watch the filesystem and rebuild as necessary, without crashes.
Actual Behavior
It crashes every time the filesystem changes.
Reproduction
Start with any Dune project (I happened to use https://github.com/OCamlPro/ocaml-solidity). Open two terminals and change working directory in both to the root directory of the project. Run
dune build
, thendune build --watch
in one (the reason for the initialbuild
run is that if the project needs building,dune build --watch
crashes immediately with the same error as you'll see on a filesystem change). In the other, change the filesystem - either by editing one of the files, or even something liketouch foo
to create a new file that Dune wouldn't even need to build. It should crash immediately with the messageInternal error, please report upstream including the contents of _build/log
and a stack trace (which you can see in the verbose gist below) that seems to indicate anas_outside_build_dir_exn
being thrown.Specifications
dune
(output ofdune --version
): 3.6.1ocaml
(output ofocamlc --version
): 4.14.1Additional information
I replicated this on completely clean installs of OCaml 4.14.1 and Dune via OPAM (i.e.,
rm -rf ~/.opam; opam init; opam switch create 4.14.1
followed byopam
installation of Dune and the required libraries for the project I was building, on both a MacBook Pro/M1 Pro/32GB RAM and a Mac Studio/M1 Ultra/128GB RAM. Colleagues of mine who have tried the same on 16GB M1 or M1 Pro machines do not seem to be experiencing this issue, but I don't know if that's just a coincidence or something meaningful.dune
with the--verbose
flag): https://gist.github.com/dmzimmerman/89f37f639c1fe4ce7e5ee00dc6e734dd_build/log
: https://gist.github.com/dmzimmerman/0c533e90be627946637746b321874672