erlang / rebar3

Erlang build tool that makes it easy to compile and test Erlang applications and releases.
http://www.rebar3.org
Apache License 2.0
1.69k stars 516 forks source link

Compile command causes nesting in _build dir, while copying project directories. #2492

Open dannmartens opened 3 years ago

dannmartens commented 3 years ago

Background

An application running with rebar3 shell failed to find the files it needed from the /priv directory.

Environment

-----------------
Operating System: x86_64-pc-linux-gnu
ERTS: Erlang/OTP 23 [erts-11.1.7] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:1]
Root Directory: /usr/lib/erlang
Library directory: /usr/lib/erlang/lib
-----------------
Loaded Applications:
bbmustache: 1.10.0
certifi: 2.5.2
cf: 0.3.1
common_test: 1.19.1
compiler: 7.6.6
crypto: 4.8.3
cth_readable: 1.4.9
dialyzer: 4.3
edoc: 0.12
erlware_commons: 1.3.1
eunit: 2.6
eunit_formatters: 0.5.0
getopt: 1.0.1
hipe: 4.0.1
inets: 7.3.1
kernel: 7.2
providers: 1.8.1
public_key: 1.9.2
relx: 4.2.0
sasl: 4.0.1
snmp: 5.7.2
ssl_verify_fun: 1.1.6
stdlib: 3.14
syntax_tools: 2.4
tools: 3.4.2

-----------------
Escript path: /usr/bin/rebar3
Providers:
  app_discovery as clean compile compile cover ct deps dialyzer do edoc escriptize eunit get-deps help install install_deps list lock new path pkgs release relup report repos shell 
state tar tree unlock update upgrade upgrade upgrade version xref

rebar 3.14.3 on Erlang/OTP 23 Erts 11.1.7

Current behaviour

The compile command nests some directories of the project layout in the _build directory:

image

Expected behaviour

The compile command should just copy the contents of those directories.

Investigation

While investigating this issue, I came across https://github.com/erlang/rebar3/issues/1934

That issue linked to the source code, where I found a link to the code responsible for copying: image

The cp process step does not appear to use the required arguments to use the contents of the source dir only, when copying, as described here: How can I copy the contents of a folder to another folder in a different directory using terminal?

If I am not mistaken, right now, it does (simplified):

cp -Rp /src /_build/src

which results in:

/_build/src/src

when it probably should be:

cp -Rp /src/. /_build/src

I am working in a virtualized environment, with mapped folders, a set-up which has been considered before as a possible cause for this type of issue, but I can reproduce this on both the mapped folder, as well as on the native Linux filesystem.

dannmartens commented 3 years ago

I ran a test on an Ubuntu 20.04 VM, where I intercepted cp calls from rebar3, in order to add "/." to the source path and that seems to fix it.

However, as there are different cp implementations out there, and as they behave differently in this case, a patch might break things for other users.

image

From MacOS vs Linux — the cp command will trip you up!

dannmartens commented 3 years ago

It would appear that the code in this cp_r function will always malfunction with directories on Linux, as it ensures that the directory exists before performing the copy. In that case, you always will end up with directories nested as destination/source.

From src/rebar_file_utils.erl:

    case os:type() of
        {unix, Os} ->
            EscSources = [rebar_utils:escape_chars(Src) || Src <- Sources],
            SourceStr = rebar_string:join(EscSources, " "),
            % On darwin the following cp command will cp everything inside
            % target vs target and everything inside, so we chop the last char
            % off if it is a '/'
            Source = case {Os == darwin, lists:last(SourceStr) == $/} of
                {true, true} ->
                    rebar_string:trim(SourceStr, trailing, "/");
                {true, false} ->
                    SourceStr;
                {false, _} ->
                    SourceStr
            end,
            % ensure destination exists before copying files into it
            {ok, []} = rebar_utils:sh(?FMT("mkdir -p ~ts",
                           [rebar_utils:escape_chars(Dest)]),
                      [{use_stdout, false}, abort_on_error]),
            {ok, []} = rebar_utils:sh(?FMT("cp -Rp ~ts \"~ts\"",
                                           [Source, rebar_utils:escape_double_quotes(Dest)]),
                                      [{use_stdout, true}, abort_on_error]),
            ok;
tsloughter commented 3 years ago

Not sure how that is possible, I run rebar3 only on linux and have not seen this issue (not to mention the tests that run and so many other users).

Are you configuring anything about directories or is this what you are seeing by default when running a compile?

dannmartens commented 3 years ago

Well, I am puzzled as to how this could be something isolated to my setup.

It's a very simple project, without any directory-related configuration.

This is what "rebar3 compile" sends to the os, when I intercept the calls:

cp -Rp /vagrant/priv /vagrant/_build/default/lib/myapp/priv
cp -Rp /vagrant/include /vagrant/_build/default/lib/myapp/include
cp -Rp /vagrant/src /vagrant/_build/default/lib/myapp/src

These commands would result in destination/source directory nesting, as documented for GNU cp behaviour, if the destination directory already exists. And rebar3 ensures it does.

There is no black magic, here.

tsloughter commented 3 years ago

Oh, hm, you are running in a vagrant VM? I don't know why or if that is the issue, but that could be related.

dannmartens commented 3 years ago

The code only has an exception for OSX as "darwin."

It is clear what rebar3 calls for, and that is not the correct command to copy the contents of a directory into another directory, for GNU cp.

tsloughter commented 3 years ago

Are directories like /vagrant/src links themselves? Or anything else "different' about them than being regular directories.

dannmartens commented 3 years ago

While I agree that any exoticism of an underlying file system can introduce weird behaviour, I see nothing irregular here.

The copy command rebar3 constructs simply cannot work on a system with GNU cp.

It can only work if the source path gets appended properly and I don't see that happening anywhere in the code.

If the intent is to copy the contents of src into dst: cp -Rp src dst does not work, cp -Rp src/ dst does not work, cp -Rp src/* dst works, cp -Rp src/. dst works,

Which src path do you see contructed on your setups?

On any system with GNU cp, anywhere on the file system where you have access, try:

cd /tmp
mkdir priv
touch priv/file
mkdir -p _build/priv
cp -Rp priv _build/priv

And you end up with _build/priv/priv

This is not the case with BSD cp, which does not work in the same way; even some options are different.

tsloughter commented 3 years ago

But is /vagrant/src a link?

I'd also forgotten that it should be doing symlinks and not copying. So it could be that copying is broken but have never noticed because on my systems, and most, a symlink is created for _build/default/lib/<app>/priv and not a copy.

dannmartens commented 3 years ago

No, it's not a link. It's a directory open to any and all.

There is definitely a clue in that cp gets called, if, as you say, it would have tried to create a symlink first!

dannmartens commented 3 years ago

I just checked that and that fails:

ln: failed to create symbolic link 'test/priv': Operation not supported

I guess that's why the cp is attempted afterwards.

Let me see if I can find a proper patch for this situation.

tsloughter commented 3 years ago

Ok, cool, so yea now I understand, this is a bug that hasn't shown itself before because linux systems usually use links and your is for some reason falling back on copying. So two issues, the bug in how our copying function works under linux and why links are failing.

dannmartens commented 3 years ago

Yes, the underlying host os only allows symlinks on its file system when the guest runs with elevated privileges, which is not the case.

I will send a pull request with a proposal to mend rebar_file_utils:cp_r.

I must admit, I've been looking at that module all day, and I am not quite sure what it is trying to do: it takes in a list of source paths, but only looks at the last one to remove a trailing slash, if it is on OSX.

Perhaps it would be better to take a look at each source path separately, check if it is a directory and append or trim according to what the os needs.

Let me see what I can do.

ferd commented 3 years ago

The cp utility works differently on OSX and the behaviour was shown to work when these operations were taking place. Windows is another thing. The entire set of filesystem operations are extremely annoying to get working right, and yeah, the fact that the symlink failed hid failures way longer than it otherwise would have.

dannmartens commented 3 years ago

Not an issue, I will look into it and will try to find the best solution to amend this.

Pull request coming for you to evaluate.

dannmartens commented 3 years ago

So far, I've managed to make all the tests pass, except one, which crashes completely:

%%% rebar_escriptize_SUITE: ...
%%% rebar_escriptize_SUITE ==> escriptize_with_ebin_subdir: FAILED
%%% rebar_escriptize_SUITE ==> {thrown,rebar_abort}

Since some calls to cp_r exploit this directory nesting as a side effect, I still haven't cleared all the hurdles.