ninenines / erlang.mk

A build tool for Erlang that just works.
https://erlang.mk
ISC License

UTF-8 issues for .app.src with sed on MacOS #929

Closed peffis closed 1 year ago

peffis commented 3 years ago

If a dependency's .app.src contains UTF-8 characters, as for instance in this repo: https://github.com/Nordix/eredis/blob/1a9562b9e9874828d266dde0a53ba8bb278d15d9/src/eredis.app.src#L8, the build will fail for me on macOS.

A simple test case is to bootstrap a simple application with erlang.mk (also bootstrap-rel) and then add the one and only dependency:

DEPS = eredis
dep_eredis = git https://github.com/Nordix/eredis master

First, when building the dependency one will see a sed error:

 ERLC   basho_bench_driver_eredis.erl eredis.erl eredis_client.erl eredis_parser.erl eredis_sub.erl eredis_sub_client.erl
 APP    eredis.app.src
sed: RE error: illegal byte sequence

So sed on macOS is not keen on the UTF-8 characters.

Then, at the end, when building the release, it will say

===> Failed to solve release:
 Dependency eredis is specified as a dependency but is not reachable by the system.

This is because the deps/eredis/ebin/eredis.app will be corrupted, like this:

 {application,eredis,
             [{description,"Erlang Redis Client"},
              {vsn,"1.3.3"},
              {modules, ['basho_bench_driver_eredis','eredis','eredis_client','eredis_parser','eredis_sub','eredis_sub_client']},
              {registered,[]},
              {applications,[kernel,stdlib]},

(the file will end where the utf-8 characters were)

As I understand it, this seems to be related to sed on macOS, possibly in combination with the setting LC_CTYPE=UTF-8, so feel free to simply close this issue. I wanted to report it anyway, as it might explain other issues people have when they simply report the "...is specified as a dependency but is not reachable by the system" message from relx. The workaround in this case is to remove the UTF-8 characters from the maintainers list. Perhaps there is another workaround using some other setting of LC_CTYPE and friends, but I have not figured that out yet.
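
The locale idea mentioned above can be tried on a single sed call without touching the rest of the build. This is a minimal sketch, assuming the failure comes from sed decoding its input as characters under a UTF-8 locale: setting LC_ALL=C for that one command makes sed treat the input as raw bytes. `sed_bytes` is a hypothetical helper name, not something erlang.mk provides, and the demo input is made up for illustration.

```shell
# Hypothetical helper: run sed under the C locale for this one command only,
# so the input is handled as raw bytes instead of being decoded as characters.
sed_bytes() {
    LC_ALL=C sed "$@"
}

# Made-up demo input containing the single byte 0xF6 (octal \366),
# which is not a valid UTF-8 sequence on its own.
# The substitution is applied and the 0xF6 byte passes through unchanged;
# od -c prints the resulting bytes for inspection.
printf 'Bj\366rn\n' | sed_bytes 's/rn/RN/' | od -c
```

LC_ALL takes precedence over LC_CTYPE, so the prefix applies regardless of what is exported in the terminal. Whether this is sufficient for the sed call inside erlang.mk is not confirmed here.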

essen commented 3 years ago

Good catch. I suppose you've tried with LC_CTYPE=C?

peffis commented 3 years ago

Yes, some quick attempts, both setting it in the terminal and also adding it before sed command in erlang.mk, but it did not seem to help for me (although it could be simply because of having some stale data there from a previous build - so I will play around a bit more with this).

peffis commented 3 years ago

I suppose another workaround is to install GNU sed on the Mac and have it replace the default sed.

essen commented 3 years ago

Yes but I would like this to work with the default sed as well (if possible). I can check during the week. cc @lhoguin

peffis commented 3 years ago

Adding "export LC_CTYPE = C" at the beginning of the Makefile (the one that includes erlang.mk) removes the sed error when building the dep, and deps/eredis/ebin/eredis.app then looks complete:

{application,eredis,
             [{description,"Erlang Redis Client"},
              {vsn,"1.3.3"},
              {modules, ['basho_bench_driver_eredis','eredis','eredis_client','eredis_parser','eredis_sub','eredis_sub_client']},
              {registered,[]},
              {applications,[kernel,stdlib]},
              {maintainers,["Bj<F6>rn Svensson","Viktor S<F6>derqvist"]},
              {licenses,["MIT"]}]}.

, but strangely I still get the

 ===> Failed to solve release:
 Dependency eredis is specified as a dependency but is not reachable by the system.

at the build of the release. So perhaps it is not only a sed issue, but perhaps also a relx issue? Doing the exact same build with the (in this case Swedish) characters removed will make the build succeed completely - no sed error and no relx error.

essen commented 3 years ago

Well, <F6> is not valid UTF-8, so that's probably why it fails later. So LC_CTYPE=C is not an appropriate solution. I wonder if we can make sed work on binary data rather than text to avoid those issues.

peffis commented 3 years ago

The <F6> is due to some conversion that happened when I copied it (from a "less" session in the terminal). The UTF-8 characters are the same as in the .app.src.

{application,eredis,
             [{description,"Erlang Redis Client"},
              {vsn,"1.3.3"},
              {modules, ['basho_bench_driver_eredis','eredis','eredis_client','eredis_parser','eredis_sub','eredis_sub_client']},
              {registered,[]},
              {applications,[kernel,stdlib]},
              {maintainers,["Björn Svensson","Viktor Söderqvist"]},
              {licenses,["MIT"]}]}.

, but even if this looks ok, the "make rel" will fail
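
Terminal tools can hide which encoding a file actually uses, so looking at the raw bytes is a quick way to settle it. This is a small sketch, assuming a POSIX shell with od and tr available: in UTF-8, "ö" is the two bytes c3 b6, while a lone f6 is the Latin-1 encoding of the same letter and is not valid UTF-8. `hex_bytes` is a hypothetical helper name, and the sample strings are made up for illustration.

```shell
# Hypothetical helper: print the bytes read from stdin as one lowercase hex string.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# "ö" encoded as UTF-8 is the two bytes 0xC3 0xB6 (octal \303\266).
printf 'Bj\303\266rn' | hex_bytes; echo   # prints 426ac3b6726e

# The same letter in Latin-1 is the single byte 0xF6 (octal \366).
printf 'Bj\366rn' | hex_bytes; echo       # prints 426af6726e
```

Piping only the relevant line through the helper, for example `grep -a maintainers deps/eredis/src/eredis.app.src | hex_bytes`, would show which of the two forms the file really contains.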

essen commented 3 years ago

Really? What does diff eredis.app eredis.app.src say? If it's proper UTF-8 then the only difference should be the modules.

peffis commented 3 years ago

$ diff deps/eredis/ebin/eredis.app  deps/eredis/src/eredis.app.src
4c4
<               {modules, ['basho_bench_driver_eredis','eredis','eredis_client','eredis_parser','eredis_sub','eredis_sub_client']},
---
>               {modules,[]},

, so yes, it's only the modules.

essen commented 3 years ago

Try with V=1; I think it'll enable debug output for relx as well. I'm off.

peffis commented 3 years ago

Yes, with

V=1 make rel

I see relx first, at one point, saying:

===> Unable to load the application metadata from /Users/stefan/projects/my_app/deps/eredis

...and then, towards the end, it says:

===> Solving Release my_app_release-1
===> Provider (resolve_release) failed with: {error,
                                                     {rlx_prv_release,
                                                      {failed_solve,
                                                       {unreachable_package,
                                                        eredis}}}}
===> Failed to solve release:
 Dependency eredis is specified as a dependency but is not reachable by the system.

So it seems these characters in the .app.src also prevent relx from loading the "application metadata" (however it does that). In other words, UTF-8 characters in .app.src seem to cause problems in 1) the call to sed that erlang.mk makes when building the dep and 2) the method relx uses to load the "application metadata".

essen commented 1 year ago

Related: https://github.com/Nordix/eredis/issues/39

Considering eredis no longer has an issue (confirmed via make check hp=eredis) I will be closing this and we can look at it when a new package breaks. Thanks!