mesonbuild / meson

The Meson Build System
http://mesonbuild.com
Apache License 2.0
5.49k stars 1.59k forks source link

custom_target replaces back-slashes with slashes in the command arguments #1564

Open keszybz opened 7 years ago

keszybz commented 7 years ago

Not sure about the details, but at least \t becomes /t and \\ becomes \:

project('meson3', 'c')

printf = find_program('printf')
custom_target('foo',
          output : 'foo',
          command : [printf, 'a\n\tb\nccc\n'],
          capture : true)

custom_target('foo2',
          output : 'foo2',
          command : [printf, 'a\\n\\tb\\nccc\n'],
          capture : true)
$ ninja-build && tail foo foo2
==> foo <==
a
/tb
ccc

==> foo2 <==
a
/tb
ccc

This makes it quite hard to write regexps and printf lines and such.

1512 is probably related.

keszybz commented 7 years ago

... or maybe meson should do a single layer of unescaping by itself, like python, so that it's possible to embed newlines and tabs in strings. Meaning that if I want to pass "\t\n\" to the program, I should write "\\t\\n\\".

QuLogic commented 7 years ago

See #737.

jpakkane commented 7 years ago

As the link indicates this is a thorny problem because we need to support Windows natively. Sadly there does not seem to be a way to pass all possible characters through both shell and cmd.exe (and other backends) that would work reliably.

The recommended way is to instead write your pipelines in standalone scripts and running those from Meson. Granted this is some amount of work beforehand but it is beneficial in several ways:

nirbheek commented 7 years ago

Honestly the piece of code that does that is wrong and we should remove it, but we just need to figure out why GStreamer needs it on Windows and fix whatever is broken in that regard.

Some of that will be on the GStreamer end, but I suspect some of that is bugs in Meson too.

jpakkane commented 7 years ago

To clarify: yes, we should pass all special characters through all shells and systems properly. Not doing that is a bug. However in practice this is is a really thorny problem and it is unlikely to ever work 100% reliably. Standalone scripts that are just invoked are always reliable, can be tested in isolation and are easy to convert to a different language (e.g. from Awk to Python or what have you) should the need arise.

johandc commented 7 years ago

This is also a problem when using the redirect symbol on Unix >.

Here is an example: custom_target('lss', output: lss_name, command: [find_program('arm-none-eabi-objdump'), '-h', '-S', exec.full_path(), '>', lss_name], depends: exec, build_by_default: true)

This will call arm-none-eabi-gcc with quotes around the redirect symbol '>'.

/usr/bin/arm-none-eabi-objdump -h -S /xx/yy/zz '>' zz.lss

jpakkane commented 7 years ago

You should use the capture keyword instead.

jpakkane commented 7 years ago

Also, don't do exec.full_path() and depends : exec. Just put the exec object in the command list and Meson will take care of everything for you.

nirbheek commented 6 years ago

I think we should try removing that snippet and see what breaks. It will only get harder to remove it as time goes on.

nirbheek commented 6 years ago

I'll get to the bottom of why this was needed in gstreamer this week. FTR, the first step is setting up cerbero which is a massive yak-shave unless you've done it before.

rulatir commented 4 years ago

The recommended way is to instead write your pipelines in standalone scripts and running those from Meson.

So, I already knew Meson required wrappers to be written:

Now this list is extended with:

This is outright ridiculous. I think Meson should come with a BIG BOLD CAPITALIZED FLASHING SIDE SCROLLING WARNING that using it requires writing TONS of code, and it can only possibly make sense for your project if the amount of code you WILL have to write (and debug, and test) in order to accommodate all the quirks and deficiencies and opinionatednesses of Meson is less than say 15% of the actual codebase you want to build with it.

Jehan commented 2 years ago

Is there any news on this? I encountered this issue today trying to fix some rule in GIMP build system and that was a pretty bad surprise. I also agree with everyone else that meson should not play with arguments, transforming them or whatnot. Or at least there should be a way to inform meson that a given string is not a path and should not be transformed.

I don't know, maybe like:

command: [ SED, notfile('s/\\(some\\)/\\1 regexp/'), '/this/is/a/path' ]

… or something! (don't take my notfile() naming proposition as anything else than a demo; name it whatever you want; naming is hard 😜)

Then when a command sees a notfile object, it doesn't try to transform it and uses the contained string as-is in the command.

Of course I could propose the opposite way, i.e. that only using the explicit already-existing file object (e.g. as returned by files() function) should be path-transformed, but I can imagine that there are build scripts out there which already take advantage of this string transformation you are doing right now, without ever realizing it (and they would break by such change). That's why, sure, if you want, make it the default to consider that any command line argument is possibly a file path, but then give us a way to also pass non-path strings.

Because adding yet-another external script is really not a sane solution. Our build system will end up containing more small-step build scripts than source files! 😆

Jehan commented 2 years ago

P.S.: by the way, I don't believe what I propose above would have been the ideal implementation. The ideal implementation was the other way around: what is a file vs what is not should have been explicit. So for instance, if we want to benefit from path separator transformation, we should have called (using again my example):

command: [ SED, 's/\\(some\\)/\\1 regexp/', files('/this/is/a/path') ]

And what's not explicitly marked should then be used as-is (because it's not the build-tool's role to make assumption on what random strings are supposed to be, without any kind of semantic tagging).

But since I completely understand that, now that it's already working the other way around, changing it the right way could cause a lot of backward compatibility problems (a.k.a. various builds which used to work would fail on some OSes, e.g. Windows, after a meson update and would need to be fixed with explicit files() marking), this is why I propose a fix the other way around (with a notfile() or whatever you want to call it). I don't find it ideal, just the best way out of it without breaking existing builds relying on current behavior.

eli-schwartz commented 2 years ago

Really, we already do know what's a file and what isn't. The problem is explained in more detail in https://github.com/mesonbuild/meson/pull/4393#issuecomment-431514921

Incidentally, I'm not sure how anyone could be relying on this but if so I think they are pretty broken anyway. It should only occur at all for manually specified Windows-style paths embedded in custom_target commands that don't specify those paths as input: or otherwise specify a files object.

Jehan commented 2 years ago

Incidentally, I'm not sure how anyone could be relying on this but if so I think they are pretty broken anyway.

Sure I completely agree. I am more in the school of developers who prefer to be type-explicit. But since meson already does this automatic string transformation, I assumed this was originally because the original implementer thought it'd be great if meson could fix paths (assuming everything is a path so far) for you. So it's obvious that some build systems are relying on this behavior without even knowing (i.e. just because it didn't break on several OS with different path separators so they didn't pay attention).

But if you want to oversee this, sure. Meson should not have tried to transform strings from the start, when it doesn't know what it is supposed to represent (a file path or something else). So this should/could be considered a bug. I was just careful in case the reason why you were not fixing this yet was to avoid backward compatibility.

The problem is explained in more detail in https://github.com/mesonbuild/meson/pull/4393#issuecomment-431514921

Hmmm… I read this comment and I didn't understand. @nirbheek says:

The basic problem is that instead of using PurePath.as_posix() consistently everywhere, we're using os.path.join

But since we already agreed that the problem (at least the main one, there may be others happening on path arguments, but that would be for other bug reports) is that meson seems to consider non-path arguments as paths, why would you use PurePath.as_posix() either? Just use nothing. Just don't transform the non-file arguments. If I write an argument in my command line argument list, I want it to be used as-is in my command line.

Then do whichever path conversion fits on the file objects or on arguments which are taken from the input: argument. If this part is bugged as well, it can be fixed too; but it's another problem from the one described here.

eli-schwartz commented 2 years ago

meson seems to consider non-path arguments as paths, why would you use PurePath.as_posix() either? Just use nothing. Just don't transform the non-file arguments. If I write an argument in my command line argument list, I want it to be used as-is in my command line.

My assumption is because the "quick hack" to work around os.path.join is where at that point, the command line has been reduced to a contextless string and the quickest way to fix gst-build while breaking all other software was to do a final find/replace in a single place on the entire command while writing it out to build.ninja, because clearly that's a much better solution than handling each "explicitly a path argument" case.

Again, the problem here really just is "Meson is doing something it itself acknowledges as both technically incorrect and a cause of bugs".

Sadly, we cannot fix it, because if we do then gst-build has unspecified bugs where the only reproducer is, purportedly,

Unfortunately we have not been able to come up with an isolated test case for this so unless you manage to come up with one, the only way is to test the building with Gst's setup.

(I don't understand why gst-build is special here, but there you have it.)

nirbheek commented 2 years ago

Sadly, we cannot fix it, because if we do then gst-build has unspecified bugs where the only reproducer is, purportedly,

Honestly, I have no problems fixing gst-build (now gstreamer monorepo) once we figure out what's happening here. This only really breaks on Windows, and building directly with meson is something only developers are expected to do on Windows, so we can fix it in the main branch + last stable branch and be done with it. We don't care about release tarballs.

eli-schwartz commented 2 years ago

Isn't "once we figure out what's happening" the big issue though? :p

xclaesse commented 2 years ago

I think we should remove that hack from meson and wait to see what are the fallouts.

nirbheek commented 2 years ago

Isn't "once we figure out what's happening" the big issue though? :p

What I mean is that I am not too concerned about gstreamer-specific breakage because of whatever we do here.

I think we should remove that hack from meson and wait to see what are the fallouts.

I will test gstreamer on Windows with the hack removed.

nirbheek commented 2 years ago

I am working on a PR for this that replaces all os.path.join usage while writing out the ninja file to Path.as_posix().

wzssyqa commented 1 year ago

If I really need this, is there a workaround?

tmccombs commented 2 months ago

Could there at least be an option to custom_target to tell it whether it should replace backslashes with forward slashes or not?

I have a meson configuration where I use a foreach to generate several files from a template using m4, with variables set to different values (and the values are pretty small). In one case I need a backslash in one of the variables.

Using a wrapper script doesn't really solve the problem, because the backslash is part of the input. I suppose I could make a script the handles the full loop, but then the targets are less granular.

declantsien commented 2 months ago

https://sr.ht/~lattis/muon/ doesn't have this issue. Which is a meson implementation in C.

theoparis commented 2 months ago

https://sr.ht/~lattis/muon/ doesn't have this issue. Which is a meson implementation in C.

Muon also doesn't support features needed to compile major projects such as GTK.

lf- commented 6 days ago

If meson is going to corrupt the input to custom_target, can this gotcha at least be documented in the manual?