mesonbuild / meson

The Meson Build System
http://mesonbuild.com
Apache License 2.0

Design for generator improvements #6526

Open jpakkane opened 4 years ago

jpakkane commented 4 years ago

This is basically a fleshed out #3342. Generators seem to not fulfill requirements that people have for them, so let's see if we can make them better. One possible way of making them better is that if you do this:

g = generator(...)
l = g.process(..., unique_id: 'somename')

It would output the files under ${builddir}/meson-gen/somename (and subprojects under the respective subproject dir) instead of the target's private directory. Then when you use the output in multiple targets, they all use the same generated files.
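To make the current behaviour concrete, here is a minimal sketch using only existing API. The script name `mkheader.py` and the file names are made up for illustration; the point is that the output lands in each target's private directory, so the same header is generated once per target.

```meson
project('gen-demo', 'c')

# 'mkheader.py' is a hypothetical script that turns api.idl into api.h.
mkheader = find_program('mkheader.py')

gen = generator(mkheader,
  output: '@BASENAME@.h',
  arguments: ['@INPUT@', '@OUTPUT@'])

api_h = gen.process('api.idl')

# Each executable gets its own copy of api.h in its private directory,
# so the script runs once for 'client' and once again for 'server'.
executable('client', 'client.c', api_h)
executable('server', 'server.c', api_h)
```

With the proposed `unique_id`, both targets would instead share a single generated file under the build directory.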

Why like this? Mainly for backwards compatibility. People have projects that use the target private dir for header lookups and suddenly generating the headers somewhere else would break things. The user must specify a unique id for each such generator (using the same name multiple times would be a hard error). The reason for this is that I could not come up with a way to generate a reliable set of state to hash to create a unique id automatically.

This would make it easier to pass outputs of one generator to another, since the paths are always known and static whereas currently we would have to delay specifying the output dir until the result is used in a target.

The UI is a bit crap and unintuitive, so obviously it would need polishing.

germandiagogomez commented 4 years ago

FWIW I myself started using custom_target in what should be a generator in my opinion.

  1. It looks weird to me that a generator generates the same files again and again when different targets use the output. I ended up using custom_target but what I want is a compiler for idl files (capnproto in my case).

  2. Since this is about improving code generation, I also hit a related issue when generating file output (this was in a custom_target, and I am not sure whether it affects generators too). The issue is here: https://github.com/mesonbuild/meson/issues/6385
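The custom_target workaround mentioned in point 1 could look roughly like the sketch below. `idlc` stands in for the real IDL compiler, and its `--outdir` flag is an assumption, not the capnproto command line.

```meson
project('idl-demo', 'cpp')

# 'idlc' and its '--outdir' flag are hypothetical stand-ins for the
# actual IDL compiler and its options.
idlc = find_program('idlc')

schema_src = custom_target('schema',
  input: 'schema.idl',
  output: ['schema.cpp', 'schema.h'],
  command: [idlc, '@INPUT@', '--outdir', '@OUTDIR@'])

# Both targets consume the same generated files; the rule runs only once.
executable('client', 'client.cpp', schema_src)
executable('server', 'server.cpp', schema_src)
```

This avoids the repeated generation, at the cost of writing one custom_target per input file.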

bonzini commented 4 years ago

In QEMU I think we have one use case for reusing the unique name. We generate files from many directories, and we would like to have them available from source files in the easiest way. With Makefiles we use basically -I$(@D), so that can have:

It would be nice if you could have this as something like

trace_h = generator(...)
include_dir = trace_h.private_dir_include(for_source_dir: true)

and trace_h.process('trace.h', unique_name: 'trace_h') in each directory.

Another possibility is, instead of having the unique_name, to have something like

trace_h = generator(..., name: 'trace_h')

and this would cause all instances of trace_h.process to put the files in trace_h@gen/. But perhaps this is a different use case than what you are thinking about, because then having different unique_names would require different generators too.

bonzini commented 4 years ago

Going back to your original proposal, I wonder if all that is needed is a "custom_target template" (suggesting that it could be created with custom_target_template). This would accept the same arguments as custom_target except input and the initial named argument; in addition, output would also support substitution of @BASENAME@, @PLAINNAME@, and possibly @PRIVATE_DIR@. It would then return a factory object similar to generators, so that later on you could do:

tgt = template_obj.process(name, input: inputs)

with exactly the same effect as invoking tgt = custom_target(...).

germandiagogomez commented 4 years ago

@bonzini just my two cents here and I want to say first that I am not sure I fully understand your use case, but more on a philosophical way of looking at things:

@jpakkane I think we should think of the purpose of each of these functions and why they exist. My best guess of what they should be (not sure if they are intended to be like this):

Namely, I tend to see a generator more like getting a bunch of files and processing them and a custom_target as a single rule. I also think (but maybe I am missing some use cases) that when you do this:

foreach something : some_sequence
    custom_target(....)
endforeach

What you probably wanted is a generator and not a bunch of custom_targets. Right now, to me, this is the part of the API that looks most confusing. I ended up using custom_target, but only because of the limitations generator has, not because I wanted to use it in the first place. Are they actually the same thing? Are they fundamentally different?
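As a side-by-side sketch of the two forms being compared (the script `convert.py` and the input names are hypothetical):

```meson
project('loop-demo', 'c')

# 'convert.py' is a hypothetical tool that turns one .in file into a .c file.
convert = find_program('convert.py')

# Generator form: one recipe, applied to any number of inputs.
conv_gen = generator(convert,
  output: '@BASENAME@.c',
  arguments: ['@INPUT@', '@OUTPUT@'])
gen_srcs = conv_gen.process('a.in', 'b.in')

# custom_target form: the same recipe repeated in a loop, one rule per input.
ct_srcs = []
foreach name : ['a', 'b']
  ct_srcs += custom_target(name + '_c',
    input: name + '.in',
    output: name + '.c',
    command: [convert, '@INPUT@', '@OUTPUT@'])
endforeach
```

The generator result can only be used as sources of a build target, while each custom_target is a standalone target that can be installed or passed on as an input.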

bonzini commented 4 years ago

I had two proposals that are different:

bjfiedler commented 4 years ago

I think I would profit from a custom_target_generator. Additionally, I'd love to see two points addressed in this new kind of generator:

My meson files contain a lot of sections with a specific pattern. I have compressed it here to a MWE. It basically should process some source files with a custom program (in this case cp) generating some intermediate files and then process the intermediate files again with a custom program (in this case sed) to generate the end result (v_targets). In my real use case cp is a clang -emit-llvm and the sed a transpiler (configurable to different variants) that works on the IR code. After that another set of working steps comes: Compile and link the IR code to executables in different architectures and run statistics.

In the MWE I first write the solution with custom_targets (which works, but results in a lot of copy-pasted code) and after that the solution with generators that would be desirable but not possible at the moment.

project('testproj', 'c',
  version : '0.1',
  default_options : ['warning_level=3'])

#### global configurations (done at the main meson file)
## just some dummy data for MWE
flags_x = ['--expression=s/^/x/']
flags_y = ['--expression=s/a/a__y/']
flags_z = ['--expression=s/b/b_z_z_b/']

cp = find_program('cp')
my_generator_cmd = [cp, '@INPUT@', '@OUTPUT@']
my_generatorflags = []

sed = find_program('sed')
other_generator = [sed, '@INPUT@']

## define some generator as generic recipes to invoke cp and sed
## Later, they are specialized with extra flags.

my_generator_gen = generator(cp,
                             arguments:['@INPUT@', '@OUTPUT@']
                                       + my_generatorflags,
                             depfile: '@PLAINNAME@.dep',
                             output: '@PLAINNAME@',
                            )
other_generator_gen = generator(sed,
                                arguments: ['@EXTRA_ARGS@', '@INPUT@'],
                                # no conflict in depfile and output with
                                # the generator above, because a generator
                                # rebuilds on each use
                                depfile: '@PLAINNAME@.dep',
                                output: '@PLAINNAME@',
                                capture: true)

#### config dependent settings. Usually this happens inside a subdir() with
#### conditional evaluation based on get_option
## Ideally, I need two subdir runs: One to collect information (set the right flags) and one to define the generators (based on the configuration).
my_generatorflags += ['--preserve']

#### here the actual processing starts

src_files = ['a', 'b', 'c']
variants = ['x', 'y', 'z']

targets = []
g_targets = []
v_targets = []
g_v_targets = []
foreach src: src_files
  name = src + '.gen'

  # first, use a custom_target
  t = custom_target(name,
                    input: src,
                    output: name, # such a thing like @CUSTOM_TARGET_NAME@ would be nice here
                    depfile: name +'.dep', # here, too
                    command: my_generator_cmd + my_generatorflags) # here my_generatorflags resolves to '--preserve'
  targets += t

  # now, achieve the same thing with a generator. However, my_generatorflags does _not_ resolve to '--preserve' here.
  g_t = my_generator_gen.process(src)
  g_targets += g_t

  foreach variant: variants
    v_name = name + '.' + variant

    # again, use a custom_target
    v = custom_target(v_name,
                      input: t, # here, a custom_target output is used as (another) custom_target input
                      output: v_name,
                      depfile: v_name + '.dep',
                      capture:true,
                      command: other_generator + get_variable('flags_'+variant))
    v_targets += v

    # Now, again try to use generators instead of custom_targets
    ## not possible since a generator does not accept GeneratedListHolder as input
    # g_v_targets += other_generator_gen.process(g_t, extra_args: get_variable('flags_' + variant))

    ## not possible since generator doesn't accept CustomTargetHolder as input
    ## as reported in #3667
    # g_v_targets += other_generator_gen.process(t, extra_args: get_variable('flags_' + variant))

  endforeach
endforeach

all_variants = custom_target('some evaluation',
                             output: 'all_combined',
                             command: ['cat', '@INPUT@'],
                             # capture: true,
                             build_always_stale: true,
                             input: v_targets+g_v_targets)

Problems I'd like to show:

ailin-nemui commented 3 years ago

It would be very nice if this improved generator could also support recording of dependencies that are implicit or passed in some form in the EXTRA_ARGS without having to write a python wrapper script that outputs a depfile. In the code below, meson/ninja is unaware that changing either typemap or any of the explicitly referenced typemap files requires re-running the generator.

https://github.com/irssi/irssi/blob/bf41bfa2f7cb52a44f420441696071eedf233160/src/perl/textui/meson.build#L2-L15
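For comparison, custom_target already has a `depend_files` keyword for this kind of extra dependency, while generator() does not appear to offer an equivalent. A minimal sketch, where the tool `bindgen.py`, its flags, and the file names are hypothetical and not the irssi setup:

```meson
project('typemap-demo', 'c')

# 'bindgen.py' and its '--typemap' / '-o' flags are hypothetical stand-ins.
bindgen = find_program('bindgen.py')
typemap = files('typemap')

binding_c = custom_target('binding',
  input: 'binding.xs',
  output: 'binding.c',
  command: [bindgen, '--typemap', typemap, '@INPUT@', '-o', '@OUTPUT@'],
  # Re-run the command whenever the typemap changes, even though it is
  # only passed as an option and not listed as an input.
  depend_files: typemap)
```

Something similar on generator(), or on process(), would cover the case described above without a depfile-writing wrapper.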

deepbluev7 commented 1 year ago

I have now fallen into the trap of using a generator only to notice that it doesn't allow me to install the results. Example number 1:

assemble_bootrom = generator(rgbasm,
  output  : '@BASENAME@.o',
  arguments : ['-o', '@OUTPUT@', '@INPUT@'])

link_bootrom = generator(rgblink,
  output  : '@BASENAME@.bootrom',
  arguments : ['-o', '@OUTPUT@', '@INPUT@'])

truncate = generator(dd,
  output  : '@BASENAME@.bin',
  arguments : ['count=1', 'of=@OUTPUT@', 'if=@INPUT@', 'bs=@EXTRA_ARGS@'])

gbc_rom_sources = [
  'BootROMs/agb_boot.asm',
  'BootROMs/cgb0_boot.asm',
  'BootROMs/cgb_boot.asm',
  'BootROMs/cgb_boot_fast.asm',
  'BootROMs/mgb_boot.asm',
  ]
gb_rom_sources = [
  'BootROMs/dmg_boot.asm',
  'BootROMs/sgb2_boot.asm',
  'BootROMs/sgb_boot.asm',
  ]

gb_roms = truncate.process(link_bootrom.process(assemble_bootrom.process(gb_rom_sources)), extra_args: '256')
gbc_roms = truncate.process(link_bootrom.process(assemble_bootrom.process(gbc_rom_sources)), extra_args: '2304')

build_target('roms', gb_roms, gbc_roms, build_by_default: true, install....) # does not work

Now that is solvable using custom targets, but I basically need to copy and paste the loop 3 times (one is for the boot logo, which needs assembling later too).
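For reference, a minimal sketch of that custom_target workaround for one of the loops. The command lines are the ones from the generators above; the target names, the use of `fs.stem()` for naming, and the install directory are assumptions made for illustration.

```meson
project('bootroms-demo')

fs = import('fs')
rgbasm = find_program('rgbasm')
rgblink = find_program('rgblink')
dd = find_program('dd')

gb_rom_sources = [
  'BootROMs/dmg_boot.asm',
  'BootROMs/sgb_boot.asm',
]

gb_roms = []
foreach src : gb_rom_sources
  stem = fs.stem(src)  # e.g. 'dmg_boot'

  # Step 1: assemble
  obj = custom_target(stem + '_obj',
    input: src,
    output: stem + '.o',
    command: [rgbasm, '-o', '@OUTPUT@', '@INPUT@'])

  # Step 2: link
  rom = custom_target(stem + '_bootrom',
    input: obj,
    output: stem + '.bootrom',
    command: [rgblink, '-o', '@OUTPUT@', '@INPUT@'])

  # Step 3: truncate to 256 bytes and install the result
  gb_roms += custom_target(stem + '_bin',
    input: rom,
    output: stem + '.bin',
    command: [dd, 'count=1', 'of=@OUTPUT@', 'if=@INPUT@', 'bs=256'],
    install: true,
    install_dir: get_option('datadir') / 'bootroms')  # assumed location
endforeach
```

The gbc variant needs the same loop again with `bs=2304`, which is exactly the duplication the generator chain avoids.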

Another example was just generating manpages in multiple directories as well as a summary manpage.

Specifically, I always default to trying to define how the specific processing steps work separately from the input files. Generators look to be a much better fit for that than doing multiple for loops with slightly different parameters. However, you can't do anything with a generated_list. You can't install it; you can only use it as sources for a normal C compiler. You can't even pass it to a custom target just to copy the files. Basically, I just want to be able to define my weird compilers, which are usually built as part of the project or from a subproject already, and then use them to define targets.

I ran into this 3 times in the last year alone, I really want ANY solution to this. Bonus points if I can define my own extra arguments, like the final binary size in the above example.

robtaylor commented 11 months ago

Yep, hitting a very similar situation to @deepbluev7 when generating pdfs from rst (chaining rst2latex, pdflatex). @jpakkane did you have any more thoughts around this?

robtaylor commented 11 months ago

To make it a bit more complicated, I then have an HTML page generation step that needs those PDFs as input, and it's frustrating that custom_target can't take array[InternalDependency].