mesonbuild / meson

The Meson Build System
http://mesonbuild.com
Apache License 2.0
5.35k stars 1.53k forks source link

mintro: add install_plan #9130

Closed FFY00 closed 2 years ago

FFY00 commented 2 years ago

This exports more detailed information about the files that are going to be installed on the system. When adding support for the various file types in a Python build backend, we need to know where the files came from so that we can place them in the correct place. Doing this with installed and targets requires us to use fragile heuristics.

codecov[bot] commented 2 years ago

Codecov Report

Merging #9130 (221cfc8) into master (1dce556) will decrease coverage by 3.03%. The diff coverage is n/a.

:exclamation: Current head 221cfc8 differs from pull request most recent head 50de8dc. Consider uploading reports for the commit 50de8dc to get more accurate results Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9130      +/-   ##
==========================================
- Coverage   66.73%   63.69%   -3.04%     
==========================================
  Files         378      193     -185     
  Lines       84563    42393   -42170     
  Branches    17499     8751    -8748     
==========================================
- Hits        56429    27002   -29427     
+ Misses      23345    13040   -10305     
+ Partials     4789     2351    -2438     
Impacted Files Coverage Δ
interpreterbase/baseobjects.py 76.08% <0.00%> (-12.38%) :arrow_down:
interpreterbase/helpers.py 62.50% <0.00%> (-10.00%) :arrow_down:
interpreterbase/decorators.py 90.10% <0.00%> (-3.17%) :arrow_down:
interpreterbase/_unholder.py 71.42% <0.00%> (-2.49%) :arrow_down:
interpreterbase/interpreterbase.py 70.65% <0.00%> (-1.72%) :arrow_down:
compilers/mixins/visualstudio.py 27.09% <0.00%> (-0.56%) :arrow_down:
interpreter/mesonmain.py 86.02% <0.00%> (-0.32%) :arrow_down:
ast/interpreter.py 80.39% <0.00%> (-0.25%) :arrow_down:
modules/pkgconfig.py 82.08% <0.00%> (-0.16%) :arrow_down:
backend/ninjabackend.py 71.28% <0.00%> (-0.05%) :arrow_down:
... and 160 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1dce556...50de8dc. Read the comment docs.

FFY00 commented 2 years ago

it needs to be documented in docs/markdown/IDE-integration.md

Ack.

I guess this essentially completely deprecates intro-installed.json in favor of intro-installed_targets.json, unfortunately it cannot be upgraded in place.

Eh, I don't think so :laughing:. intro-installed.json is still very useful when you don't need this much data. I think it overall makes writing simple integrations easier.

FFY00 commented 2 years ago

Yeah, I will surely document it when I update the PR. I still want to do some testing to make sure we expose the best information for the task described. What would actually be optimal is exporting the install paths with placeholders (eg. {py_purelib}/example.py instead of /usr/local/lib/python3.9/site-packages/example.py), but I suspect that meson is not really prepared for it, and it would a fair bit of internal changes. I need to explore that to make sure.

FFY00 commented 2 years ago

Okay, it seems reasonable.

I have some code locally that is able to generate a file like this:

{
  "targets": {
    "example_exe": [
      "/home/anubis/git/meson-python-example/build/example_exe",
      "{bindir}/example_exe"
    ]
  },
  "headers": {},
  "man": {},
  "subdirs": {},
  "data": {
    "some_data": [
      "/home/anubis/git/meson-python-example/some_data",
      "{datadir}/test/some_data"
    ]
  },
  "python": {
    "example.py": [
      "/home/anubis/git/meson-python-example/example.py",
      "{py_purelib}/example.py"
    ],
    "binary": [
      "/home/anubis/git/meson-python-example/binary",
      "{py_platlib}/binary"
    ]
  }
}

I still need to refactor my implementation a little bit and implement the missing keys (headers, man, subdir), but I am optimistic overall.

That information is much more useful for me, and does not deprecate intro-installed.json, as that file contains absolute paths and this one contains placeholders :grin:

mensinda commented 2 years ago

I don't think that the install locations should be duplicated. Either only with the placeholder or only the full path. Also, the values of the placeholders should be included in the output file.

If you really want to store both values, use a dict instead of a list, since dicts are easier to extend.

FFY00 commented 2 years ago

I don't think that the install locations should be duplicated.

I am grouping files by the data types, it seems you are suggesting I group by the install location. That will make interacting with the data much more painful, as it will force me to iterate over the whole data even when I only want to access one type.

Eg.

for data_type in data:
    if data_type == 'targets':
        ...

instead of data['targets']

This roughly follows the intro-installed.json but with the files grouped per data type.

Either only with the placeholder or only the full path.

It is not super clear, so I am not sure if you got that, but this follows the mapping {data_type: {file_name: (source, destination)}}. My goal here is to know the placeholder-ed path of the destinations, adding placeholders to the sources is irrelevant as they are inherently tied to this metadata (i.e. the build or source dir can never change), in fact, it only adds more work for users by forcing them to do the expansion. If you really think the source paths should use placeholder, I can do that, I just don't really see any functional reason to, only the opposite :stuck_out_tongue:

Also, the values of the placeholders should be included in the output file.

That would make the code much more complex because the expansion happens very early. Do you have any use-case for this? I think they should only be needed when introspecting the installation itself. Anyway, this is something we could add later if needed, or do you think it must be included in this PR?

If you really want to store both values, use a dict instead of a list, since dicts are easier to extend.

I used a tuple, which then gets converted to list when it is serialized to json. We can replace it with a dictionary, if you think thats better.

FFY00 commented 2 years ago

Actually, we should probably just remove the redundant information. It makes it a little bit less nice for some users to use but has less complexity.

{
  "targets": {
    "/home/anubis/git/meson-python-example/build/example_exe": "{bindir}/example_exe"
  },
  "headers": {},
  "man": {},
  "subdirs": {},
  "data": {
    "/home/anubis/git/meson-python-example/some_data": "{datadir}/test/some_data"
  },
  "python": {
    "/home/anubis/git/meson-python-example/example.py": "{py_purelib}/example.py",
    "/home/anubis/git/meson-python-example/binary": "{py_platlib}/binary"
  }
}

This is very similar to intro-installed.json, the differences being that the files are grouped by data type and the destinations have placeholders, which makes it possible to package those files to a python wheel without any fragile heuristic.

@thiblahute do you have any thoughts on this? As mesonpep517 would potentially benefit from this information.

thiblahute commented 2 years ago

This looks useful to me but as @mensinda already said, I think we should make the data extensible to avoid repeating the mistake that was made with the intro-installed.json file which is not extensible at all. Your last example doesn't allow for more information to be added about a specific file I have the impression. For example I don't see how you will add data about install tags.

I think generally speaking this is already very close to what would 100% fit our needs in mesonpep517 and would make my life simpler than trying to get the info back from the target introspection file.

FFY00 commented 2 years ago

Okay, then we can go back to the original design but use a dictionary instead of a list, like @mensinda proposed, which can have a tag key.

{
  "targets": {
    "example_exe": {
      "source": "/home/anubis/git/meson-python-example/build/example_exe",
      "destination": "{bindir}/example_exe",
      "tag": "something"
    }
  },
  "headers": {},
  "man": {},
  "subdirs": {},
  "data": {
    "some_data": {
      "source": "/home/anubis/git/meson-python-example/some_data",
      "destination": "{datadir}/test/some_data",
      "tag": "something-else"
    }
  },
  "python": {
    "example.py": {
      "source": "/home/anubis/git/meson-python-example/example.py",
      "destination": "{py_purelib}/example.py"
    },
    "binary": {
      "source": "/home/anubis/git/meson-python-example/binary",
      "destination": "{py_platlib}/binary",
      "tag": "this-one-has-a-tag-too"
    }
  }
}
FFY00 commented 2 years ago

I refactored the code a bit and adjusted the format.

{
  "targets": {
    "/home/anubis/git/meson-python-example/build/example_exe": {
      "destination": "{bindir}/example_exe",
      "tag": "something"
    }
  },
  "data": {
    "/home/anubis/git/meson-python-example/some_data": {
      "destination": "{datadir}/test/some_data",
      "tag": "something-else"
    }
  },
  "python": {
    "/home/anubis/git/meson-python-example/example.py": {
      "destination": "{py_purelib}/example.py"
    },
    "/home/anubis/git/meson-python-example/binary": {
      "destination": "{py_platlib}/binary",
      "tag": "this-one-has-a-tag-too"
    }
  },
}
FFY00 commented 2 years ago

Okay, I am fairly happy with the end result, the only issue for my use-case is the custom and configure targets, which generally receive the install dir via get_option (eg. install_dir: get_option('includedir')), but we can make do with this for now and improve it later.

lgtm-com[bot] commented 2 years ago

This pull request introduces 1 alert when merging 4aef6a5789f7725bee0b333bf237906460bff65e into e2f4126e415abedf0af90a30332e5e72c98b3d9e - view on LGTM.com

new alerts:

lgtm-com[bot] commented 2 years ago

This pull request introduces 2 alerts when merging 17cdadd0c24d15c3a43e2c22f0c651eda7ba0db7 into e2f4126e415abedf0af90a30332e5e72c98b3d9e - view on LGTM.com

new alerts:

jpakkane commented 2 years ago

we need to know where the files came from so that we can place them in the correct place

The obvious question here is why do you need to put them somewhere else in the first place? Why does a plain install not put them in the correct place?

FFY00 commented 2 years ago

My use-case is generating a Python package (see #7863), not doing an install. I need to put the files in the correct location in the package structure. Doing an install with DESTDIR set and looking at where meson places the files is not possible as different locations, like Python purelib and platlib, might be set to the same path. Having this file also makes the tag information easily available to us.

jpakkane commented 2 years ago

I need to put the files in the correct location in the package structure. Doing an install with DESTDIR set and looking at where meson places the files is not possible as different locations, like Python purelib and platlib, might be set to the same path.

Note that Meson does postprocessing to the libraries it installs. Merely copying the files from the build dir to wherever you want will not work. Running an install is the only way to get working binaries.

If you just want to run things from the build dir, then it might work. But any other use will not.

FFY00 commented 2 years ago

That makes things slightly more difficult, but not much. I can perform an install and copy those files instead, I still need the information in this file though.

Btw, is there any documentation regarding the postprocessing in the install process?

eli-schwartz commented 2 years ago

There is no documentation, because it's supposed to be invisible to end users. :)

It might be nice to have developer documentation for people developing on meson, but that's a bit of a low priority I imagine.

Off the top of my head, meson will edit libraries/binaries to set the correct rpath (and/or remove the one meson sets up pointing to the builddir).

FFY00 commented 2 years ago

That was my guess :smile:

tristan957 commented 2 years ago

In an effort to not have issues with an in-place upgrade of the schema in the future, could you not do something like:

{
  "version": 1,
  "targets": {
  }
}
FFY00 commented 2 years ago

Added a version key.

FFY00 commented 2 years ago

The format itself is looking good for me, but it should be documented a bit better in the docs besides the one line. Just one generic example of the output with a few comments would go a long way.

Done. Please let me know if it looks fine.

Implementation-wise, would it be possible to directly work with the {placeholder} strings in the backend and then expand them in minstall? This could potentially reduce the complexity of the code quite a bit and would automatically ensure that Meson follows the install plan. Or would this be an even bigger refactoring?

I think using the placeholders and doing the expansion in minstall would be great! Unfortunately, I don't have time to do such a big refactoring. I experimented with it a little bit but deemed it too difficult/expensive for the time I have available.

If that's alright, I think the best step right now would be to merge this as-is, which is not that awful anyway, and then do this refactoring when someone has time :sweat_smile:

Moving to the placeholder approach internally would also open improvements such as making get_option return a placeholder and have it be used in intro-install_plan.json. Currently, configure_file(..., install_dir: get_option('datadir')) will show up as {prefix}/share/..., which is not optional, but isn't a big deal anyway.

mensinda commented 2 years ago

I experimented with it a little bit but deemed it too difficult/expensive for the time I have available.

Fair enough. This LGTM and is good to go then besides:

In an effort to not have issues with an in-place upgrade of the schema in the future, could you not do something like

I have to disagree here. We should really never do breaking changes to the existing introspection format. The only exception here was the target introspection with the introduction of the "new"/current file-based API. And even there I manually searched for projects on GitHub and GitLab for the use of the introspection API and checked that breaking it was OK. This becomes less and less feasible as time passes though.

In short, we should not need a version here, since we are realistically never going to break the introspection API again. We don't have a version for any of the other formats and most projects use the Meson version already to check for introspection compatibility. Also, just using an integer is a bad idea. If we need a version it should use semantic versioning to better differentiate between breaking and non-breaking updates.

So please remove the version key again. If we decide we really want a version key, all outputs should have one, which is a discussion for a new PR.

mensinda commented 2 years ago

The CI is green. @eli-schwartz is this also good to go from your perspective?

FFY00 commented 2 years ago

Friendly ping, I would like to get this merged before we have conflicts again :sweat_smile:. Is there anything else needed from my part?

FFY00 commented 2 years ago

Friendly ping. Is anything else needed?

jpakkane commented 2 years ago

When adding support for the various file types in a Python build backend, we need to know where the files came from so that we can place them in the correct place.

I am still very strongly of the opinion that this is the wrong approach. What we need to is to make things install to the proper locations automatically, not to have to make downstream things to fix things after-the-fact.

FFY00 commented 2 years ago

When adding support for the various file types in a Python build backend, we need to know where the files came from so that we can place them in the correct place.

I am still very strongly of the opinion that this is the wrong approach. What we need to is to make things install to the proper locations automatically, not to have to make downstream things to fix things after-the-fact.

Again, I am not trying to fix the installation, I am building distributable Python package artifacts. Direct installs (eg. meson install, where Meson itself places the file into the system) are not compatible with the Python packaging standards -- there's no way to use meson install with pip or other Python package manager.

The only proper way to use Meson for Python projects that need to be distributable via PyPI (eg. scipy, numpy, etc) is to somehow build Python artifacts, wheels to be specific. So, we need this metadata to be able to build the artifacts, we are not trying to fix installs after-the-fact!

This is a deliberate design decision of the Python packaging architecture.

FYI, here's the build backend interface we are working with: https://www.python.org/dev/peps/pep-0517/ Also, maybe it would help if you had a look at an actual wheel file. See https://pypi.org/project/numpy/1.21.2/#files for eg., you can select any .whl file from there to get an idea of what we need to build.

eli-schwartz commented 2 years ago

This is not about meson installing things to the wrong place. This is about meson installing things to the right place, and a post-install hook being used to convert the installed software into an installer file.

Instead of packaging it up into a macOS .dmg or a Windows .msi, it is getting packaged into a PyPI .whl

This requires knowing when a file is correctly installed to $dirpath, whether it was installed there because it is a pure: false file, or whether the current OS makes no distinction between the two.

thiblahute commented 2 years ago

I am having a look at it for mesonpep517 and I have the impression that the tag key should become tags and be a list what do you think @FFY00 ?

eli-schwartz commented 2 years ago

Hmm, why? This is only being filled out with the value of the install_tag: str argument to installable targets. Each installable target is only permitted to have one install tag.

thiblahute commented 2 years ago

I should have checked that, but I wonder why we would limit to one tag for each target, in meson we could handle it to have several values easily, in the json, that would break consumers.

tristan957 commented 2 years ago

That is an @xclaesse question.

eli-schwartz commented 2 years ago

I believe the intention was that if multiple tags installed a single target, then if you install each tag into a different DESTDIR, say, you'd end up installing things twice. Whereas if each target has one install tag, then you specialize the tag and just pass multiple tags to meson install.

xclaesse commented 2 years ago

As @eli-schwartz said. Also, I preferred limiting to a string because we can always extend later if there is a use case. Always easier to extend later than restrict later. @thiblahute do you have a use case for a target that would need more than one tag?

thiblahute commented 2 years ago

I do not yet, I am just saying that in the json we should make it extensible from the beginning as we won't be able to make it happen later without breaking the format.

FFY00 commented 2 years ago

I have a use-case for multiple tags, which is so that we can mark certain targets that might need some special handling. We can't update the entry to be a list, I will hopefully send a PR this week, or early next week.

xclaesse commented 2 years ago

Note that multiple tags open the question: does it install targets with all tags or one of the tags.

I have a use-case for multiple tags, which is so that we can mark certain targets that might need some special handling.

Not sure what you mean here.

FFY00 commented 2 years ago

Note that multiple tags open the question: does it install targets with all tags or one of the tags.

I would expect one, but in the future we could add an option to match all tags if it makes sense.

Not sure what you mean here.

I mean, basically to flag something to intro-install_plan.json consumers.

eli-schwartz commented 2 years ago

Why can't you just specialize the tag and handle multiple tags?

Or to put it another way: why is intro-install_plan.json more special than meson install that it has a bigger need for multiple tags per target?

FFY00 commented 2 years ago

Because I might want to combine flags, and because that would kinda break meson install --tags, as those targets would now have a specialized and functionally meaningless tag as far as the install step is concerned.

If tags are to support the use-case of using them to flag something to external tools that introspect the build, we should probably support multiple tags. If not, and the only use-case is meson install --tags, then everything is probably fine as-is. IMO, allowing multiple tags is cheap, as it does not actually increase the maintenance burden and design complexity much, so I would do it. Your mileage may vary :smile:, so let me know what you want to do. If we do want to change something, we should move fast, as the implementation should not be released in its current form.

eli-schwartz commented 2 years ago

I am still not really sure what additional tag your use case is looking to use. So without understanding what you are trying to accomplish I can't really say whether it makes sense to implement it in this way.