mesonbuild / meson

The Meson Build System
http://mesonbuild.com
Apache License 2.0
5.33k stars 1.53k forks source link

mintro: Save introspection to disk and --targets modifications #4547

Closed mensinda closed 5 years ago

mensinda commented 5 years ago

This adds two new keys for each target returned by meson introspect --targets. The goal of this pull request is to provide a standard way for autocompletion support for IDEs. Being able to build the project with this information is not the goal.

With this approach, the autocompletion support would also work when a compile_commands.json is not provided. As far as I know, only the ninja generator currently supports this.

mensinda commented 5 years ago

@textshell You also might be interested in this.

mensinda commented 5 years ago

The Travis error seems unrelated:

The command "if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then docker pull jpakkane/mesonci:bionic; fi" failed and exited with 1 during .

textshell commented 5 years ago

Thanks for tagging me in.

In the qtcreator plugin we are currently parsing compile_commands.json . But we need to do horrible things to get the targets. So we need to fix that some day^^

But i'm not convinced that a subprocess based api is really what an IDE wants here. I would prefer meson to always export an extended version of compile_commands.json or something similar.

Why?

Of course meson could generate the introspection information in a well known path on configuration in this format too, so that objection is of course only partially relevant.

I'll do a review of the PR anyway, in case we decide that we want to have something in this format (which of course is simpler and more compact than build commands)

mensinda commented 5 years ago

I agree that having such a file would be preferable, but I don't know if generating it in the backend would be preferable. As far as I know (correct me if I am wrong), all compilation flags are generated by each generator. Generating an extended compile_commands.json or a meson specific format (directly in meson) would then require a separate generator.

I think that the best approach for IDE integration is to extend the introspection API with all required functions and dump the entire introspection information somewhere in the build directory after each configuration.

textshell commented 5 years ago

Meson can only have one backend per build directory afaiu. But i think it would not be too hard to add code to the ninja backend to export information similar to the compile_commands.json itself.

And from my perspective the other backends are really not relevant. The other working backends are for visual studio (the full IDE not vs code) which i don't think anyone expects to mix with an completely different IDE. And then there is the xcode backend which is similar, except that it doesn't even fully work.

And if we ever add a backend that is similar to the ninja one we will have to refactor a lot of code into some common class anyway.

@chergert do you have any insight from the gnome-builder perspective?

chergert commented 5 years ago

Builder always has the concept of an active configuration, so focusing on access to flags without a build directory is not a priority for us and would likely go unused.

textshell commented 5 years ago

@chergert Sorry, this was missing a bit of context it seems.

This PR is not about unconfigured sources, but about how an IDE will get the exact options used to build sources which it needs for a code model that supports features like precise auto completion, find references/find definition and similar features.

chergert commented 5 years ago

Oh I see. I guess since we always use Ninja we already have that. We also do a large amount of post-processing on the results to adjust for containers, translated sysroots, system includes, etc.

jpakkane commented 5 years ago

Rather than exporting things one by one and have users parse those together, could we instead do something different. In Meson the set of compiler arguments are guaranteed to be identical for all files of one language in a given target. So maybe we could have something like:

meson introspect --compilecommand --target=foo@id --language=c

which would return something like:

{
    "command": ["gcc", "-DFOO", "-O2", "-g", "-o", "@OUTPUT@", "@INPUT@"]
}
textshell commented 5 years ago

I would rather not commit in the public introspection API to always have identical parameters, because we might some day have a case where we have to use different parameters. Even if that means i have to the grouping manually again. Meson might in the future offer compilations that use different parameters without beeing in an independent target.

With per source (and per generated source) information the IDE additionally could get the correct language from meson, that would avoid each IDE having to emulate mesons compiler selection (which is only going to be more complicated it seem)

Another complication is that both_library will in the not to distant future compile each file in a target with two different sets of parameters (at least windows needs different -D parameters for common use cases of shared and static libraries). In qtcreator that would likely easily map into distinct projects parts with qtcreators native UI for selecting different compilation situations for a given source file.

mensinda commented 5 years ago

I actually prefer it separated since this would make things easier for me (KDevelop meson plugin). With such a format I (and any other IDE) would have to parse the compile flags to extract include directories (-I<PATH>, /I<PATH> on Windows (I think)). Also, I don't really care for the compiler used or the "-o", "@OUTPUT@", "@INPUT@".

The goal for me is not to be able to build the file. That's what's ninja is for (or the compile_commands.json). What I need for the KDevelop IDE integration are only the include directories, the compile definitions -D<var>=<something, the c / cpp std version and maybe some other compiler flags.

I can, of course, only speak for myself here. I don't know if the full command line is needed for other IDE's.

Also, I don't see how your proposal is different to always printing the compiler args and include directories directly in the target (this is how it is handled in this pull request). Having to do another call to meson would complicate things.

textshell commented 5 years ago

I think we want to have the complete command-line or at least all options. I think ultimately IDEs using clang as code model have to translate back to compiler commandline anyway. And more compiler flags matter than just -D and -I, because a lot of flags have effects that are detectable as (conditionally defined) preprocessor macros or via c++ compile time mechanisms.

For example Qt headers something just #error out when they believe that the compilation runs in a unsupported mode (e.g something with PIC settings).

Of course that does not mean, that meson might not also provide a simpler view, but in that case meson has to interpret from the raw command line arguments from my PoV because we really can't tell the users that -I and -D statements in the cross file (or the environment variables or the shiny new cross file) are not respected by the IDE interface.

Currently the qtcreator-plugin only extracts -D, -I and -std= from the command line but i suspect we need to expand that in the future.

Another quirk i just remembered while looking at the code is that the IDE of course also needs the default include paths of the compiler. I think it would be nice if meson could offer that information in some form too (if we have that somewhere, but i believe we need that for dependency merging anyway). What about -sysroot? It seems that would be very relevant too when not directly extracting from the commandline, or even if when the compiler has a default sysroot.

mensinda commented 5 years ago

@textshell Having the compiler flags for all files would massively bloat the output. We can always add this feature once it is actually required. This would, of course, require changing the IDE.

But we could change the output format for --target-files to a dict instead of a list:

{
  "c": ["list", "of", "all", "c", "files"],
  "cpp": ["list", "of", "all", "cpp", "files"],
  "vala": ["..."]
}

This way the language detection would be trivial.

mensinda commented 5 years ago

I have nothing against extracting all command line parameters that change the behavior of the compiler -fPIC, -g, -std=c++17, etc. However, I am against including parameters like -o INPUT OUTPUT and the compiler itself, because these parameters would have to be removed.

textshell commented 5 years ago

Having the information in a format that includes per target the options and a list of files where these options apply would be almost exactly the data model i need in qtcreator. So that would of course also be a very nice interface.

Assuming this is per target:

[
  {
    "language": "c",
    "compiler": "/path/to/gcc",
    "parameters": ["-Dsomething", "-I/opt/foo"],
    "sources": ["list", "of", "all", "c", "files"],
    "other_sources": ["list", "of", "generated", "and", "dependency", "sources"]
  },
  ...
]

With potentially multiple sections even for one language in case we ever need that.

What we would lose there is having an mapping which output relates to which compiler settings. But i don't think we actually need that for anything i can think about (i.e. neither IDEs nor the usual other tools like linters and other source based checkers)

mensinda commented 5 years ago

This idea is pretty good. But I would put this interface directly into the --targets command. This way we keep backward compatibility and we get all the information we need with a single call.

The --targets command would then output something like this:

[
  {
    "name": "modbusSMA",
    "id": "76b5a35@@modbusSMA@sha",
    "filename": "lib/libmodbusSMA.so",
    "type": "shared library",
    "installed": true,
    "install_filename": "/usr/local/lib/libmodbusSMA.so",
    "build_by_default": true,
    "compile_commands": [
      {
        "language": "c",
        "compiler": "gcc",
        "parameters": ["-Dsomething", "-I/opt/foo", "-std=c++17", "-O2", "-g"],
        "sources": ["list", "of", "all", "c", "files"],
        "other_sources": ["list", "of", "generated", "and", "dependency", "sources"],
        "include_directory_prefix": "-I",
        "defines_prefix": "-D"
      },
      {...}
    ]
  },
]

The include directories should also be absolute and not relative like in the compile_commands.json. I also added the keys include_directory_prefix and defines_prefix. This would simplify parsing the parameters while being compiler independent.

Compiler parameters

The main problem with the compiler parameters is that they can not be easily retrieved by the introspection API since they are only generated in the backend and not stored anywhere. This means that the introspection API has to recreate what the backend does. This is why the extra_args field is currently limited to only -D parameters.

A cleaner approach would be to have the backend store all flags for each compiler in the target object and have the introspector parse that. However, I don't know if this is feasible and/or wanted since the introspector would then depend on the backend.

jpakkane commented 5 years ago

Moving the command line generation away from the backend to some more general place would make sense. It would allow us to easily generate things such as compilation tests (i.e. "try to compile source X with settings Y as a test", usually done as a failing test).

textshell commented 5 years ago

Sure having the compilation information together with the general target information seems useful. Not sure if we should overload the existing api though because this combines the old target-files into the targets command.

As i prefer to have a file based API anyway, i could try to just have the ninja backend generate this information as a side effect in say meson-introspection/targets.json. We could later refactor that code if we think it's worth the effort to move code outside of the generator.

mensinda commented 5 years ago

With the approach from #4548, a complete introspection dump could be easily generated after each configuration. An introspection_dump.json next to the compile_commands.json could then be generated here. This would have the advantage of never having to call the introspection API in the first place.

Overloading the --targets API does have the advantage of not breaking backward compatibility (not sure if we need it though). Alternatively, we could create a new API and leave --targets and --target-files untouched and deprecate them maybe.

textshell commented 5 years ago

I've thrown together a quick prototype of generating this in the ninja backend: https://github.com/qtcreator-meson-plugin/meson/commits/feature/backend-introspection

It's certainly not yet complete. But i think this could actually work.

mensinda commented 5 years ago

This proof of concept looks very promising. However, I don't think that generating introspection files directly in the generator is the right way to go. You are duplicating code in the generator that is already in mintro.py. I think a better approach would be to store the compile commands in the target (or somewhere else) so that the introspection API has access to them. If #4548 is merged, all introspection information would then be dumped into a JSON file.

This approach has the advantage that no code is duplicated and the "old" API would also benefit from the changes to the backend.

mensinda commented 5 years ago

Updated the introspection format for --targets to:

[
  {
    "name": "target1",
    "id": "MESON ID",
    "sources": [
      {
        "language": "cpp",
        "compiler": ["c++"],
        "parameters": [
          "-I<INC DIR 1>",
          "-I<INC DIR 2>",
          "-DFOO=bar"
        ],
        "source_files": ["list", "of", "cpp", "files"]
      },
      {
        "language": "c",
        ...
      }
    ],
    "filename": "lib/libsomething.so",
    "type": "shared library",
    "installed": true,
    "install_filename": "/usr/local/lib/libsomething.so",
    "build_by_default": true
  },
  {
    "name": "other target",
    ...
  }
]
textshell commented 5 years ago

Well i think the code duplication for the introspection would be fairly easy to handle by moving it to a common place if we want to. The only thing i had to change was the way the install location is found. And i'm not sure if the way used there would not also work in mintro.py.

On the other hand, the target function in mintro could also be changed to read the introspection data from ninjabackend. (either just pipeing it through or maybe stripping out the new information if needed)

At some point we do have to think about performance too. It might not be a be good idea to put essentially the whole of the instrospection json into coredata.dat. So the place to store those details might be another file anyway. I wonder if it's worth it to safe this in intermediary format instead of just emitting the finished json.

We need to benchmark with one of the huge project how much the build option storing costs in terms of performance to see if this needs to be optional. Of course it would be nicer if we can just always place the data in the build directory.

Do you think it would be important to have the different pieces of information in one file? I would expect reading only the needed information from a smaller files would be easier and allow better performance with a file based api.

mensinda commented 5 years ago

I don't think that the introspection data will be that big. With the current setup of the PR only the compiler parameters for each target have to be stored. My point is that we already have a central file for handling introspection (mintro.py) and we only need the full list of compiler parameters from the backend. This is assuming I didn't miss something.

Generating a complete introspection dump on (re)configure can then also be handled by mintro.py (see #4548 for a possible implementation).

I am not sure about splitting up the introspection file. Yes, there might be some performance gains by splitting it up, but how big will this file realistically be? Maybe a few MB at most. I checked the file size for systemd (2.9 MB with pretty spaces and 2.5 MB minified) and mesa (80 KB). Reading and parsing this file shouldn't take long. That being said it certainly wouldn't hurt splitting the targets (most likely to be the biggest section) from the rest of the introspection.

mensinda commented 5 years ago

I have merged this branch with #4548 since these changes are now required for this PR. I also added the get_introspection_data(self, target_id, target) function to the base Backend class. It takes the target ID and the target itself and should return list as specified in the comment.

This PR will now fail because of the bug described in #4555. The following tests are affected:

@textshell The format of get_introspection_data is slightly different from what we discussed in the IRC. This way we can use the logic I have already written as a fallback.

mensinda commented 5 years ago

I don't think that splitting the introspection file into multiple parts is required or would bring a significant speed benefit for an IDE. The largest introspection file with the current changes is 4MB (systemd) and python can parse it (with json.load) in under 50 msec. Further processing in an IDE should not increase this processing time over 100 msec, which is acceptable in my opinion.

I have also done some analysis about the relative size of the sections in the JSON dump here.

mensinda commented 5 years ago

This PR has changed quite a bit from its initial version. Here is a summary of the current changes:

sources entry format:

{
    "language": "language ID",
    "compiler": ["The", "compiler", "command"],
    "parameters": ["list", "of", "compiler", "parameters"],
    "source_files": ["list", "of", "all", "source", "files", "for", "this", "language"]
}

Overhead

The additional reconfigure time overhead for this PR is: For efl, the average reconfigure time went up from 12.605s to 12.684s (0.63% increase). For a small test project, the time went up from 0.683s to 0.689s (0.86% increase).

mensinda commented 5 years ago

Changes since the last comment:

Also, I consider this PR to be feature complete now and no longer WIP.

textshell commented 5 years ago

A few additional things.

I wonder if we want to use "intro-" as prefix for the introspection files. Meson usually tries to avoid abbreviations.

We now have a lot of absolute paths all over the place. That makes it harder to move source directories around. Do we care?

This changes the reported environment for tests from basing it on the environment from the call of the introspection call to the environment from regeneration the build files. I think having the introspection data depend on the environment in this way is something to look at separately. I suppose we just accept the change here for now.

Custom targets currently have not so great information: (this example is a manual custom target, not using the qt module)

    {
        "sources": [
            {
                "sources": [
                    "/home/martin/sw-dev/termfoo/git/toolkit/subprojects/qtbase-5.10.0/src/corelib/animation/qabstractanimation.h"
                ],
                "parameters": [],
                "language": "unknown",
                "generated_sources": [],
                "compiler": []
            }
        ],
        "id": "moc_qabstractanimation.cpp@cus",
        "installed": false,
        "name": "moc_qabstractanimation.cpp",
        "build_by_default": false,
        "filename": "subprojects/qtbase-5.10.0/moc_qabstractanimation.cpp",
        "type": "custom"
    },
mensinda commented 5 years ago

I wonder if we want to use "intro-" as prefix for the introspection files. Meson usually tries to avoid abbreviations.

Sure I can rename the files to introspection-, but isn't that a bit unnecessarily long? If abbreviations should be avoided I would propose to use info-, dump- or something else (as long as it is short) as a prefix.

We now have a lot of absolute paths all over the place. That makes it harder to move source directories around. Do we care?

I don't think that moving code around will happen often and a simple meson --reconfigure can solve this problem. I would like to use absolute paths everywhere (in the future breaking changes PR) so that there is no ambiguity to where the all the files are relative to. This also reduces the complexity of an IDE integration.

Custom targets currently have not so great information: (this example is a manual custom target, not using the qt module)

Yes, I will try to make these entries more useful.

textshell commented 5 years ago

I wonder if we want to use "intro-" as prefix for the introspection files. Meson usually tries to avoid abbreviations.

We have decided on irc, that this is fine.

textshell commented 5 years ago

@jpakkane i think this is ready. Could you have a look if this is acceptable as new feature? I would very much like to use this in the qtcreator plugin and @mensinda likely as well in the kdevelop support for meson.

The new compute_parameters_with_absolute_paths in the compiler classes could use some review from someone who known these compilers, but they are not super critical, as they are only used in the new output, so can not make regressions and not having absolute paths in the introspection data can be fixed when we see something missing.

jpakkane commented 5 years ago

Overall this is good stuff. There are just a few minor niggles to clean up, such as rewrapping the documentation to 80 columns.

The documentation should probably explicitly mention that the various flags returned are not the same as what would be used in the actual compilation commands. For example the -I flags get converted to absolute paths.

mensinda commented 5 years ago

I have updated the documentation and have trimmed the lines to 80 columns as much as possible. I didn't touch the JSON objects and the tables.

jpakkane commented 5 years ago

Do we really need to be able to specify the width of JSON prettyprinting? Seems like an on/off toggle should be enough (with indentation either 0 or 4). Going even deeper, could we not just always create the data prettyprinted? If there is no performance impact then I can't think of a reason to produce compressed json.

mensinda commented 5 years ago

I think that having at least an ON/OFF toggle is preferable since the introspection API is mostly used for IDE integration. Here the extra newlines and spaces only take up extra bandwidth/disk space.

Concrete example: the meson-tragets.json for gst-build is compressed 2.6MB big and pretty printed 4.0MB. From my test, loading the bigger file is around 25% slower than the compressed file (and I am using an SSD).

Would it be OK if I changed the current indentation flag to only toggle pretty printing and add an addional flag to adjust the indentation width?

jpakkane commented 5 years ago

There should be at most one option for this. Keeping the number of arguments low is important for usability. Would an on/off toggle be enough?

mensinda commented 5 years ago

Sure, an on/off toggle is enough for this (updated with the latest commit).

I just personally like having more options :)

jpakkane commented 5 years ago

LGTM but as this is a big change I'd appreciate a second (well, third) set of eyes. Any volunteers? @nirbheek?

Though we should remember that any change we do here can be changed almost arbitrarily until it goes out in a release.

nioncode commented 5 years ago

So will this only work for the ninja backend at the moment? Do you plan to add this to the other backends as well?

mensinda commented 5 years ago

Yes, this currently only works with the Ninja backend. However, there is a fallback for in the base Backend class, so all the sources should be listed.

I don't have any plans to add this to the other backends because there is no meson support in the target IDE's anyway (XCode, Visual Studio).

textshell commented 5 years ago

Yes, this currently only works with the Ninja backend. However, there is a fallback for in the base Backend class, so all the sources should be listed.

Most of the work in this PR works with all backends. What does not work is the precise information of the compiler options. Currently the sources part in the target introspection is fairly limited outside when not using the ninja backend, when needed this could be expanded a bit. But the usecase is mostly to allow for code models / find references in IDEs other than Visual-Studio and XCode to work. So this is not as important with the both other backends.

Of course someone with good knowledge of Visualstudio could likely get that backend to closely match actual compilation commandlines to the introspection data. The architecture is open for all backends to implement that.

jpakkane commented 5 years ago

LGTM. Are there still open issues with this or can we get this merged?

textshell commented 5 years ago

no open issues from my side.

mensinda commented 5 years ago

I think everything is resolved.

jpakkane commented 5 years ago

Merge it is then.