conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

Development sources in project #1796

Closed vadixidav closed 1 year ago

vadixidav commented 6 years ago

Visual Studio project files, and likely others, can potentially include sources and PDBs for dependencies. While some packages have PDBs, none have sources. This is understandable, since typically if you are building a package you do not need sources or documentation, but for people using conan packages during development the requirements are different. I am a fan of #1377, which helps in this regard. It is also ideal to have the sources automatically linked to, in this case, the VS Project file after running conan build.

One example is that I had to debug an application which used Qt, and the package I was using came from a repository rather than a local build where I could grab the source. If there were a source component linked to the same package version that I could optionally download as a developer, and either link into the build projects or expose through a generator that writes a text file with the source directory, it would definitely improve my quality of life and efficiency as a developer. I understand that C++ has always had issues with things like this, especially on Windows, and that Conan is novel, but other languages have often already solved this in various ways. Obviously, no alternative exists that can do this for C++ on Windows, but it is definitely a concerning issue from my perspective.

TL;DR

vadixidav commented 6 years ago

The idea to use components mostly comes from this. The way Rust handles components for its analysis tools, cross compilation, and source I think is relevant and applies here to these packages. Theoretically, I would like to specify using some local file, perhaps related to the conan project work, which specifies which components I need, though just a command line option when running install is perfectly fine.

sourcedelica commented 6 years ago

Java has a convention where if you create mypackage-1.2.3.jar, then mypackage-1.2.3-sources.jar (same group) contains the source. Generating the sources jar takes a couple of lines in the pom.xml. In Conan this could be an attribute like generate_sources = True. When you upload your package to the repo it includes the source jar as well.

The beauty of this convention is that in your IDE, if your code calls a function in a third-party package you can navigate directly to the third-party implementation. Control-click, BOOM, you're there. The usefulness of this cannot be overstated :) (The IDE view of the *-sources.jar code is read-only).

Actually, it's not just third-party code that this is useful for, it's any package - especially internal packages.

lasote commented 6 years ago

Thanks, guys. This is interesting. Any information you have about how Visual Studio, or any other tool, locates the source code would be welcome.

sourcedelica commented 6 years ago

This is what I manually do currently with CLion, which uses CMake as its project definition language:

  1. Download the source for the library that I want to view in my IDE.
  2. Add add_subdirectory(library_source library_source/build)
  3. Change the library that the project is using to the target inside the added subdirectory
  4. Set any required variables that the library project needs.

For example:

# If Cassandra sources are added to the project
set(CASSANDRA_LIBS
        cassandra_static
        ${LIBUV_LIBS}
        ${CASSANDRA_EXTRA_LIBS})

# Adding sources for driver to project for debugging
set(LIBUV_ROOT_DIR /msys64/mingw64)

add_subdirectory(/msys64/home/epederson/dev/cpp-driver-2.6.0 /msys64/home/epederson/dev/cpp-driver-2.6.0/build)

But this is not ideal. For the code navigation to work properly my project has to use the cassandra_static library target that is in the CMakeLists.txt in the added subdirectory. So when I build my project it has to build Cassandra too. And I have to set any variables that project needs - in this case Cassandra's CMakeLists.txt needs LIBUV_ROOT_DIR defined. It's messy as heck. Dying for automation.

vadixidav commented 6 years ago

Perhaps if targets are generated in CMake which link libraries and add source directories, and then CONAN_LIBS provided that, then for projects that use CONAN_LIBS it would add the source directories to Visual Studio. This probably wouldn't work for packages wrapped with Conan that use find_package() though, unless those Find.cmake files also did something to add the sources, but that would be a per-package thing rather than a Conan thing. Conan could pass the source directory (if available) in the environment variables so the Find.cmake provided by the Conan package could see if the sources are available, and if so then add them.

vadixidav commented 6 years ago

I have been using Conan for a bit longer now and I suspect the easiest way to get the sources would be to assume that all sources can be acquired from either the export_source directory or by calling source() explicitly. Then include some developer option (like conan install .. --dev-src) when doing a conan install to force the download of dependency sources and include the source as part of a target in each generator so the sources can be hooked up to the generated project file. For instance, I would expect the Qt sources to be connected (though not built) in a CMake build, and then linking $CONAN_LIBS to my target should automatically add the source directories. The inclusion of this flag would result in the sources being automatically downloaded for each dependency in the tree and added to the build helper. This should be possible with CMake. Hopefully it is also possible with QMake.

vadixidav commented 6 years ago

It might be necessary for package maintainers to specify what should be considered "source" as well, though from Visual Studio's perspective it just wants a directory with the source and it will find the source files itself.

lasote commented 6 years ago

Thanks for your feedback! I think we have some good ideas now to elaborate when we start with this issue.

jnnnnn commented 6 years ago

I also need this! We are already using exports_sources to specify what we want exported as source with the conan package. However, the only current way to retrieve sources without rebuilding every package is to abuse conan copy and delete the copy afterwards (leaving the retrieved source in the original package).

I am working on a patch to add a conan install --source option that retrieves the entire dependency tree's source snapshots. Is this patch likely to be accepted?

Once the source is available, it should be straightforward to write a generator that will create a Visual Studio solution that includes all the dependencies' project files from the export_source directories.

memsharded commented 6 years ago

Yes, we are willing to accept it, but first we should discuss the details a bit. What do you mean by source snapshots? We usually mean the sources captured by exports_sources. But can this be applied to packages with a source() method? How will consumers of the source package work with the code? They can only access the package folder.

lasote commented 6 years ago

I think the issue can be separated in two tasks:

  1. Getting the sources of the packages. Here, packaging the sources in the package folder doesn't look like a good approach. Maybe, as @jnnnnn said, we could somehow force fetching the sources even when installing binary packages. It won't always be possible: when there is no source() method and there are no exported sources, there won't be a source folder.
  2. Once we have the source folder in the cache, or whenever we have it (decoupled), we can add to the generators the required variables/elements/whatever to let the IDEs locate the sources.

memsharded commented 6 years ago

Here I have one concern. We are talking about the sources of a package, and we are assuming that they will be the ones retrieved with source() or captured with exports_sources. However, many builds, will generate code at build time, like typical config.h headers, but also it is very possible to have code generation for protocols (protobuf), or many other things.

Then, it might not be general enough to capture from the source folder, but rather it is necessary to actually package the sources.

I don't see any reason why this couldn't be handled with the current package() method:

def package(self):
    if self.settings.build_type == "Debug":
        self.copy("*.cpp", src="src", dst="src")
        self.copy("*.h", ....)
        .... # other files too, .pdb, etc

From there, I think it is valid to extend current generators to account for "source" paths (besides include paths, lib paths, etc), to help debuggers with this info. Am I missing something? Thanks for the feedback!

lasote commented 6 years ago

However, many builds, will generate code at build time, like typical config.h headers, but also it is very possible to have code generation for protocols (protobuf), or many other things.

It depends on the usage of the sources. If you want them for development, with code exploration, completion, etc., you do not need exactly all the sources. It is a trade-off: packaging the sources is more accurate, but if a package has not packaged the sources, the consumer can't get them. The real problem is that you don't know the location of the sources accurately, even after calling the source() method, and I don't know how the IDEs manage to find the implementations.

What I mostly dislike about packaging the sources is having a monster package.tgz file with the sources inside, when they are clearly optional and won't be used by most users. If we could decouple it, maybe the package() approach is the better option.

solvingj commented 6 years ago

We have discussed a similar use case for source management in Bincrafters at length, so here's our very specific point of view on what we'd like to see implemented.

Our primary objective was to work with git super-projects which have inter-connected trees of git-submodules. A good non-massive example is here: https://github.com/Azure/azure-iot-sdk-c

I think the source management feature should be designed precisely for capturing "pre-build" sources, which are universal and immutable in the same way that release tarballs are today. They should be stored as a separate artifact.

I think the sources generated during build-time (or even some pre-build step) are an important point to bring up. I think those should only be captured post-build in the package method (as I imagine some people might be doing today). In many cases, generated sources are generated differently per-platform, so it gets very complicated to reason about. If the project doesn't put the generated sources in its SCM system, Conan shouldn't capture them in a sources package.

Furthermore, we actually want one additional functionality with the source management feature for the git-submodule use case. The details of this include:

- the goal is to package all library projects separately, and define the dependency trees in conan
- there is a challenge packaging intermediate and super-projects in a git-submodule tree with conan
- relative paths for submodule sources are often fixed/embedded in numerous cmake files
- the only turn-key solution is to satisfy the expected paths to sources of all dependencies at build time
- a new field called 'source_requires' could hold refs to packages with a new sources artifact
- a parameter 'relative_dir' on each source_requires statement could define relative extraction path
- this provides somewhat of a "lift and shift" option for git-submodule based library suites
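
To make the 'source_requires' / 'relative_dir' idea above more concrete, here is a minimal sketch of how extraction paths could be planned. Nothing here is Conan API: SourceRequire, plan_source_layout, and the package references in the usage note are hypothetical names used only for illustration, and the sketch assumes each requirement must land inside the consuming project's tree.

```python
import os
from collections import namedtuple

# Hypothetical record (not a Conan type): a package reference plus the
# sub-directory, relative to the consuming project, to extract sources into.
SourceRequire = namedtuple("SourceRequire", ["ref", "relative_dir"])


def plan_source_layout(project_root, source_requires):
    """Map each reference to the absolute directory its sources would go to.

    Raises ValueError if a relative_dir resolves outside project_root or if
    two requirements target the same directory.
    """
    root = os.path.abspath(project_root)
    layout = {}
    seen = set()
    for req in source_requires:
        dest = os.path.abspath(os.path.join(root, req.relative_dir))
        # Reject paths such as "../elsewhere" that leave the project tree.
        if dest != root and not dest.startswith(root + os.sep):
            raise ValueError("{} escapes the project root".format(req.relative_dir))
        if dest in seen:
            raise ValueError("{} is used twice".format(req.relative_dir))
        seen.add(dest)
        layout[req.ref] = dest
    return layout
```

For example, plan_source_layout("/work/sdk", [SourceRequire("c-utility/1.0@user/stable", "c-utility")]) would map that (made-up) reference to the c-utility folder under /work/sdk, which mirrors how a .gitmodules entry maps a submodule to a path.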

memsharded commented 6 years ago

Please find attached a small example I did to try to understand this issue: A Hello library using a "SubHello" package containing only source code.

sourcemgmt.zip

As you can see, I am reusing code in a conan package, all I need to do is to add to my recipe:

    build_requires = "SubHello/0.1@user/testing"

    def imports(self):
        self.keep_imports = True
        self.copy("*", folder=True)

And then use that code easily in my build script.

include_directories(${CMAKE_BINARY_DIR}/SubHello)
add_library(hello SubHello/subhello.cpp hello.cpp)

Feedback welcome.

solvingj commented 6 years ago

Might work, will try!

So, I notice you don't specify the src folder in the import; was there a special reason for that? It just seems like a more precise way to bring ONLY the sources out into the current build.

Second point, as @lasote mentioned before, one downside of this is that each binary variant has all the sources. That could be hacked around (with an option plus configure() and package_id() manipulation) to produce a variant just for sources. Currently I think the cost of that hackery is too high; I'd rather accept the minor wasted space in the binaries in the short term. I just want to make sure I see all the pros and cons clearly.

olivren commented 6 years ago

I am the author of the issue referenced above, which is indeed related to this proposal.

In our project that uses Conan, our build system (SCons) is not tied to an IDE. Developers here use either QtCreator or Eclipse, based on their preference. However, for debugging and profiling purposes everyone uses Visual Studio (because it's really good at this task). To use it, we don't have any VS project, we just attach the debugger to the running program (or start the program from VS simply by opening the .exe).

So, our needs regarding the 3rd party libs is 1/ navigate the code in our preferred IDE 2/ view the symbols and the sources of a specific 3rd party when debugging/profiling.

For our first need, we just generate an IDE-specific config file that lists the paths of all the cpp_info.includedirs of all our conan libs (a .includes file, in the case of QtCreator). As you see, this gives us only the header files, but this is enough to enable the autocompletion and jump-to-declaration IDE features.
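
As a rough illustration of the first need, the sketch below writes a QtCreator-style .includes file with one include directory per line. It assumes the directories have already been collected (in a real setup they would come from the cpp_info.includedirs of each dependency); write_includes_file and the paths in the usage note are hypothetical.

```python
import os


def write_includes_file(path, include_dirs):
    """Write one include directory per line, dropping duplicates.

    The order of first appearance is preserved. Returns the list of
    directories that were written.
    """
    seen = set()
    lines = []
    for directory in include_dirs:
        norm = os.path.normpath(directory)
        if norm not in seen:
            seen.add(norm)
            lines.append(norm)
    with open(path, "w") as f:
        f.write("".join(line + "\n" for line in lines))
    return lines
```

A call such as write_includes_file("myproject.includes", ["/cache/qt/include", "/cache/zlib/include"]) would produce a two-line file that the IDE can use for autocompletion and jump-to-declaration.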

For the second need, we did set up a working solution. We don't rely on any IDE config file, but instead we use a little-known feature of the .pdb file format. It allows embedding in the pdb file a command that rewrites the source paths it contains (from the absolute source path on the machine that compiled the library, to the local source path on the machine that runs the program). I will describe the implementation details in a later comment, but the cornerstone is being able to find the location of the library's sources locally.

This is the object of the issue I filed: in Conan pre-1.0, conan source was installing the sources at a known location, but in 1.0 conan source installs the sources in the working directory. This makes it impossible to rely on conan source to install the sources of a library in a known location, and thus for this pdb feature to work.

@memsharded pointed out that the sources retrieved by conan source, and the sources used for compilation may be different (in particular, auto-generated code). It is a valid concern, and for us the content of conan source is good enough. It also has the advantage of being available for free, there is literally nothing to add in the recipe to make it work.

One of the proposed solutions is to unconditionally bundle the sources in the package created by Conan. For us this is not a viable solution, because the sources can be big. One of our 27 conan packages is Qt Base, which is a fraction of the full Qt distribution, and the sources already weigh 100MB. Also, not everyone will have a need to step into the code of 3rd party libs, and when they do it's always one specific library. Being forced to retrieve all the source code of all the 3rd party libs ahead of time would be overkill.

I do think that the solution proposed by this issue (a dedicated "source package", that could be distributed just like the normal package but still be independent) is ideal. I also think that Conan could use the same approach to package the docs of a library: @sourcedelica mentioned the convention of packaging sources in a *-sources.jar file, but there is also a convention of packaging the docs in *-javadoc.jar.

olivren commented 6 years ago

Here are some helper functions that I include in all my recipes. The usage is very simple: just invoke patch_pdb(self) in the package method of your recipe. In your project's conanfile, you should import all the pdb files, so that they end up located next to their respective dlls. If the sources of a Conan package exist in the standard location (~/.conan/data/MyLib/1.0/me/stable/source), then Visual Studio will be able to step into the library code showing the source files.

This code relies on the Microsoft tools pdbstr.exe and srctool.exe.

I'd like to do the same for GDB at some point. It seems that this could be achieved by creating a .gdbinit file, containing multiple set substitute-path commands.

import fnmatch
import os
import subprocess
import tempfile

def find_pdbs(base_dir):
    '''
    Iterate on all pdb files in `base_dir`, yielding their full path (start with `base_dir`)
    '''
    for root, dirs, files in os.walk(base_dir):
        for pdb in fnmatch.filter(files, '*.pdb'):
            # `root` already starts with `base_dir`, so it must not be joined again
            yield os.path.join(root, pdb)

# This is the `srcsrv` entry that will be written in Pdb files. This entry
# is interpreted by the debuggers in order to find, on the local machine,
# the files referenced in the Pdb file as absolute files on the compilation
# machine.
#
# The documentation of this entry can be found in the file `srcsrv.doc` of
# the package Debugging Tools for Windows. Basically, the debugger expects
# to find the file at the location described by `SRCSRVTRG`. If it is not
# there, the debugger runs `SRCSRVCMD`, which is responsible for writing the
# file to `SRCSRVTRG`. We do that using a python script that copies a file it
# searches for in the Conan source directory of the package. This entry must
# list all the files that the Pdb references that we want to map, using the
# syntax absolutepath*something*otherthing. "absolutepath" must be exactly
# the path referenced in the Pdb (as given by `srctool`), and the rest can
# contain anything. For the line abc*def, "abc" is assigned to %var1% and
# "def" is assigned to %var2%.
#
_srcsrv_template = """SRCSRV: ini ------------------------------------------------
VERSION=1
SRCSRV: variables ------------------------------------------
CONANPKGNAME={pkgname}
CONANPKGVERSION={pkgversion}
CONANPKGUSER={pkguser}
CONANPKGCHANNEL={pkgchannel}
CONANPKG=%CONANPKGNAME%/%CONANPKGVERSION%@%CONANPKGUSER%/%CONANPKGCHANNEL%
SRCSRVTRG=%targ%\%CONANPKGNAME%\%CONANPKGVERSION%\%CONANPKGUSER%\%CONANPKGCHANNEL%\%var2%
SRCSRVCMD=python -c "import subprocess, shutil, sys; root=subprocess.check_output(['conan', 'info', '--paths', '--only', 'source_folder', '%conanpkg%']).splitlines()[1][19:]; shutil.copy(root + r'\%var2%', sys.argv[1])" %srcsrvtrg%
SRCSRV: source files ---------------------------------------
{files}
SRCSRV: end ------------------------------------------------
"""

def _paths_to_srcsrv_source_section(paths, sources_dir):
    '''
    Convert a list of file paths referenced in a Pdb, to the "source files" section
    of the srcsrv section for this Pdb. Each line is composed of the full referenced
    path, a star, and the same path relative to `sources_dir`. Paths that are outside
    of `sources_dir` are ignored.
    '''
    lines = []
    for path in paths:
        if path.startswith(sources_dir):
            lines.append('{}*{}'.format(path, os.path.relpath(path, sources_dir)))
    return '\n'.join(lines)

def patch_pdb(conanfile):
    '''
    Find all the Pdb files in the package dir of the conanfile, and patch them.
    The patch consists of adding a special entry `srcsrv` in the Pdb, using the Microsoft
    tool `pdbstr`. This entry contains the instructions for the debugger on how to map
    file names referenced in the Pdb to a local file.
    '''

    # The source files that will be referenced in the Pdb can come from the
    # Conan "source" or "build" directory, depending on the `no_copy_source`
    # attribute. The path is lowercased, because the Pdb format stores all
    # paths as lowercase apparently.
    if hasattr(conanfile, 'no_copy_source') and conanfile.no_copy_source:
        sources_dir = conanfile.source_folder.lower()
    else:
        sources_dir = conanfile.build_folder.lower()
    module = os.path.dirname(__file__)
    srctool = '{}\\bin\\srctool.exe'.format(module)
    pdbstr = '{}\\bin\\pdbstr.exe'.format(module)
    for pdb in find_pdbs(conanfile.package_folder):
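        # universal_newlines=True returns text, so the paths compare against `sources_dir` on Python 3 too
        refs = subprocess.check_output([srctool, '-r', pdb], universal_newlines=True).splitlines()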
        srcsrv_sources = _paths_to_srcsrv_source_section(refs, sources_dir)
        srcsrv = _srcsrv_template.format(pkgname=conanfile.name,
                                         pkgversion=conanfile.version,
                                         pkguser=conanfile.user,
                                         pkgchannel=conanfile.channel,
                                         files=srcsrv_sources)
        fd, path = tempfile.mkstemp()
        f = os.fdopen(fd, 'w')
        f.write(srcsrv)
        f.close()
        subprocess.call([pdbstr, '-w', '-p:{}'.format(pdb), '-s:srcsrv', '-i:{}'.format(path)])
        os.remove(path)
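
The GDB idea mentioned above could be sketched roughly as follows: generate the text of a .gdbinit file with one set substitute-path command per package. This is only an illustration under stated assumptions: gdbinit_substitute_paths is a hypothetical helper, the mapping from build-machine paths to local paths is assumed to be known already, and paths are assumed to contain no spaces.

```python
def gdbinit_substitute_paths(mapping):
    """Return .gdbinit text with one `set substitute-path` line per entry.

    `mapping` maps a source directory as recorded on the build machine to
    the directory holding the same sources on the local machine.
    Paths are assumed to contain no spaces.
    """
    lines = []
    # Sort for a stable, reproducible file.
    for build_dir, local_dir in sorted(mapping.items()):
        lines.append("set substitute-path {} {}".format(build_dir, local_dir))
    return "".join(line + "\n" for line in lines)
```

Writing the returned text to a .gdbinit file next to the executable would be the GDB counterpart of the srcsrv entry, although unlike the Pdb approach the mapping is not embedded in the debug symbols themselves.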

memsharded commented 6 years ago

Thanks very much @olivren for your detailed feedback. I think the PDBs stuff will be very useful. Not high priority, but hopefully in the future we are able to provide some helpers to manage debugging more easily, and this is gold information.

Regarding the source:

I do think that the solution proposed by this issue (a dedicated "source package", that could be distributed just like the normal package but still be independent) is ideal.

Do you mean the proposed example that explicitly create a "src" package and then use it as a build_requires? https://github.com/conan-io/conan/issues/1796#issuecomment-360303473

This can already be done with Conan as it is. Would you be looking for any other improvement in conan? The thing is this won't be a general solution, because we can't force all packages to create a separate "source" package. So retrieving the sources for packages that did an exports_sources or source() might still be necessary.

olivren commented 6 years ago

No, sorry for the confusion. This quote was not about your comment, but about the original request by @vadixidav, or at least as I understand it. My understanding of his proposal is the introduction of a "source package" alongside the existing "binary package", that could be optionally installed by a developer ("If there was a source component linked to the same package version I could optionally download as a developer..."). This "source package" would certainly be created by a recipe method similar to package, maybe package_sources, that would produce an independent artifact uploadable to a Conan server.

I tried to understand the proposal in your comment, reading the content of the zip, but I fail to understand the approach. I don't know anything about CMake, so maybe that's the problem :)

solvingj commented 6 years ago

Very interesting points @olivren. I want to suggest another idea that I realized was functionally equivalent to a separate "source binary" from our perspective (in a wide range of cases, but not all). Whenever the sources of LibraryA are needed to build LibraryB, Conan could simply execute the source() method of LibraryA on demand, and eliminate all the engineering needed to add the concept of a source_binary. However, for cases when multiple packages depend on LibraryA in this way, it's substantially less efficient than having the sources downloaded only once. There are pros and cons.

I think that in either case, one implementation detail that will have to be addressed is where to put the sources of these dependencies. We are approaching the idea largely as an easier way to transition from git-submodules, and in that case the .gitmodules format simply maps dependencies sources to sub-directories. So, we might like to see Conan behavior mimic this in many respects, but that would require an additional field for each dependency to specify "destination subfolder". Food for thought.

olivren commented 6 years ago

A small note about the PDB feature I described:

This feature is what Microsoft calls "Source Server". It was designed with the idea that a PDB file should be able to retrieve the exact version of a specific source file being debugged, directly from a Version Control System. And indeed, Microsoft provides Perl scripts that patch PDB files, so they can retrieve files automatically from TFS, Perforce, SourceSafe, CVS and SVN. An important design consideration is that this system downloads source files lazily: each time the debugger steps into an unknown source file, it downloads the file at the exact version from the VCS. This is why Git and Hg are not supported, as they do not allow cloning only one file.

When the debugger downloads a file, it does so in a dedicated, temporary directory (the target file path is the variable SRCSRVTRG). So, the solution I presented simulates the download from a VCS by simply copying the file from its known location in the Conan cache dir, to this SRCSRVTRG path. But we can imagine a design that is a better match for this PDB feature: instead of asking Conan to download all the source files of a lib upfront, it could grow a dedicated command that downloads one specific source file for a given package (e.g. conan download_file src/myfile.cpp mylib/1.0@me/stable). This command would be exactly what the PDB expects to call when looking for a file (SRCSRVCMD).
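
As a rough sketch of the per-file step described above, the function below copies a single file from a package's source folder to the path the debugger asked for. It only illustrates the lazy, one-file-at-a-time idea: fetch_source_file is a hypothetical helper, no such Conan command exists in this thread, and the source folder is assumed to be already present locally.

```python
import os
import shutil


def fetch_source_file(source_root, relative_path, target):
    """Copy source_root/relative_path to target, creating target's directory.

    Refuses relative paths that resolve outside source_root, since the
    path would come from debug symbols rather than from the user.
    """
    root = os.path.abspath(source_root)
    src = os.path.abspath(os.path.join(root, relative_path))
    if not src.startswith(root + os.sep):
        raise ValueError("path escapes the source folder: {}".format(relative_path))
    target_dir = os.path.dirname(os.path.abspath(target))
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    shutil.copyfile(src, target)
    return target
```

In the srcsrv entry, a command built around a helper like this would play the role of SRCSRVCMD, with the target argument taking the value of SRCSRVTRG.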

Pros:

Cons:

vadixidav commented 6 years ago

@olivren Yes, my original proposal was to include optional components (like settings, which are global for a build) that correspond to separately downloaded packages of source. I even had a slightly difficult time parsing my comment because I didn't seem to understand Conan well at the time.

@memsharded I definitely think the easiest way to move forward is to effectively run source() and copy the exports like normal and then provide that source folder to the IDE in some way. Any other approach might work better if implemented, but only if implemented, and what we have right now is probably sufficient to do things like debug Qt applications using the Qt source. Generated or configured source might be a problem, but a temporary solution can just involve what packages can already do today to help developers get going.

@olivren It seems in my older comment I was going on about how Find.cmake files are often present in packages so they can be used by code that isn't Conan-aware, so Conan would never get an opportunity to hook the sources into their build. I don't think we should be concerned about that, and if we can pass some environment variable like SRCSRVTRG to MSVC to give it the source, then all of those problems might even go away too.

Later, I do think having a proper optional source "component" that can accompany each package would be nice, but it isn't strictly necessary to start using source.

sourcedelica commented 6 years ago

Agreed - for the use case of being able to provide sources to a debugger it would be great if conan source was capable of copying sources to the local cache source directory. That would be an easy win in the short term, with a longer term goal of a source component/package/etc.

memsharded commented 5 years ago

Conan 1.9 adds self.cpp_info.srcdirs for package_info(). It will be mapped to CONAN_SRCDIRS variables in the cmake generator in 1.9, and other generators could probably benefit from it too.
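
To picture what such a mapping might look like, here is a minimal sketch that renders source directories as CMake set() lines. It is not the actual Conan generator: render_srcdirs_cmake is a hypothetical function, the CONAN_SRCDIRS name is taken from the comment above, and the per-package CONAN_SRCDIRS_<NAME> variant is an assumption made for illustration.

```python
import os


def render_srcdirs_cmake(packages):
    """Render CMake `set()` lines listing source directories.

    `packages` maps a package name to (package_folder, srcdirs), where
    srcdirs are paths relative to the package folder. One variable is
    emitted per package, plus one aggregate variable (names are assumed).
    """
    lines = []
    all_dirs = []
    for name in sorted(packages):
        folder, srcdirs = packages[name]
        # CMake prefers forward slashes, also on Windows.
        dirs = [os.path.join(folder, d).replace("\\", "/") for d in srcdirs]
        all_dirs.extend(dirs)
        quoted = " ".join('"{}"'.format(d) for d in dirs)
        lines.append("set(CONAN_SRCDIRS_{} {})".format(name.upper(), quoted))
    quoted_all = " ".join('"{}"'.format(d) for d in all_dirs)
    lines.append("set(CONAN_SRCDIRS {})".format(quoted_all))
    return "".join(line + "\n" for line in lines)
```

A consumer's CMakeLists.txt could then hand these directories to the IDE or debugger configuration without building the dependency sources.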

memsharded commented 1 year ago

A lot of things have happened since this issue was last active:

See for example: https://docs.conan.io/2/examples/extensions/deployers/sources/custom_deployer_sources.html

In any case, we have learned that the best approach for debugging is to do an actual local build, like conan install --build=pkg_to_debug*, which already allows stepping into that dependency.

So I am closing this as solved, please create new tickets for further questions.