google / draco

Draco is a library for compressing and decompressing 3D geometric meshes and point clouds. It is intended to improve the storage and transmission of 3D graphics.
https://google.github.io/draco/
Apache License 2.0

Build process leaves some files missing #603

Closed jasonliu-- closed 3 years ago

jasonliu-- commented 4 years ago

Hello, Draco devs. I hope that everyone is staying safe and healthy during this quarantine time.

With all this free time, I am currently working on packaging Draco for the MacPorts package manager. I have gotten Draco to successfully compile from source, but I noticed some peculiarities when creating the MacPorts package.

Some files are missing from the final product?

It appears that some files which are a part of the Draco source tree, and which you reference in the main README file, are not getting copied to the final install location. I believe that these files would be useful if they were included as part of a distributed package:

copy --recursive ${draco_src}/testdata -> ${CMAKE_INSTALL_PREFIX}/share/draco/
copy --recursive ${draco_src}/javascript -> ${CMAKE_INSTALL_PREFIX}/share/draco/
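
For illustration only (I'm not claiming this is how Draco's CMakeLists.txt is currently organized, and the variable/destination names are just placeholders), I imagine the install rules could look roughly like this:

# Sketch only: install(DIRECTORY ...) copies a folder tree at install time.
# Destinations are relative to CMAKE_INSTALL_PREFIX.
install(DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/testdata"
        DESTINATION share/draco)
install(DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/javascript"
        DESTINATION share/draco)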

This would allow any users who install the Draco package through MacPorts to immediately run the command line examples provided in the "Command Line Applications" section of your README, e.g.:

/opt/local/bin/draco_encoder -i /opt/local/share/draco/testdata/bun_zipper.ply -o out.drc

I can certainly manually copy these folders using the MacPorts build system, but I believe that it would be more appropriate if these actions were done by cmake install directly. This is because if in the future someone else comes along and decides to package Draco for a different package management system (e.g. Debian, RPM, etc.), then those resulting packages would also benefit from these folders being copied over as part of the CMake build process.

Plugin library strangeness

Another thing I observed in the course of packaging Draco was some strange behavior, which I don't understand, regarding the Unity and Maya plugins. Perhaps someone could enlighten me as to why things are being done this way?

When using -DBUILD_[UNITY|MAYA]_PLUGIN=ON, the .bundle gets compiled from source and put into ${CMAKE_INSTALL_PREFIX}/lib, as I would expect. However, it looks like precompiled plugins already exist inside the ${draco_src}/unity and ${draco_src}/maya folders? I understand having the option to compile from source, but then why also distribute precompiled bundles? And why is it that, if I turn on -DBUILD_[UNITY|MAYA]_PLUGIN=ON, it only compiles the plugin for my one platform? Why not cross-compile the plugin for the other platforms at the same time?

If you want users to have the option of building the plugins from source, then I would propose a couple of possible choices which I think would make more sense.

Perhaps it would make sense to add some new CMake options, such as INSTALL_[UNITY|MAYA]_PLUGIN, which would simply copy the ${draco_src}/unity or ${draco_src}/maya folder over to a shared location, such as ${CMAKE_INSTALL_PREFIX}/share/draco/plugins/, independent of the BUILD_[UNITY|MAYA]_PLUGIN options? This might make sense because, as of right now, it seems that when I use the BUILD_[UNITY|MAYA]_PLUGIN option, certain files like DracoMeshLoader.cs are nowhere to be found in the final installed product. But if I just want to install the precompiled plugin without compiling from source, I think I would want DracoMeshLoader.cs to be available (at least, that is what the README inside the unity folder seems to instruct). In other words, when using the new INSTALL_[UNITY|MAYA]_PLUGIN options, the cmake install phase should:

copy --recursive ${draco_src}/unity -> ${CMAKE_INSTALL_PREFIX}/share/draco/plugins/
copy --recursive ${draco_src}/maya  -> ${CMAKE_INSTALL_PREFIX}/share/draco/plugins/
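
A rough sketch of how that could be wired up (the option names and the share/draco/plugins destination are only my suggestion, not existing Draco CMake variables):

option(INSTALL_UNITY_PLUGIN "Install the prebuilt Unity plugin files." OFF)
option(INSTALL_MAYA_PLUGIN "Install the prebuilt Maya plugin files." OFF)

if(INSTALL_UNITY_PLUGIN)
  # Copies the whole unity folder (prebuilt bundle, DracoMeshLoader.cs, README) at install time.
  install(DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/unity"
          DESTINATION share/draco/plugins)
endif()

if(INSTALL_MAYA_PLUGIN)
  install(DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/maya"
          DESTINATION share/draco/plugins)
endif()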
tomfinegan commented 4 years ago

It appears that some files which are a part of the Draco source tree, and which you reference in the main README file, are not getting copied to the final install location.

In general our README is intended for developers. It includes basic usage information about Draco's tools, but the reliance on testdata in our example command lines is purely a coincidence based on convenience. It is not an implicit endorsement for files in the testdata subdirectory of the repository to be installed on user systems, and the usage of these files as examples in the README does not preclude the files being renamed, replaced, or removed. Relying on testdata as a packager is not something I would advise-- packaging scripts will break when testdata is changed or removed.

Anyway, to be clear, installation of Draco testdata is explicitly not something that we intend as a result of running the build system's install target on any platform. I also want to add that I'm not familiar with any packages that install testdata from the projects on which they are based. I'm not claiming exhaustive knowledge of packages, but it's certainly not something I've ever encountered before. Is this something that's starting to become commonplace? I have encountered situations where an install rule performs source installation, and in those situations a testdata directory would certainly be included, but Draco's install rule does not install sources-- only libraries, headers, executables, and packaging information files.

copy --recursive ${draco_src}/javascript -> ${CMAKE_INSTALL_PREFIX}/share/draco/

What would be the rationale behind installing the javascript subdirectory? In other words, the C++ based encoder/decoder/libraries are faster than the javascript equivalents, so why would a MacPorts user want the javascript? I'm not sure what the benefit of installing the javascript would be-- web applications should be using the gstatic URLs since they're faster.

To explain things a little: we only include the prebuilt javascript within the repository because it can be difficult to produce a working emscripten toolchain in some environments. It's targeted only at users that, for whatever reason, are unable to build the javascript tools.

precompiled plugins already exist inside the ${draco_src}/unity and ${draco_src}/maya folders? I understand having the option to compile from source, but then why also distribute precompiled bundles?

For the same reason we distribute the javascript tools: not everyone is a software developer with experience building C/C++ code. We believe a significant subset of users interested in the plugins would fall into that category, so we provide prebuilt binaries.

Why not cross-compile the plugin for the other platforms at the same time?

This is non-trivial and would require an external script driving our CMake build to accomplish in an automated fashion.

In general I am not opposed to the installation of the plugins as suggested, but I don't believe it should be a default behavior. Including a plugin in the installation automatically when BUILD_*_PLUGIN is enabled would probably be reasonable.

jasonliu-- commented 4 years ago

I also want to add that I'm not familiar with any packages that install testdata from the projects on which they are based. I'm not claiming exhaustive knowledge of packages, but it's certainly not something I've ever encountered before. Is this something that's starting to become commonplace?

Yes, it is becoming commonplace. I can give you one example immediately, since I'm working on packaging it for MacPorts right now as well: OpenSubdiv. There are CMake options for NO_EXAMPLES, NO_TUTORIALS, NO_REGRESSION, NO_TESTS, etc., which by default are all set to false, meaning that all examples, tutorials, regression tests, etc. get compiled when building OpenSubdiv from source. This allows anyone building the OpenSubdiv library from source, whether it be a user, developer, or packager, to execute the examples and tutorials to verify that binaries which use the OpenSubdiv library can actually run. This was useful for me, since some of the examples actually segfaulted, and I believe that going through and trying to resolve some of these segfaulting examples allowed me to put together a more robust package for distribution. Is the stuff in the testdata folder not meant to be there for users, developers, or whoever, to test that the Draco library is functioning as intended?
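
For reference, the general shape of that opt-out pattern is something like the following (the NO_* names are OpenSubdiv's option names as I understand them, but the guard structure below is only my illustration, not OpenSubdiv's actual CMake code):

option(NO_EXAMPLES "Disable building the examples." OFF)
option(NO_TESTS "Disable building the tests." OFF)

if(NOT NO_EXAMPLES)
  add_subdirectory(examples)    # example binaries get built and can be run after the build
endif()

if(NOT NO_TESTS)
  add_subdirectory(regression)  # regression/unit tests
endif()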

packaging scripts will break when testdata is changed or removed

Why would that be the case? I was under the impression that testdata files are simply used as either input files or output files against which one can compare the output from running the commands given in your documentation. If the testdata files are changed or removed from version to version, then shouldn't your documentation be updated to match? (i.e. if you delete a certain file from the testdata files, make sure that none of your documentation's examples reference those deleted files) But why would that break any packaging scripts? I'm not actually running any of the example commands in my packaging script....

What would be the rationale behind installing the javascript subdirectory?

I have to admit, I don't actually know anything about how to use Draco. The only thing I know is that there is a CMake option in Universal Scene Description which references an optional Draco plugin. Since I've been using my pandemic quarantine time to add some new packages to MacPorts, I decided "why not go ahead and package this Draco thing too"? The same with OpenSubdiv and USD. I have no idea how they work, but the only thing I know is that they are library dependencies for building Blender (which is my ultimate packaging goal).

I figured that if someone wanted to install the Draco MacPorts package, and then use the stuff in my package to develop some sort of JavaScript-based application that utilizes Draco in some way, that they might need the stuff in the javascript folder to get themselves started.

According to the MacPorts guide, and also from the MacPorts devs themselves, we should "Install Docs and Examples" (although that section is marked as "TODO", lol). So, in the spirit of providing as much of a "complete" package as possible, I thought that including the javascript folder would be a good idea, as part of an "everything including the kitchen sink" strategy. Is the stuff in that folder not intended for software developers to use? It's quite possible that I just don't know enough about Draco to be able to tell what is intended for developer use and what isn't, in which case I apologize.

To explain things a little: we only include the prebuilt javascript within the repository because it can be difficult to produce a working emscripten toolchain in some environments.

This is actually the case for MacPorts. As far as I can tell, MacPorts still does not have a working emscripten toolchain.

In general I am not opposed to the installation of the plugins as suggested, but I don't believe it should be a default behavior.

That's why I suggested adding the INSTALL_*_PLUGIN CMake options, which could act independently of the BUILD_*_PLUGIN options. By having them as options, you could easily set the default values to OFF so that they don't get copied over by default.

For the same reason we distribute the javascript tools: not everyone is a software developer with experience building C/C++ code. We believe a significant subset of users interested in the plugins would fall into that category, so we provide prebuilt binaries.

I agree, which again is why I think that adding the INSTALL_*_PLUGIN options, which would simply copy the precompiled plugins over to the locations I mentioned, would be a good idea. In fact, such an option would probably be a good, perhaps even better, alternative to the BUILD_* options for most users... for those people, as you said, who might not have experience building C/C++ code, but do still want to use the plugins.

Maybe it would also be a good idea to add an INSTALL_JS_TOOLS option as well, which would copy over the javascript tools folder?

tomfinegan commented 4 years ago

Yes, it is becoming commonplace.

I don't know that a single example makes it commonplace, but I don't know much about OpenSubdiv or other components included in USD and Blender's dependency chains. I guess I could agree with installing bun_zipper.ply and car.drc, but dumping the entire test set on non-developer machines doesn't really make sense.

Why would that be the case?

The examples you cited, and any dependency you add in your scripts on those files would be broken if the files referenced were renamed or removed.

If the testdata files are changed or removed from version to version, then shouldn't your documentation be updated to match?

In theory, yes. In practice, documentation is often overlooked, and simple examples like those in the README are easy to miss. Downstream packaging scripts outside of our repository would simply break and have to be fixed. None of these things is insurmountable, but if I were in a packager's shoes I wouldn't be relying on testdata in an upstream repository.

Anyway, I don't think this belongs in draco's install rule, but if you want to include a file as a post install test within your package I think you could use bun_zipper.ply and get away with it. I mentioned the breakage on your side as an example of where this could go wrong, but in reality bun_zipper.ply almost certainly isn't going anywhere.

The only thing I know is that there is a CMake option in Universal Scene Description

USD (and anything depending on draco via USD) doesn't need any of the plugins. In fact, the USD support behind BUILD_USD_PLUGIN in the draco repo doesn't actually produce a plugin-- the flag should probably be dropped, since all it really does is enable BUILD_SHARED_LIBS. The existing install rule when configured with -DBUILD_USD_PLUGIN=1 (or, more accurately, -DBUILD_SHARED_LIBS=1) is sufficient for the purposes of supporting USD.

The only thing I know is that there is a CMake option in Universal Scene Description which references an optional Draco plugin.

Regarding USD, I'm not sure how it's presently implemented, but our team assisted with the initial USD integration of draco. The USD build system downloads draco master at HEAD and builds from source unless configured otherwise. This appears to still be the case, see https://github.com/PixarAnimationStudios/USD/blob/master/build_scripts/build_usd.py#L1265.

From the USD side draco is consumed as a plugin, but that plugin build is part of USD's build system. It depends on the shared library output by draco's build system, and requires the path to draco's includes. That's the hand-off point between draco and USD.
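
Roughly speaking, the consumer side of that hand-off looks like the following generic sketch (not USD's actual build code; the target name and plugin.cc are placeholders):

# Hypothetical downstream CMakeLists.txt fragment.
find_path(DRACO_INCLUDE_DIR draco/compression/decode.h)  # draco headers
find_library(DRACO_LIBRARY draco)                        # shared library produced by draco's build

add_library(usd_draco_plugin SHARED plugin.cc)
target_include_directories(usd_draco_plugin PRIVATE "${DRACO_INCLUDE_DIR}")
target_link_libraries(usd_draco_plugin PRIVATE "${DRACO_LIBRARY}")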

According to the MacPorts guide, and also from the MacPorts devs themselves, we should "Install Docs and Examples" (although that section is marked as "TODO", lol). So, in the spirit of providing as much of a "complete" package as possible, I thought that including the javascript folder would be a good idea, as part of an "everything including the kitchen sink" strategy.

I think it's going a little too far once you begin installing things on a system that a user probably never needs. Again, the javascript tools are not useful to a desktop user. A web developer should be using the gstatic URLs, and someone using draco on the desktop should be using the compiled binaries yielded by the C++ targets. Other uses would seem to be cases where a developer is doing something unusual and knows what they're doing-- that subset of users is almost certainly working with the repo anyway, and installation isn't going to provide a benefit.

Is the stuff in that folder not intended for software developers to use? It's quite possible that I just don't know enough about Draco to be able to tell what is intended for developer use and what isn't, in which case I apologize.

The prebuilt stuff in the javascript folder is intended for users that aren't familiar with building software, or are unable to build the draco javascript targets for whatever reason. Those files aren't really useful on user systems.

The javascript/example subdirectory contains a README and examples for using the javascript in webapps. Installing the contents of javascript/example in $install_prefix/share/draco/javascript/example would be fine, but they won't work absent the full repository because of how the examples are written. To be clear, part of the reason for this is that these examples depend on the files in draco/javascript, but the example requires that the webserver be run from the root of the repository to function correctly, so installation is basically pointless.

That's why I suggested adding the INSTALL_*_PLUGIN CMake options

Why would these options be necessary if the default behavior for BUILD_*_PLUGIN were changed to add the build outputs to the install rules? As described above this wouldn't change USD, but when BUILD_*_PLUGIN really does yield a new target, the default behavior should probably be to install that target's output in $install_prefix/share/draco/plugins.
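
As a sketch of what that could mean (the target name and source list below are placeholders, not necessarily draco's actual plugin targets):

if(BUILD_UNITY_PLUGIN)
  add_library(dracodec_unity MODULE ${unity_plugin_sources})  # placeholder target/sources
  install(TARGETS dracodec_unity
          LIBRARY DESTINATION share/draco/plugins)
endif()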

jasonliu-- commented 4 years ago

To be clear, currently my MacPorts package would be intended to be used by both users and developers. But it's also quite common for package management systems to make available two different versions of a particular library or piece of software, e.g. draco and draco-devel. One package would be the more commonly used one that most end users/desktop users would install (which would maybe be just the lib files, plugins, minimal docs for how to use the plugins), and the -devel package would be installed if someone wanted to do development against the library (which would have lib files, plugins, complete developer docs, examples, test files, etc.). I would be perfectly willing to do this for draco if you felt like one single package couldn't be made that would be able to cater to both users and developers simultaneously. Obviously, you adding the ability to target these two groups from within the cmake files would make life easier for any packagers like me.

The examples you cited, and any dependency you add in your scripts on those files would be broken if the files referenced were renamed or removed.

Downstream packaging scripts outside of our repository would simply break and have to be fixed.

Again, my scripts don't actually do anything with the testdata files other than copying the entire folder and its contents into $install_prefix/share/draco/. So regardless of how the files get changed/deleted/added by you draco devs, the package won't break. In my mind, the testdata files are there to allow users to verify that the commands they just installed as part of the package run as expected (maybe these people would be considered developers, in your mind). When I suggested that you could directly use the CMakeLists files to install testdata files, my thinking was that you, the draco devs, would be explicitly telling any packagers or ~users~ programmers that come along "hey, you might find these files useful if you're writing code against draco". Draco is your baby, I'm just wrapping it in clothes. :)

One more thing I should mention: the MacPorts devs very strongly recommend that packagers create their package's spec files to only use tagged releases, and to never, ever, ever clone the source code directly from the master branch of a project's repo. (And yes, this also applies to -devel versions of a package, because most programmers developing against a library do still need a stable version to write their code against.) So unless you were to somehow change the testdata files in a tagged release that you already published, and then somehow re-release it using the same version number, then the testdata files shouldn't change, correct? (If you were thinking that developers would be writing their code against master, then lots of package management systems would have yet a third version of the package for such a scenario, named e.g. draco-unstable or draco-master or firefox-nightly, which would explicitly clone the master branch of the project's repo.)

The USD build system downloads draco master at HEAD and builds from source unless configured otherwise.

And I am indeed configuring it otherwise. The Unix-style philosophy of packages is to create packages for each component separately, and then make one package a dependency of the other. That's what I'm doing here, as well: create a package for Draco, and then create a package for USD which depends on the Draco package (I can mark the dependency optional if the Draco<->USD linkage is an optional piece). And as you said, the USD cmake files and MacPorts build system should be able to find any precompiled shared libraries that it needs, without needing to compile Draco from scratch.

but they won't work absent the full repository because of how the examples are written. To be clear, part of the reason for this is that these examples depend on the files in draco/javascript, but the example requires that the webserver be run from the root of the repository to function correctly

I see. I wasn't aware of that, thanks for the clarification. So does this mean that if I wanted to create a draco-devel package for programmers to develop against, that I should be doing an in-source cmake build, and then copying everything (including source files) into the install locations (I don't even know how to get cmake to do that, lol)? Or, is there some way to generate a portable "root" folder which could be deployed to an already-running web server? So that a coder who installs the package could be told "now copy over this folder and its contents into the public_html folder on your Apache/Nginx/etc. web server, and you can begin developing your webapp there"? Usually when I hear the words "example requires that the webserver be run from the root of the repository to function correctly", the first thought that comes to my mind is "does this imply that absolute paths are being hardcoded somewhere"... is that the case here, or do I just not understand how emscripten+javascript works?

Why would these options be necessary if the default behavior for BUILD_*_PLUGIN were changed to add the build outputs to the install rules? As described above this wouldn't change USD, but when BUILD_*_PLUGIN really does yield a new target, the default behavior should probably be to install that target's output in $install_prefix/share/draco/plugins.

Yes, I had suggested this in my original message as a possible change that might make more sense ("Overwrite the precompiled one(s) located in ${draco_src}/[unity|maya]", with the implication that the ${draco_src}/[unity|maya] folder would then get copied into the share/draco/plugins folder during the install phase). Adding the `INSTALL_*_PLUGIN` options was an alternative to this solution.

tomfinegan commented 4 years ago

To be clear, currently my MacPorts package would be intended to be used by both users and developers.

From my perspective this doesn't make a ton of sense. Up top you say this, but then below you say this:

to never, ever, ever clone the source code directly from the master branch of a project's repo

Why would a developer working on draco want a tagged release? A developer using/consuming draco could, but I don't see that as adding much value given the tiny subset of users that 1) work on things that depend on draco, and 2) use MacPorts.

packagers or ~users~ programmers that come along "hey, you might find these files useful if you're writing code against draco". Draco is your baby, I'm just wrapping it in clothes. :)

If I'm a developer and I want to work with or on $X I'm going to initiate a web search and look at the project page for $X. I'm not going to open a terminal and run 'port install $X' or 'port install $X-devel' and hope that the package is up to date and configured for my needs.

testdata files are there to allow users to verify that the commands they just installed as part of the package run as expected (maybe these people would be considered developers, in your mind)

That's a misconception. The testdata files are there for draco developers to verify the functionality of their changes during unit testing and before landing new changes. They are not provided for users to verify their installations.

I continue to disagree with installation of the testdata and javascript. It feels like we're just talking in circles here. I've mentioned that I'm flexible on installation of the plugins-- if you want to narrow the focus of this down to plugin installation that's fine, but as of right now I don't see anything in draco that's a technical blocker to your packaging needs. You do not need testdata to install the draco libraries and header files, and the existing install rule is sufficient for USD's requirements. The same goes for the javascript.

dblatner commented 3 years ago

Hey @jasonliu-- , @tomfinegan ,

the idea to make a MacPorts package is awesome! (As the alternative, brew, is not compatible if someone uses MacPorts.)

But I agree that it should include only the necessary C++ based libs, because in the end the "users" are also developers -> they want to install the package quickly and then use and integrate it with top-speed performance in their apps.

@jasonliu-- : what is missing to create a working MacPorts package of draco?

tomfinegan commented 3 years ago

@dblatner: Not much, if anything, but if you're interested in packaging draco I would hold off on doing so until the next release-- we're working toward one now, and the CMake and pkgconfig files are going to change between now and then.

For some background:

The bulk of the discussion in this issue is a long conversation about philosophical differences related to test data in the draco repo-- unless truly compelling reasons can be provided the draco install target will never install test data from the draco repo onto the user's system. So far such reasons have not been provided.

I understand the desire for a verification step during package build and installation: If a packager desires to run tests prior to installation on the target system I would suggest that any package recipe ensure googletest is downloaded alongside draco and that the draco tests are built and executed as part of the package recipe. Instructions for running the draco tests can be found in https://github.com/google/draco/blob/master/BUILDING.md#googletest-integration

The portion of this issue related to installation of optional plugin targets has already been addressed. When the plugin build(s) are enabled, the install step will install the plugin(s).