Should we commit generated files?

LegalizeAdulthood commented 6 years ago

If lots of the files down in Lib/Chip are generated automatically from chip descriptions and Kvasir-specific extensions, should we be committing those files to the repo?

You might ask: why does it matter?

Cons to committing generated files:

I attempted to open the Kvasir repository in an IDE and the symbol indexing tools immediately started chewing on the ~770 MB of files in Lib/Chip, ground away for a long, long time and then eventually crashed my IDE due to resource exhaustion. That is one negative consequence of having all the generated files in the repo.

Another downside to committing generated files is that when we make changes to the code generator, it becomes necessary to regenerate all the generated files and commit them. This is non-trivial as I don't think the source locations for the chip description files are committed to the Kvasir repository. For instance, I'm going to work on a PR to switch from unsigned short and unsigned to std::uint16_t and std::uint32_t and that will leave all the generated files stale.
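To illustrate why the type switch mentioned above touches every generated file, here is a minimal sketch, not Kvasir's actual generated code. The alias `ExampleRegisterValue` and the helper `bits_of` are hypothetical names introduced only for this example. It shows that the C++ standard guarantees only minimum widths for `unsigned short` and `unsigned`, while `std::uint16_t` and `std::uint32_t`, where the implementation provides them, have exactly the stated width.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint16_t, std::uint32_t

// Hypothetical helper: number of bits in the object representation of T.
template <typename T>
constexpr std::size_t bits_of() {
    return sizeof(T) * CHAR_BIT;
}

// The built-in types only promise a minimum width, so the actual width
// depends on the target (unsigned is 16 bits on some small MCUs).
static_assert(bits_of<unsigned short>() >= 16, "unsigned short is at least 16 bits");
static_assert(bits_of<unsigned>() >= 16, "unsigned is at least 16 bits");

// The fixed-width aliases, when provided, are exactly as wide as named.
static_assert(bits_of<std::uint16_t>() == 16, "uint16_t is exactly 16 bits");
static_assert(bits_of<std::uint32_t>() == 32, "uint32_t is exactly 32 bits");

// Hypothetical value type a generated 32-bit register description might use.
using ExampleRegisterValue = std::uint32_t;

// Width of the hypothetical register value type, in bits.
constexpr std::size_t example_register_bits() {
    return bits_of<ExampleRegisterValue>();
}
```

Because the width is part of the type name, a generated register description written this way means the same thing on every target, which is why such a change would have to be propagated through all generated files.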

Pros to committing generated files:

It allows people to find that Kvasir supports their favorite chip by doing a part number search.

It allows people to start using Kvasir out-of-the-box if their chipset is already in the repo.

odinthenerd commented 6 years ago

Yes, I agree this is a problem. Ideally, the user would be able to generate the correct files on the fly for the chip they are using. Committing the files basically only saves us the dependency on Python, which would otherwise be our only dependency, but I agree now that the dependency is worth it. I would like to hook the chip file generation into the Conan package manager (handy that it's also Python) so that the user still doesn't have to understand how that works. Although we arguably only have brave power users now, the goal is to target beginners with Kvasir, so we need to keep it easy to use.

LegalizeAdulthood commented 6 years ago

I've been thinking more about this and here are some ideas. If I'm using Kvasir on a production project, chances are I only care about one or perhaps a few of the chip definitions. The rest just get in my way. In a simple evolution from our current situation, I could tell CMake which chip(s) I care about at project generation time so that CMake includes only those definitions in my generated IDE project. This would avoid the IDE crash/load-time problem I described earlier in this issue.

Going further, we could have the Kvasir repository contain only the tools for defining chip HALs (plus any hand-made HALs like the GBA one?) and have CMake invoke the generation script on the input chip descriptions to generate the HAL files for the chip. This means you'd need access to the chip descriptions. Is there any standard for how these chip descriptions are currently made available? Is there a web service you can query by part name to get descriptions? If there is nothing standard, then we've just shifted the problem from having to check out many HALs to checking out many chip descriptions. I have no idea if that is more compact than the current 770 MB of HAL source files. However, it still doesn't narrow things down to only what I care about as a developer. The extra files are a distraction and clutter. I like to be tidy in my software.

Is it useful to clump HALs by chip family and/or manufacturer and have them be separate packages or repositories? Conan might be a useful way to split things up; git submodules and some CMake glue may also be a solution, although that feels more homebrew than Conan.

odinthenerd commented 6 years ago

I think conan is the way to go, I just have to do the leg work ;)

kvasir-io / Kvasir

Should we commit generated files? #116