nothings / single_file_libs

List of single-file C/C++ libraries.

Question: Tool to make a single file lib out of a huge library? #149

Closed Photosounder closed 5 years ago

Photosounder commented 5 years ago

I've taken the idea of a single file library to a bit of an extreme: when you resolve all the includes you get a giant header file and an even more giant C file. While this is hugely convenient for me because it means that I can take everything in it for granted, it gets in the way of producing single purpose single file libraries from my code, or compact examples of code I want to demonstrate. For instance, let's say I want to make a single file TIFF loader. I'd have to manually put together not just the basic functions but also every other function, structure, define and global they require. My TIFF loader needs my raw file loading function, which needs my UTF-8 version of fopen; my endianness functions; my compression functions, which need both my bit stream reading function and my generic buffer functions, which in turn require my dynamic memory functions; and my pixel format conversion functions, which rely on a bunch of functions, many of which rely on my vector functions, and who knows what else.

So ideally there would be a tool (or compiler? Maybe some preprocessor can do the trick?) that lets me specify which base functions I need and then, by going recursively through all the functions they call, all the structs they use and whatnot, puts it all together into a single file that with little change could make a usable single file library. Is there already a way to do such a thing? I know that GCC can be used to resolve includes, but I don't know of any way to remove unneeded functions.
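
The recursive gathering described above amounts to a transitive closure over a dependency graph. Here is a minimal Python sketch of that one step, assuming the graph has already been extracted somehow (for example by a parser). Every symbol name below is hypothetical and hand-written for illustration; building the graph from real C code is the hard part and is not shown.

```python
from collections import deque

# Hypothetical dependency graph: symbol -> symbols it uses directly.
# In a real tool this would come from parsing the library, not from a literal.
DEPS = {
    "load_tiff": ["load_raw_file", "read_le32", "decompress_lzw", "convert_pixels"],
    "load_raw_file": ["fopen_utf8"],
    "fopen_utf8": [],
    "read_le32": [],
    "decompress_lzw": ["bitstream_read", "buffer_append"],
    "bitstream_read": [],
    "buffer_append": ["alloc_enough"],
    "alloc_enough": [],
    "convert_pixels": ["vec_mul"],
    "vec_mul": [],
    "draw_circle": ["vec_mul"],  # unrelated to TIFF loading
    "play_sound": [],            # unrelated to TIFF loading
}


def required_symbols(roots, deps):
    """Return every symbol reachable from `roots` by following `deps`."""
    needed = set()
    queue = deque(roots)
    while queue:
        name = queue.popleft()
        if name in needed:
            continue  # already visited; this also guards against cycles
        needed.add(name)
        queue.extend(deps.get(name, []))
    return needed


needed = required_symbols(["load_tiff"], DEPS)
unused = set(DEPS) - needed  # candidates to leave out of the single file
```

Whatever ends up in `unused` is what a stripping tool could drop; everything in `needed` has to be emitted, in an order that respects declarations.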

cwoodall-logi commented 5 years ago

Hey Photosounder!

I would think about using amalgamate (https://github.com/vinniefalco/Amalgamate), which basically merges your source files down into 1 or 2 files. It seems to work rather well and I have seen it used rather effectively for libraries such as mpack.

Otherwise if you wanted to write your own tool maybe look into python clang. I did some python clang based struct serialization generation functions before, quite a fun little project.

http://cwoodall.com/blog/2018/02/24/using-clang-and-python-to-generate-cpp-struct-serde-fns.html

Photosounder commented 5 years ago

Thanks, amalgamate might do nicely for putting everything into one file, although that's the easy part, removing the unneeded functions would take a bit more work. Doing something off clang might work but that's probably beyond what I'm capable of doing.

monolifed commented 5 years ago

This is what Nuklear uses: http://apoorvaj.io/single-header-packer.html

Photosounder commented 5 years ago

This looks better than Amalgamate (I tried that and wasn't impressed by the level of manual work that it takes), but once again it does nothing about removing unneeded code. Here's an example that I had to do manually: I used to include the massive (2.3 MB) two file GLEW in my library for convenience, which slowed down compilation by about 20 seconds and added bulk to my executables. So I made a list of the small subset of symbols I actually need from GLEW, manually kept track of what they depend on, and deleted everything I don't need, obtaining a much more reasonable 28 kB result (glew_minimal.c and glew_minimal.h). But I had to do it manually, which took time and effort. Something that could automatically figure out which functions or parts of code are unused would be amazing.

data-man commented 5 years ago

Try improved amalgamate (Python).

r-lyeh commented 5 years ago

You can also roll your own with C or even batch/bash in a few lines :D (example: https://github.com/r-lyeh/stdstring.h/blob/master/redist/amalgamate.c)
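
To illustrate how little a basic amalgamator needs, here is a minimal Python sketch (not the linked amalgamate.c, and not any existing tool's behavior). It inlines `#include "..."` lines recursively and pastes each file at most once. The file names and contents are hypothetical and held in a dict so the example is self-contained; it assumes every header is meant to be included only once and ignores include guards and conditional compilation.

```python
import re

# Matches local includes such as: #include "lib.h"
# System includes like <stdio.h> do not match and are kept as-is.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"\s*$')


def amalgamate(name, files, seen=None):
    """Return the text of `name` with known local includes inlined once each."""
    if seen is None:
        seen = set()
    if name in seen:
        return ""  # already pasted earlier; emit nothing the second time
    seen.add(name)
    out = []
    for line in files[name].splitlines():
        m = INCLUDE_RE.match(line)
        if m and m.group(1) in files:
            out.append(amalgamate(m.group(1), files, seen))
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical three-file library.
FILES = {
    "lib.c": '#include "lib.h"\n#include "types.h"\n#include <stdio.h>\n'
             'i32 add(i32 a, i32 b) { return a + b; }',
    "lib.h": '#include "types.h"\ni32 add(i32 a, i32 b);',
    "types.h": 'typedef int i32;',
}

result = amalgamate("lib.c", FILES)
```

Note that this only concatenates: every function in every included file still ends up in the output, whether it is used or not.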

Photosounder commented 5 years ago

(example: https://github.com/r-lyeh/stdstring.h/blob/master/redist/amalgamate.c)

This is simple but wouldn't work so well in some cases. But is there nothing beyond amalgamation? Does nothing strip unneeded code?

I vaguely started creating some parsing functions https://pastebin.com/hezNGXCm to group C code into symbols so that later it could be decided which are needed or not, but it will probably fall short of being good enough for most real world cases, not to mention that it wouldn't work for the mess that is C++.
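
For readers wondering what grouping C code into symbols can look like, here is a deliberately naive Python sketch. It is not the pastebin code, just an illustration under strong assumptions: it tracks brace depth, ends a chunk at a `;` at depth 0, and treats a `{ ... }` block as a function body when the text before the opening brace ends with `)`. It ignores comments, string literals, preprocessor directives and macros, which is exactly where real world code would break it.

```python
def split_top_level(source):
    """Split C source text into top-level chunks (declarations and definitions)."""
    chunks = []
    start = 0            # index where the current chunk begins
    depth = 0            # current brace nesting depth
    is_function = False  # does the current top-level block look like a function body?
    for i, ch in enumerate(source):
        if ch == "{":
            if depth == 0:
                # Heuristic: "int f(int a) {" ends with ')', "struct s {" does not.
                is_function = source[start:i].rstrip().endswith(")")
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0 and is_function:
                chunks.append(source[start:i + 1].strip())
                start = i + 1
        elif ch == ";" and depth == 0:
            chunks.append(source[start:i + 1].strip())
            start = i + 1
    return chunks


# Hypothetical input covering a typedef, an initializer, a definition and a prototype.
SAMPLE = """
typedef struct { int x, y; } point_t;
static int table[] = { 1, 2, 3 };
int add(int a, int b) { if (a) { b++; } return a + b; }
int sub(int a, int b);
"""

chunks = split_top_level(SAMPLE)
```

Each chunk would then still need to be mapped to the name it defines and the names it uses before anything can be stripped, which is the part a real parser such as Clang handles far more reliably.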

r-lyeh commented 5 years ago

but anyways... won't compilers strip unused code for us? :D

ps: for those ppl using vinniefalco's amalgamate, I added a few bits long time ago in this fork :o) https://github.com/r-lyeh-archived/Amalgamate/commits/master

Photosounder commented 5 years ago

Yes, compilers do that, except they won't give you stripped C code, at least not as far as I know. The idea is to either be able to do something like the minimal version of GLEW I mentioned earlier, or, when presenting code that relies on my huge library, to be able to show only the few tens or even hundreds of lines of code that are relevant and leave out the tens of thousands of lines that are irrelevant. As I explained in the OP, imagine trying to make something like a simple TIFF loading library out of my current library: how do I do that except by manually looking recursively for every function and structure needed?