Closed mattmassicotte closed 2 years ago
Ok, changing to a plain compile step is excellent. Not only does this remove the NPM dependency, it also makes it trivial to customize the build as needed. I can now build a fat library for macOS like this:
CFLAGS="-arch arm64 -arch x86_64 -mmacosx-version-min=10.13" LDFLAGS="-mmacosx-version-min=10.13" make
This is great! It makes working with this parser libraries so much easier.
hmm, I just investigated some other grammars, and discovered that at least one (ruby) has a C++ source file. This Makefile does not work right in that case. Going to look into that a little closer, as I assume a standard file across all grammars is best.
Ok, this updated version compiles C++ files, and only if there actually are some, links libc++. I looked through a bunch more grammars, and some do use C++, but not all. Seemed annoying to require the extra link if it isn't necessary.
This setup should work for other grammars without modification, but I haven't looked through them all, so there might still be differences I haven't accounted for yet.
Sidenote: make is really complicated
Oh, and one other thing. Could you just briefly check the difference between -O2
and -O3
for this parser? I think I may have used -O2
in the past, because I observed that the runtime performance was the same, but the compile was faster. But that may no longer be accurate. I think just testing the runtime speed on one large source file would be thorough enough. Thank you!
Sure. How do I run the benchmarks? It's also worth checking -Os
. That's been the default "production" setting for macOS for a long time.
I don't have a specific way to do benchmarking. Maybe just a quick comparison in your own app, however you're consuming these build artifacts? I would just time the parse of one big file. How about, if there's any measurable difference between -O2
and -O3
(and the compile time isn't annoyingly long), we do -O3
, and otherwise -O2
?
I'm open to -Os
too; I just don't know whether it's available in all C compilers.
Regarding using -O3 or -O2, I found this talk informative (and somewhat sad). From their research, they conclude that the -O3 and -O2 performance difference is mostly indistinguishable from noise - at least for their chosen statistical significance level. (link to paper). Note that the paper is from 2009 and the talk from 2019, so things might have improved
I agree with Doug’s comments about CFLAGS. Could you make that tweak @mattmassicotte? Thanks for tackling all of these changes 🙌
@maxbrunsfeld yes, will address the suggestion. It makes a lot of sense.
As for benchmarking, I'm really not in a position to do what you asked easily. My use of tree-sitter never parses a file from beginning to end flat-out, and I have so many other optimizations piled on top of the core library that I'm afraid the benchmarking I do have won't apply. I propose that we keep the Makefile matching the runtime's optimization level. What do you think?
@maxbrunsfeld @dcreager flags updated!
Ok, if profiling it is onerous, I'm fine with just using -O3
for now, we can always come back and change that. Thanks for setting this up @mattmassicotte!
Thanks so much everyone! This is a huge quality-of-life improvement for using the system.
This change adds a Makefile, and bindings to more easily build static/dynamic libraries for use in C-based languages. This is basically my first time making a Makefile. So, I just copied the one from the runtime and modified it. It was really close to what was needed.
This comes from some discussion I had in the main tree-sitter repo with @maxbrunsfeld.
One additional thing I want to bring up is that I also modify bindings.gyp. Specifically, I include the following:
This does two things. First, builds the library for both architectures supported by macOS. It also establishes a minimum OS version for the linker, which quiets warnings. (I suspect that tree-sitter is compatible with much earlier versions of macOS, but I've only tested down to 10.13.)
I'm not 100% sure this is possible to address via a Makefile, as I was unable to find a way to control these settings via environment variables. However, I know nothing about the npm build process, and just figuring out this part required a lot of Googling and experimentation. I just wanted to bring this up as it is related, and possibly something you'd prefer to address here.