tree-sitter / tree-sitter-go

Go grammar for tree-sitter
MIT License
310 stars 65 forks source link

adding make support #56

Closed mattmassicotte closed 2 years ago

mattmassicotte commented 2 years ago

This change adds a Makefile, and bindings to more easily build static/dynamic libraries for use in C-based languages. This is basically my first time making a Makefile. So, I just copied the one from the runtime and modified it. It was really close to what was needed.

This comes from some discussion I had in the main tree-sitter repo with @maxbrunsfeld.

One additional thing I want to bring up is that I also modify bindings.gyp. Specifically, I include the following:

  'xcode_settings': {
      'OTHER_CFLAGS': [
        "-arch x86_64",
        "-arch arm64",
        "-mmacosx-version-min=10.13",
      ],
      "OTHER_LDFLAGS":[
        "-mmacosx-version-min=10.13"
      ]
  },

This does two things. First, builds the library for both architectures supported by macOS. It also establishes a minimum OS version for the linker, which quiets warnings. (I suspect that tree-sitter is compatible with much earlier versions of macOS, but I've only tested down to 10.13.)

I'm not 100% sure this is possible to address via a Makefile, as I was unable to find a way to control these settings via environment variables. However, I know nothing about the npm build process, and just figuring out this part required a lot of Googling and experimentation. I just wanted to bring this up as it is related, and possibly something you'd prefer to address here.

mattmassicotte commented 2 years ago

Ok, changing to a plain compile step is excellent. Not only does this remove the NPM dependency, it also makes it trivial to customize the build as needed. I can now build a fat library for macOS like this:

CFLAGS="-arch arm64 -arch x86_64 -mmacosx-version-min=10.13" LDFLAGS="-mmacosx-version-min=10.13" make

This is great! It makes working with this parser libraries so much easier.

mattmassicotte commented 2 years ago

hmm, I just investigated some other grammars, and discovered that at least one (ruby) has a C++ source file. This Makefile does not work right in that case. Going to look into that a little closer, as I assume a standard file across all grammars is best.

mattmassicotte commented 2 years ago

Ok, this updated version compiles C++ files, and only if there actually are some, links libc++. I looked through a bunch more grammars, and some do use C++, but not all. Seemed annoying to require the extra link if it isn't necessary.

This setup should work for other grammars without modification, but I haven't looked through them all, so there might still be differences I haven't accounted for yet.

Sidenote: make is really complicated

maxbrunsfeld commented 2 years ago

Oh, and one other thing. Could you just briefly check the difference between -O2 and -O3 for this parser? I think I may have used -O2 in the past, because I observed that the runtime performance was the same, but the compile was faster. But that may no longer be accurate. I think just testing the runtime speed on one large source file would be thorough enough. Thank you!

mattmassicotte commented 2 years ago

Sure. How do I run the benchmarks? It's also worth checking -Os. That's been the default "production" setting for macOS for a long time.

maxbrunsfeld commented 2 years ago

I don't have a specific way to do benchmarking. Maybe just a quick comparison in your own app, however you're consuming these build artifacts? I would just time the parse of one big file. How about, if there's any measurable difference between -O2 and -O3 (and the compile time isn't annoyingly long), we do -O3, and otherwise -O2?

I'm open to -Os too; I just don't know whether it's available in all C compilers.

cfroystad commented 2 years ago

Regarding using -O3 or -O2, I found this talk informative (and somewhat sad). From their research, they conclude that the -O3 and -O2 performance difference is mostly indistinguishable from noise - at least for their chosen statistical significance level. (link to paper). Note that the paper is from 2009 and the talk from 2019, so things might have improved

maxbrunsfeld commented 2 years ago

I agree with Doug’s comments about CFLAGS. Could you make that tweak @mattmassicotte? Thanks for tackling all of these changes 🙌

mattmassicotte commented 2 years ago

@maxbrunsfeld yes, will address the suggestion. It makes a lot of sense.

As for benchmarking, I'm really not in a position to do what you asked easily. My use of tree-sitter never parses a file from beginning to end flat-out, and I have so many other optimizations piled on top of the core library that I'm afraid the benchmarking I do have won't apply. I propose that we keep the Makefile matching the runtime's optimization level. What do you think?

mattmassicotte commented 2 years ago

@maxbrunsfeld @dcreager flags updated!

maxbrunsfeld commented 2 years ago

Ok, if profiling it is onerous, I'm fine with just using -O3 for now, we can always come back and change that. Thanks for setting this up @mattmassicotte!

mattmassicotte commented 2 years ago

Thanks so much everyone! This is a huge quality-of-life improvement for using the system.