grantjenks / py-tree-sitter-languages

Binary Python wheels for all tree sitter languages.

How to tell what versions of included languages were used? #1

Closed Akuli closed 2 years ago

Akuli commented 2 years ago

It would be useful to see which commits of each included language repo were used in the build. Currently the only way to check this is to find the latest commit in the language repo's master or main branch at the time when py-tree-sitter-languages was built, and hope that they didn't force-push.

Knowing the commit hashes that were used would also be useful if one of the language repos gets compromised. In that case I would like to know whether the latest build of tree-sitter-languages included the malicious version of the language repo. This is IMO especially important for a project that bundles lots of dependencies, as malicious code in any one of them makes py-tree-sitter-languages contain that malicious code in a way that can't be fixed by simply deleting a package's version from PyPI.

grantjenks commented 2 years ago

Yes, I agree that would be a good feature.

I think it’s fairly simple to implement. If there were a table of repos, versions, and SHAs, then the build.sh script could simply clone each repo at its listed version and check that the checked-out commit matches the listed SHA.
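For illustration, here is a rough Python sketch of that idea. It is not code from this repo: the table format (one `repo version sha` row per line), the names `Pin`, `parse_pins`, `check_sha`, and `clone_and_check`, and the choice of Python rather than shell are all assumptions made for the example.

```python
import subprocess
from typing import NamedTuple


class Pin(NamedTuple):
    repo: str     # clone URL of the language repo
    version: str  # tag or branch name to check out
    sha: str      # full commit hash expected at that version


def parse_pins(text: str) -> list[Pin]:
    """Parse whitespace-separated 'repo version sha' rows (hypothetical format).

    Blank lines and lines starting with '#' are skipped.
    """
    pins = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        repo, version, sha = line.split()
        pins.append(Pin(repo, version, sha.lower()))
    return pins


def check_sha(pin: Pin, actual: str) -> None:
    """Raise RuntimeError if the checked-out commit is not the pinned one."""
    if actual.strip().lower() != pin.sha:
        raise RuntimeError(
            f"{pin.repo} at {pin.version}: expected {pin.sha}, got {actual.strip()}"
        )


def clone_and_check(pin: Pin, dest: str) -> None:
    """Shallow-clone the repo at the pinned tag or branch, then verify HEAD."""
    subprocess.run(
        ["git", "clone", "--quiet", "--depth", "1",
         "--branch", pin.version, pin.repo, dest],
        check=True,
    )
    head = subprocess.run(
        ["git", "-C", dest, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    check_sha(pin, head)
```

A build script could call `clone_and_check` for every row returned by `parse_pins` and abort on the first `RuntimeError`. Comparing against a full commit hash, rather than trusting the tag or branch name alone, is what catches a force-push or a moved tag.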

How does that sound? Not really a priority for me right now so pull request welcome.

Akuli commented 2 years ago

I actually implemented something like that for my project, only to find out that I couldn't get it to work and I basically had to trust your builds instead :) I'll make a pull request.