Open Bilal2453 opened 2 years ago
Hi, thank you for this PR and for your patience. I am going to go through it one step at a time so that I understand it.
Overall I think it looks okay. But this is honestly outside of my understanding. I would legitimately have to rewrite all of lit to figure out how lit and git work.
The single file package problem is unfortunate.
Can you open issues or PRs for the potential bugs that you found? I think they are worth addressing.
The single file package problem is unfortunate.
It indeed is, the implementation does its best to solve it but, that is as far as it is possible with Git.
But this is honestly outside of my understanding.
So it is for almost anyone frankly, no one fully understands how Lit (except @creationix perhaps) works, so goes to Git, nor I expect anyone to spend two days reading Lit's source code and Git docs – like I have. Therefore I was thinking of reviewing this PR using its behaviour, heavily testing it to ensure it is behaving, and deciding upon that how the PR should be proceeded.
Although that of course won't tell us anything beyond "Is it working or not", which is sad.
Can you open issues or PRs for the potential bugs that you found?
Surely, I will be glad to fix those and continue working on this PR. Hopefully that will be on the next Sunday.
Maybe you can write a test file for this? We have limited testing on lit. Always room for more.
Nice work! I hope to be able to review this later.
A way of specifying a commit, or a branch, or even a tag is mandatory. I although have been very conflicted over this and over how I will be implementing it.
The convention used by npm is url#sha1
or url#tag
. We can use the same here.
On single-file packages, the git data model supports this only via annotated tags. When you publish to lit, the object is literally a git tag object. Unlike commit objects that can only point to a tree, tags can point to anything (commits, trees, or blobs) You may be able to fetch the tag via the git
CLI tool and then read the tag and find the blob using the local git files after fetching.
Internally implement Git protocol over HTTPS (smart HTTPS)
I've done this before in JavaScript, it's not too terrible. The hardest part is reading the packfile which I seem to remember we already have in the git implementation in lua.
Handle Git submodules (?).
This would be great, what use cases do you have in mind? Is the goal to treat submodules as symlinks basically so their contents are in-place? This could be a way to manage recursive dependencies using only git since luvit's require resolution algorithm looks for libs
and deps
locally first.
The convention used by npm is url#sha1 or url#tag. We can use the same here.
I will be using something similar here as well then.
On single-file packages, the git data model supports this only via annotated tags. When you publish to lit, the object is literally a git tag object. Unlike commit objects that can only point to a tree, tags can point to anything (commits, trees, or blobs) You may be able to fetch the tag via the git CLI tool and then read the tag and find the blob using the local git files after fetching.
Right, I knew that much. The problem here is the Git repo owner has to create a tag for each release manually and point it to the right blob, sounded to me quite inconvenient, will see what I can do about this later on.
I've done this before in JavaScript, it's not too terrible. The hardest part is reading the packfile which I seem to remember we already have in the git implementation in lua.
Nice. I think this should be my next task here, maybe before that make some tests to ensure default behavior isn't changed. I think it should be fairly straightforward.
This could be a way to manage recursive dependencies using only git since luvit's require resolution algorithm looks for libs and deps locally first.
Yep, that is what I had in mind. A way to basically do recursive dependencies. I can see that being quite useful if the said repo wasn't published to Lit or something along those lines, then you can just have it as a Git submodule. But at the same time I also have some serious concerns.
It will definitely complicate everything. Having to handle how submodules themselves work, make sure we don't run into recursive loops but yet handle legitimate recursive deps, makes it much harder to track what is currently installed, etc.
Dependency hell . This may very much create a dependency hell where some submodule depend on a ton of others that in turn depend on more. Some of which may already be locally installed, but we will have to install it yet again to satisfy the package expectations.
Version conflicts. I can imagine a scenario where some package wants a specific commit of a Git submodule, when there is a locally installed version of that package with a different version. Or two packages that are a git submodule dependency each want their own version of some package.
I will be holding off on this for now. But even in the future, I am not sure how should this look like. Seems like a quite complicated feature that may very well end up un-used/over-used.
cc: @creationix
The Pull Request
This is an initial PR of many planned PRs aiming at polishing the Lit experience. In this PR I wrote the basics of getting a
lit install https://github.com/author/repo
command working, as well as refactoring some of the code, and documenting it. I will go through every little change that I have made, and the expected behavior of such changes, as well as potential bugs that I spotted. But first I will explain how to use this implementation.How to use it
I wanted to keep this implementation as simple as it could be, to not confuse the end-user and keep it familiar to them. Therefor my choice was to use Git's URL schemes, similar to how
git clone URL
handle it (and similar to valueslit make
currently accept). Some examples of the command:lit install git@github.com:SinisterRectus/discordia
(over ssh)lit install https://github.com/truemedian/rethink-luvit
(over https)lit install https://github.com/Bilal2453/discordia-replies.git
(over https, single-file package)All of the above commands are expected to work. It will fetch the repo tree, calculate the dependencies, install the package correctly and then install the package's dependencies; Even if such dependencies are Git repos too. For example, a possible package.lua:
It is as well expected to work with any Git upstream, in the above example GitLab and GitHub are used.
How it works
While in the process of making this implementation small and simple, I decided to reuse most of the Lit logic regarding Git DB, this implementation will use the Git CLI to fetch the given repo into Lit's DB, then it queries the package in-db (just like Lit already does), then marks the package for installation. The installation process itself is untouched, the behavior from this point is identical to what Lit already does.
Single-file packages though, those have been very problematic to implement since you basically cannot tell a service (GitHub for example) to create tag pointing to the desired blob (how Lit currently does it), making us unable to determine which blob in the tree is the package file. I've been in front of two choices, either I don't implement it, or I search the tree for a Lua file, then if and only if the tree only has a single Lua file, use that as the package. I've decided that "something is better than nothing" and went ahead with latter method, I will have to document this behavior really well, to make sure the end-user will expect this.
Not yet implemented
TL;DR:
Handle Git submodules (?).
As you might have noticed, this make use of
git
binary, that decision was for three reasons, first islit make
already making use of that, second for simplicity and to match the friendly behavior of Git as much as I could, and third me being limited on time. I am planning to implement the Git protocol internally over HTTPS, this will be soon in a future PR.This implementation will fetch whatever latest default branch is. A way of specifying a commit, or a branch, or even a tag is mandatory. I although have been very conflicted over this and over how I will be implementing it. Since we are supporting all Git accepted URL schemes I am afraid that something similar to
lit install URL@tag
is going to conflict with the repo URL, and since we also have the SCP-like syntaxgit@host:
being also used. I basically am not sure what syntax to use for specifying a version(a commit hash, tag, branch, etc), and afraid of introducing a syntax that could potentially conflict with what Git expects/accept. I've thought about implementing aninstall-git
command just to solve this problem (with something likelit install-git repo version
, but I will leave deciding this up to you, or to future me.Lastly, Git submodules. As you can see from the commit history of this PR, I used to actually have
--recurse-submodules
thinking that was everything into handle submodules... apparently no, handling submodules requires a very specific implementation that I didn't even find at the Git docs, the only mention of it I found was this answer on StackOverflow explaining how it is implemented. It is not terribly hard but it also isn't a straightforward, being tide up to time, I decided to not implement it in this PR, although in future PRs I will be trying to... and perhaps failing.Potential bugs in Lit
While trying to understand how Lit works, I've had multiple "are you sure?" moments. None of which I fixed or even tried to. Here are ones I remember:
calculate-deps.lua. This indicates that
lit install author/
is allowed. And indeed when I tried it withlit install SinisterRectus/
it tried to do... something, not quite sure what it is trying to do (probably install first-to-match package) but it quickly fails with a somewhat unexpected error. Will be opening an issue about it.calculate-deps.lua. As I understand, this will try to process the dependencies of the package... what happens if two packages lists each other as a dependency? I haven't confirmed this but from what I am reading... it would be stuck in a recursive loop. No checks to prevent this from happening are provided.
core.lua. The field name has a typo, and this field is never used anywhere. Correcting it should break nothing...?
core.lua. I wanted to match Git's accepted scheme for fetch, and found this in core... although I could just take the pattern part and leave the handler one, the name of this field does not imply it is related to
lit make
nor I think it makes sense to expose it as such.Changes overview
Here I will list most behavior changes, additions, or deletions I've made to the existence code.
db.offlineLoadAny
. Because rdb patchesdb.load
, it makes it impossible to usedb.loadAny
locally without triggering the upstream matching. This solves it by re-implementing loadAny but with theoffloadLoad
reference.pkg.queryGit
. It is pretty similar toqueryDb
except in that it is offline only, and handles single-file packages differently (as mentioned previously).pkg.queryGit
, and to addhash
as a return ofqueryDb
.addDep
&processDeps
logic is merged together intoresolveDeps
.resolveGitDeps
that handles git calculations (and fetching).Those should be all of the changes introduced in this PR, hope I missed nothing. There should be 0 change over how Lit currently work. If you observe any, it is a bug.
Tests
To test this implementation I've made a really small repo at https://github.com/Bilal2453/lit-git-test/ which should be enough to show any odd behavior, it tests old dependencies resolution behavior, single-file package over Git, using multiple Git upstreams (GitHub and GitLab in this test), and rule excluding. Overall, it makes use of 3 upstreams, GitHub, GitLab and Lit.
Finally, feel free to give me your feedback, review the code and test it, and contribute to the implementation. I hope this is up to the task of making Lit even better.