taskolib / libgit4cpp

C++ wrapper for libgit2 with limited functionality
https://taskolib.github.io/libgit4cpp/
GNU Lesser General Public License v2.1
1 stars 0 forks source link

Feature/remotes #18

Closed alt-graph closed 5 months ago

alt-graph commented 5 months ago

This MR adds functionality for dealing with git remote repositories and for pushing refs from a local repository to a remote one. It disables the previous, incomplete implementation of pull() and related functionality.

The core is a new Remote class that encapsulates functionality for dealing with a git remote repository. The new GitRepository member functions add_remote(), get_remote(), and list_remotes() make use of it. In addition, the previous implementation of push() is made more functional by allowing the user to specify both a remote and a refspec (saying what should be pushed where).

As a drive-by cleanup, all files generated by unit tests are now collected under unit_test_files/.

alt-graph commented 5 months ago

@wyrnat, @Finii, @MissPeppermintPatty: Does anyone have time to review this?

Finii commented 5 months ago

@wyrnat, @Finii, @MissPeppermintPatty: Does anyone have time to review this?

I can do sometime in the next 48h

Finii commented 5 months ago

I'm quite opposed to this commit

https://github.com/taskolib/libgit4cpp/pull/18/commits/978cb8c743e312003c3a4c2c84b9e94c6596f035 Remove unnecessary namespace qualifiers

which does (almost only) changes regarding Error like:

-throw git::Error{ "Git init failed" };
+throw Error{ "Git init failed" };

We have an Error class in all the libraries, which is imho already error prone, but at least we give the namespace to make clear what it is. In the actual server you might juggle doocs::Error, git::Error, and task::Error and I find that too confusing and hostile for grepping. It might be superfluous here (because this is the lowest in the chain), but in principle I would keep the namespace specifically for all the different Error objects, because they are too similar jet dissimilar - dropping the namespace raises instead of lowers the mental burden in my opinion.

Finii commented 5 months ago

Ah, I forgot, branch_remote_name and thus

is not directly needed anymore, but I believe the error handling introduced there should still prevail / be added. This PR adds a lot of unused return values. ;)

alt-graph commented 5 months ago

I'm quite opposed to this commit

978cb8c Remove unnecessary namespace qualifiers

which does (almost only) changes regarding Error like:

-throw git::Error{ "Git init failed" };
+throw Error{ "Git init failed" };

We have an Error class in all the libraries, which is imho already error prone, but at least we give the namespace to make clear what it is. In the actual server you might juggle doocs::Error, git::Error, and task::Error and I find that too confusing and hostile for grepping. It might be superfluous here (because this is the lowest in the chain), but in principle I would keep the namespace specifically for all the different Error objects, because they are too similar jet dissimilar - dropping the namespace raises instead of lowers the mental burden in my opinion.

I understand your point, but at least for libgib4cpp it does not make sense to me. The project has no knowledge of Taskolib or DOOCS and there are no conflicting Error classes anywhere. IMO repeating the namespace qualifier just adds visual noise and makes the code harder to read.

alt-graph commented 5 months ago

@Finii: I'll try to break your last comment apart to have threads that we can work on.

alt-graph commented 5 months ago

At least it should be called get_all_remotes() or similar.

I can live with get_all_remotes() as well, but I generally prefer using the most specific verb available instead of "get" if possible. So I still find list_remotes() quite descriptive, more compact, and therefore a tiny bit better than get_all_remotes(). Does anyone else have an opinion on that?

alt-graph commented 5 months ago

In that case we do not need the Remote class at all Yes. The need for that class is indeed a question.

I see it as a way of making the access to remote-repo related settings easier. In any case I prefer not to have to deal with the libgit2 C API directly.

Anyhow, a Remote object by itself is not too helpful as it does not even remember which repo it belonged to. If I want to handle multiple repos and their remotes that becomes a hurdle. I as user need to keep the association myself somewhere. I would not need to do that if the remote is an integral part of the repository object.

If the "back-link" to the parent repo is really important to you, we can add a member function for that. I just saw no need for it so far because we always have the original repo around anyway?

And then, why do we have the members AT ALL?


class Remote
{
/// Return a non-owning pointer to the underlying git remote object.
git_remote* get() const { return remote_.get(); }
std::string get_name() const { return git_remote_name(remote_.get()); }
std::string get_url() const { return git_remote_url(remote_.get()); }

private: LibGitRemote remote_{ nullptr, git_remote_free }; };


That is fair enough, but it will crash if `git_remote_name()`/`git_remote_url()` return a null pointer, as they can according to the libgit2 docs. I would actually prefer this version over the current "caching" scheme. I still find it better than throwing the bare libgit2 C API at folks.
Finii commented 5 months ago

Unfortunately we are now stuck with a thread-less conversation that can not be resolved?

At least it should be called get_all_remotes() or similar.

I can live with get_all_remotes() as well, but I generally prefer using the most specific verb available instead of "get" if possible. So I still find list_remotes() quite descriptive, more compact, and therefore a tiny bit better than get_all_remotes(). Does anyone else have an opinion on that?

I just think

Finii commented 5 months ago

I see it as a way of making the access to remote-repo related settings easier. In any case I prefer not to have to deal with the libgit2 C API directly.

:+1:

I just saw no need for it so far because we always have the original repo around anyway?

Well, usually 'is around' will prose problems later on. We can leave it as it is and add a backlink later on.

That is fair enough, but it will crash if git_remote_name()/git_remote_url() return a null pointer

Well, that might be prevented (like in the safe_string case above). And also we would need to include the case where remote_ is in fact nullptr.

prefer this version over the current "caching" scheme.

After some thought maybe we should return string_view instead for cases where no string is needed, e.g. comparisons with a string literal.

I still find it better than throwing the bare libgit2 C API at folks.

:+1:

Finii commented 5 months ago

IMO repeating the namespace qualifier just adds visual noise and makes the code harder to read.

image

The visual and typing clutter by prepending all files in tests/ with test_ is also considerable :grimacing: :grin:

It's include/*/Remote, src/Remote, and tests/Remote ah, no wait, tests/test_Remote :disappointed: At least for people working on the command line this is a hassle.

Not that this has anything to do with this PR.

alt-graph commented 5 months ago

The visual and typing clutter by prepending all files in tests/ with test_ is also considerable 😬 😁

It's include/*/Remote, src/Remote, and tests/Remote ah, no wait, tests/test_Remote 😞 At least for people working on the command line this is a hassle.

I agree with that.

alt-graph commented 5 months ago
* There might be users that just want a list of names, maybe to show in their GUI or something. Of course they could use the vector of `Remote`s and transform it into a vector of strings. The question is what is the more common use-case? And/or should we not support both? I can see use-cases for the string vector but I fail to see any for a `Remote` vector. Typically users would filter that again to find that one remote they are interested in?

Well, I have two remotes on several repositories and I have always found the output of git remote to be lacking information. Knowing that there are two remotes called "origin" and "github" is not as useful as getting a list of names and URLs as done by git remote -v. I guess that is why I prefer to return a list of objects and not just strings – the names are simply not enough information about a remote.

* Regarding the name, you are right `get` is used far too much. What about `fetch_remotes`? 

I would rather avoid that. "fetch" has a well-defined meaning in git-land.

list is in my eyes a more basic operation, and IF we have two functions (one for strings and one for objects), list should probably be for the strings?

I have a different take on that, I guess. IF we have two functions, I would prefer list_remotes() to return a list of Remote objects and list_remote_names() to return a list of remote names.

* Maybe I just fail to see how you intend to use `list_remotes()`. I can hardly see a use-case for a string list. What would someone do with `Remote` objects to all configured remotes? Except filtering out ONE of them by name - which is in fact just `get_remote()`

Show them to a user so they can select the one they want. And for that, knowledge of the URL is important.

Finii commented 5 months ago

Your last comment about git remove -v is also true.

Thanks for considering all my confuse suggestions.

I would crate a PR now that drops the Git from GitRepository, to change that asap, because that is a major break?