Closed dsager closed 9 years ago
I was afraid of seeing this request arrive. I started the script without thinking of sharing it, so at the beginning it was just the `folder | url` syntax. Then I extended it because I needed upstream URLs for some projects: it became `folder | url | upstream`. I knew this was a bad solution but, who cares, it was just a script for myself. So when I started sharing it, I knew this request would arrive one day.
As far as I know there will be four problems:
The file format and its parsing
The most complicated part. I chose a really simple file format to simplify the parsing, because bash is not a pleasant language to work with. Keeping a line-based configuration file is not a good solution if we want to allow more than two remotes that can be named. So a new file format will be needed, which is not really a problem, but the parser that comes with it is.
Working with remotes internally
Hmm, this will not be too difficult: there is currently one function that deals with this, `git_check_branch_origin`, and it can easily be adapted to take the remote as an argument.
Displaying all the information
Will the information be displayed? Or is it just to set up the remotes correctly? If it is just to set up the remotes, we can imagine a format like `folder | url | name | name_url | name2 | name2_url [...]`, which permits a really simple configuration file for standard users and allows advanced users to set more remotes. If the goal is to display information, like which branches are synced and which are not, we need to rethink the display. I already have to scroll to see all the information about my repositories, so too much information could become tedious.
Also, I must say that the display part is not the best-written part of this script.
Speed concerns
I have nearly 40 repositories in my config, and it takes nearly 600 milliseconds on a decent computer to show the status. On my home computer it takes even longer. Complicating the config format means more parsing, which will increase the overall time needed for all operations.
This enhancement could be a nice feature, but it implies big changes that also come with some drawbacks. My humble opinion is that bash makes complex parsing too complicated and is too slow for complex analysis. For me, `gws` will stay a really interesting portable solution which "just works ©", but it will sadly never get really "big" because of bash.
That is also why I'm currently starting to write an evolution of this software in a more maintainable programming language: I want more powerful config files, more speed, more options, etc.
For all those reasons I'm not going to implement this myself in `gws`, but if someone comes up with an acceptable solution, I'd be happy to include it.
Thanks for your answer and reasoning, @StreakyCobra. I didn't get (or missed) the notification about the comment, so sorry for the late reply :) I agree that bash makes things a bit complicated and can understand that you want to keep the whole thing simple. As it is, you can run it out of the box without installing any dependencies...
Soonish I will get a new computer and will have to migrate, that might be a good time to look into this :)
What would you think of the following format:
FOLDER | URL_1 [NAME_1] [ | URL_2 [NAME_2] ] [ | URL_n [NAME_n] ]
The separator between URL and NAME is a whitespace (or any other char). `NAME_1` and `NAME_2` would default to `origin` and `upstream` respectively. This way the old format would still work, but users can change the names if they want, and at the end of the line they can add as many additional URLs as they want. That the run time increases as you keep adding remotes should be obvious :)
No problem, I wasn't waiting on the answer anyway! And thank you for your interest and time!
The syntax you propose looks nice and keeps backward compatibility in an elegant way :-) . So about my 4 concerns:
The last point to think about is displaying information. Currently there are two levels: one for the repositories and one for the branches. Do you already have an idea for multiple remotes? Doing it like now with `upstream`, and ignoring all remotes except `origin`? Or adding a third level showing the status of each branch of each remote of each repository?
For me it makes more sense to only display and check `origin`, because otherwise we will have to deal with other parts, like the return code of `status` that says whether everything is synchronized.
As I'm mostly interested in backup and restore I'd be totally fine with the status command only checking origin. Two possible enhancements:
gws status --all-remotes
But like I said, I'm fine with status only looking at origin. For more detailed info I would use git directly. The way I see it, `gws` should help you to do simple bulk operations on multiple repos (create .gws file, init repos, simple status) and not create detailed reports :)
The development version `0.1.8` already allows running a subcommand on a subset of repositories. I don't like the idea of the output differing depending on whether the command is run for one repository or several. But sure, adding a flag for showing the full status is a good solution.
I'm also fine with just checking against `origin`, because that is how I use it. So let's wait for other people to explicitly ask for it before starting to implement an unused feature ;-)
I just played around a little with the line parsing and came up with what you can see in the following gist:
https://gist.github.com/dsager/00cad170e0e752a3ca27
Obviously it's missing the default values `origin` and `upstream` and is not 100% compatible with your current code (using the `cut` command), but it might serve as a starting point...
What do you think?
I'm not especially attached to using `cut`. I use it because it was the obvious choice for this purpose :-)
# We get the directory (the separator is quoted so the shell passes it to cut unchanged)
DIR=$(cut -d"${FIELD_SEP}" -f1 <<< "$ROW" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
# We get the rest of the configuration line containing remotes
REMOTES=$(cut -d"${FIELD_SEP}" -f1 --complement <<< "$ROW")
to get the directory and row instead of lines [6-10]?
# To be defined at the top of `gws`
URL_NAME_SEP=' '
# We get the first defined remote of the line
REMOTE=$(cut -d"${FIELD_SEP}" -f1 <<< "$REMOTES" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
# We remove the current remote from the line for next iteration
REMOTES=$(cut -d"${FIELD_SEP}" -f1 --complement <<< "$REMOTES")
# We get its url (the separator must be quoted: unquoted, a space is lost to word splitting)
REMOTE_URL=$(cut -d"${URL_NAME_SEP}" -f1 <<< "$REMOTE" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
# We get its name if any
REMOTE_NAME=$(cut -d"${URL_NAME_SEP}" -f2 -s <<< "$REMOTE")
# We can check if $REMOTE_NAME is empty, and if it is the case associate "origin" or "upstream", or throw an error
# ...
`REMOTES` is probably better than `ROW` as I wrote it. But that's cosmetic.
hey, sorry for the late reply again :)
Yes, your approach works fine as well I guess. I just liked the idea of using as few external tools as possible (`${string%%substring}` syntax vs `cut`). But on the other hand cut should be available on any *IX system I guess :)
Hi,
hey, sorry for the late reply again :)
No problem!
But on the other hand cut should be available on any *IX system I guess :)
Yes, `cut` is part of coreutils, and they are supposed to be installed everywhere: «These are the core utilities which are expected to exist on every operating system.»
I just liked the idea of using as little external tools as possible (${string%%substring} syntax vs cut)
You are right about having the smallest possible set of dependencies, but there is also a readability cost, which is my main concern here. If someone needs to understand or maintain this part in 6 months, it would be far easier to understand what the `cut` command is doing (it is its main purpose) than some not-so-common bash syntax. There is a trade-off between readability and dependencies, and here I would prefer readability :-)
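To make the trade-off concrete, here is a small sketch that extracts the same two pieces of a row both ways. The sample row and the variable names `dir_cut`, `rest_cut`, `dir_exp` and `rest_exp` are made up for illustration, and `-f2-` is used as an equivalent of `-f1 --complement`.

```bash
#!/usr/bin/env bash
FIELD_SEP='|'
# Illustrative row only, not taken from a real .projects.gws file
ROW='work/proj | git@example.com:me/proj.git | git@example.com:team/proj.git'

# Variant 1: external tool
dir_cut=$(cut -d"${FIELD_SEP}" -f1 <<< "$ROW")     # first field
rest_cut=$(cut -d"${FIELD_SEP}" -f2- <<< "$ROW")   # everything after the first field

# Variant 2: bash parameter expansion only
dir_exp=${ROW%%"${FIELD_SEP}"*}   # drop the longest suffix starting at the first separator
rest_exp=${ROW#*"${FIELD_SEP}"}   # drop the shortest prefix ending at the first separator

printf '[%s] [%s]\n' "$dir_cut" "$rest_cut"
printf '[%s] [%s]\n' "$dir_exp" "$rest_exp"
```

Both variants leave the surrounding whitespace in place, so the `sed` trimming step is still needed either way.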
The `develop` branch now contains a proposed solution tackling this issue. The file format is:
FOLDER | URL_1 [NAME_1] [ | URL_2 [NAME_2] [ | URL_n NAME_n ... ] ]
If `NAME_1` is not specified, it is assumed to be `origin`
If `NAME_2` is not specified, it is assumed to be `upstream`
At least one `URL` must be associated with the name `origin`
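As a rough illustration of these three rules, here is a minimal parsing sketch. It is not the code from the `develop` branch: the function name `parse_remotes`, the error messages and the "name url" output format are all assumptions made for this example, and it splits fields with bash's `read` rather than `cut`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from gws): print one "name url" line per
# remote of a config line, or fail if the rules above are violated.
parse_remotes() {
    local line=$1 index=0 has_origin=0 field url name out=''
    local -a fields
    # Split the line on '|'; fields[0] is the folder
    IFS='|' read -r -a fields <<< "$line"
    for field in "${fields[@]:1}"; do
        # With the default IFS, read trims blanks and splits URL from NAME
        read -r url name <<< "$field"
        [[ -z $url ]] && continue
        if [[ -z $name ]]; then
            case $index in
                0) name=origin ;;     # NAME_1 defaults to origin
                1) name=upstream ;;   # NAME_2 defaults to upstream
                *) echo "remote $((index + 1)) needs a name" >&2; return 1 ;;
            esac
        fi
        [[ $name == origin ]] && has_origin=1
        out+="$name $url"$'\n'
        index=$((index + 1))
    done
    # At least one URL must be associated with the name origin
    if (( ! has_origin )); then
        echo "no remote named origin" >&2
        return 1
    fi
    printf '%s' "$out"
}

parse_remotes 'work/proj | git@example.com:me/proj.git | git@example.com:team/proj.git'
```

With this sketch, a line such as `p | u1 mine | u2 origin` is accepted, while `p | u1 mine | u2` is rejected because the second remote falls back to `upstream` and nothing is named `origin`.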
Here are some points regarding my previous concerns:
.project.gws or .ignore.gws are modified)
origin
The `update` command now creates missing remotes (but doesn't modify existing ones)
The `init` command is modified accordingly to create `.project.gws` with extra remotes
I tried a few cases and it seems to work. Can you try it too? If it works for you I'm planning to release the `0.8.0` version as there are already a few new features.
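The "create missing remotes, but don't modify existing ones" behaviour can be pictured with a dry-run sketch like the one below. It only prints the `git remote add` commands it would run; the function name `plan_missing_remotes` and its argument layout are assumptions for illustration and do not come from the gws source.

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper (not taken from gws).
# $1: newline-separated names of remotes that already exist (e.g. from `git remote`)
# $2..: wanted remotes as "name url" pairs
plan_missing_remotes() {
    local existing=$1 pair name url
    shift
    for pair in "$@"; do
        read -r name url <<< "$pair"
        # Only remotes whose name is not present yet are added;
        # existing ones are left alone even if their URL differs.
        if ! grep -qxF -- "$name" <<< "$existing"; then
            printf 'git remote add %s %s\n' "$name" "$url"
        fi
    done
}

plan_missing_remotes $'origin\nupstream' \
    'origin git@example.com:me/proj.git' \
    'heroku https://git.example.com/proj.git'
```

Printing the commands instead of executing them keeps the sketch runnable outside a repository.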
that's awesome, I'll give it a try later on and let you know!
It seems to work just fine, at least the `gws update`. Thanks a lot! I'm closing this issue!
It would be great if GWS supported multiple remotes instead of the two hard-coded defaults `origin` and `upstream`.
I often happen to work with additional remotes (e.g. `heroku` or `octopress`) and also have a different naming scheme in some repositories (e.g. `mine` and `origin` instead of `origin` and `upstream`). These are ignored by `gws init` and I don't see any way to configure it manually.
If this is considered a sensible feature request, I'd be glad to help with the implementation! Any thoughts on this? For example I'm not sure how this could be implemented in a proper way without messing up the current file format...