Closed HedvigS closed 1 year ago
When you run git submodule update --init, it will go into the submodules and check out the commit the main repo wants (regardless of whether there are newer ones available). If you have the grambank submodule set to tag v1.0, everybody downstream will also check out tag v1.0.
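To see which commit the main repo has recorded for a submodule, something like the following might help. It is only a sketch: pinned_commit is a made-up helper name, and grambank is just the example path from this thread. It relies on git ls-tree printing a submodule entry as "160000 commit <sha>", followed by a tab and the path.

```shell
# Hypothetical helper: print the commit the main repo has pinned for a
# submodule path. Run it from the top-level repo.
pinned_commit() {
  # For a submodule, `git ls-tree HEAD <path>` prints:
  #   160000 commit <sha><TAB><path>
  git ls-tree HEAD "$1" | awk '$2 == "commit" { print $3 }'
}

# Usage:
#   pinned_commit grambank
```

This is the commit that git submodule update --init checks out, whatever newer commits exist in the submodule's own remote.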
If you want to know if any of the submodules in your local working copy got out of sync, you can either run git status, in which case folders with an out-of-sync submodule will be marked as modified:
$ git status
On branch main
Your branch is up-to-date with 'origin/main'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: grambank (new commits)
no changes added to commit (use "git add" and/or "git commit -a")
Or you can run git submodule status and check if there's a + or - sign in front of any submodules:
$ git submodule status
448115168dee078aa2b0c54cc7064f4cb8c06018 autotyp-data (v1.0.1)
df1b79f38939f95c7288861a6743fd7a841d5779 glottolog-cldf (v4.5)
+ba44b4608a1176c0f23d3d478fdbfbc00bcb3c3f grambank (v1.0-42-gba44b46)
bc8a5f961013162ee1fb628d37c5ba0a8decdd28 wals (v2020.1)
And if there's a mismatch, you can just run git submodule update to get everything back in order.
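If you'd rather not scan the status output by eye, a small filter along these lines could pick out the flagged submodules. This is a rough sketch: out_of_sync is a hypothetical name, and it assumes submodule paths contain no spaces.

```shell
# Hypothetical filter: reads `git submodule status` output on stdin and
# reports entries flagged with + (checked-out commit differs from the
# pinned one) or - (submodule not initialized).
out_of_sync() {
  awk '
    substr($0, 1, 1) == "+" { print "out of sync: " $2 }
    substr($0, 1, 1) == "-" { print "not initialized: " $2 }
  '
}

# Usage:
#   git submodule status | out_of_sync
```

No output means none of the submodules carry a + or - marker.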
When it comes to the documentation, there might be two things we could add to the readme to make things clearer:
- Maybe remind people to occasionally re-run git submodule update after pulling (when the state of the submodule changed on the remote).
- Maybe tell people about git submodule update --recursive in repos where this matters (I don't remember off the top of my head if this is one of them).
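For the readme, the two reminders could be folded into one snippet, roughly like this. The function name pull_and_sync is made up for illustration, and whether --recursive is actually needed depends on the repo.

```shell
# Hypothetical helper: pull the main repo, then bring every submodule
# (including nested ones) to the commit the main repo records.
pull_and_sync() {
  git pull &&
  git submodule update --init --recursive
}

# Usage (from the top-level repo):
#   pull_and_sync
```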
Thanks @johenglisch, I've got a follow-up question. I don't get a +, which I take to mean that the submodule in my clone is up to date with its remote. Is that right?
skirgard@lingn06 grambank-analysed % git submodule status
448115168dee078aa2b0c54cc7064f4cb8c06018 autotyp-data (v1.0.1)
df1b79f38939f95c7288861a6743fd7a841d5779 glottolog-cldf (v4.5)
b9633e1e3c92ffde0e53567c5c97e82d6be969af grambank (v1.0)
bc8a5f961013162ee1fb628d37c5ba0a8decdd28 wals (v2020.1)
Now, what I need is for the grambank-cldf submodule in this repo to link to a more recent version of grambank-cldf. Currently it's linked to b9633e1, which is a bit too old.
Uuuuhm, iirc it went like this:
1. cd into the folder of the submodule
2. git fetch the latest changes
3. cd back to the project folder
4. git add and git commit the submodule folder

If I don't remember it correctly (which is entirely possible), here's the relevant section of the git user manual:
https://git-scm.com/docs/user-manual.html#submodules
The answer is probably in there.
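As a rough sketch, the steps could look like the following. The function name bump_submodule and the commit message are made up. The explicit checkout is an addition to the list above: git fetch on its own does not move the submodule's checkout, so without it there would be nothing new to add and commit.

```shell
# Hypothetical helper: point a submodule at a newer commit, branch or tag
# and record that in the main repo. Run it from the top-level repo.
bump_submodule() {
  sub_path=$1   # submodule folder, e.g. grambank
  sub_ref=$2    # commit, branch or tag to pin, e.g. v1.0.1
  git -C "$sub_path" fetch &&                 # steps 1-2: fetch inside the submodule
  git -C "$sub_path" checkout "$sub_ref" &&   # added step: move the checkout
  git add "$sub_path" &&                      # step 4: stage the new submodule commit
  git commit -m "Update $sub_path submodule to $sub_ref"
}

# Usage:
#   bump_submodule grambank v1.0.1
```

Using git -C avoids the cd in and out of the submodule folder. The result is the same as doing the steps by hand.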
Okay! When I just got to step 2 this is what I'm seeing:
skirgard@lingn06w grambank % git fetch
remote: Enumerating objects: 26, done.
remote: Counting objects: 100% (26/26), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 26 (delta 16), reused 18 (delta 12), pack-reused 0
Unpacking objects: 100% (26/26), done.
From https://github.com/glottobank/grambank-cldf
ba44b46..5284cd9 master -> origin/master
* [new tag] v1.0-rc8 -> v1.0-rc8
* [new tag] v1.0.1 -> v1.0.1
Fetching submodule raw/Grambank
From https://github.com/glottobank/Grambank
* [new branch] Jaylatarche-patch-1 -> origin/Jaylatarche-patch-1
* [new branch] Jaylatarche-patch-2 -> origin/Jaylatarche-patch-2
* [new branch] Jaylatarche-patch-3 -> origin/Jaylatarche-patch-3
* [new branch] Jaylatarche-patch-4 -> origin/Jaylatarche-patch-4
* [new branch] Jaylatarche-patch-6 -> origin/Jaylatarche-patch-6
* [new branch] Jaylatarche-patch-7 -> origin/Jaylatarche-patch-7
* [new branch] Jaylatarche-patch-8 -> origin/Jaylatarche-patch-8
* [new branch] add-chan-glottcodes-from-cldf -> origin/add-chan-glottcodes-from-cldf
* [new branch] add-hueblerstability -> origin/add-hueblerstability
* [new branch] added-r-code-for-#2145 -> origin/added-r-code-for-#2145
* [new branch] guaz1234 -> origin/guaz1234
* [new branch] hojucha-gb291 -> origin/hojucha-gb291
* [new branch] jillsam-patch-2 -> origin/jillsam-patch-2
* [new branch] johnaell-patch-1 -> origin/johnaell-patch-1
e482b85e..8b28f1cc master -> origin/master
* [new branch] nataliia_wish_list -> origin/nataliia_wish_list
* [new branch] nuuu1241---nngg1234 -> origin/nuuu1241---nngg1234
* [new branch] passive-english -> origin/passive-english
* [new branch] revert-2406-hojucha-patch-2 -> origin/revert-2406-hojucha-patch-2
* [new branch] v1.0-maintenance -> origin/v1.0-maintenance
* [new tag] v1.0.1 -> v1.0.1
Which is a bit overwhelming because it's for the submodules downstream as well. Am I right in assuming that the commits at the top are the most recent? Also, these don't seem to be the most recent commits, right? The most recent commit for grambank-cldf on origin should be this one: https://github.com/glottobank/grambank-cldf/commit/5284cd940faec187d8adf270d3ed80fbbd0ce0f1
Sorry, I'm a bit confused still, trying to work through it.
First, git fetch doesn't change your checkout at all (that's what git pull would do). So the stuff you see here is just commits, branches and tags that have happened in the remote repos and hadn't been fetched to your local clones before.
The bit ba44b46..5284cd9 master says that for grambank-cldf it needed to fetch commits from ba44b46 to 5284cd9 on the master branch. I.e. 5284cd9 is the latest on master, just as you expected. So git checkout master and git pull will check out master at 5284cd9.
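Since the fetch leaves the checkout alone, you can look at what came in before deciding anything. A minimal sketch, where show_incoming is a hypothetical name and master is only the default because that is the branch in this thread:

```shell
# Hypothetical helper: list commits that are on the fetched
# remote-tracking branch but not yet in the current checkout.
show_incoming() {
  git log --oneline "HEAD..origin/${1:-master}"
}

# Usage (inside the submodule folder, after git fetch):
#   show_incoming            # compares against origin/master
#   show_incoming some-branch
```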
> First, git fetch doesn't change your checkout at all (that's what git pull would do). So the stuff you see here is just commits, branches and tags that have happened in the remote repos and hadn't been fetched to your local clones before.

I know that. I was intentionally stopping at this state to evaluate what to do next.

> The bit ba44b46..5284cd9 master says that for grambank-cldf it needed to fetch commits from ba44b46 to 5284cd9 on the master branch. I.e. 5284cd9 is the latest on master, just as you expected. So git checkout master and git pull will check out master at 5284cd9.

Thanks.
Sorry, now I see the git IDs more clearly. All good, it's clearer now than it was when viewing the git commit history via the web browser.
Things... still aren't really as I expect them. My main specific problem is that the submodule that's in grambank-analysed at the moment refers to a state of the grambank-cldf repo where there is still a dir called R_grambank, and this has caused confusion among collaborators because they mistake the scripts there for the correct ones that are in grambank-analysed.
What I see now is that the dir R_grambank is somehow still there... but empty? I've deleted it now, going to see what I can do next.
I deleted the empty dir and git doesn't seem to have noticed the change.
skirgard@lingn06w grambank % git status
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: raw/Grambank (new commits)
no changes added to commit (use "git add" and/or "git commit -a")
I'm interpreting that to mean that git doesn't care at all about empty dirs.
I think I did what I wanted? https://github.com/grambank/grambank-analysed/commit/b7f67f9b960920ea316827ef5ee8c53c0eff5c4d
Github desktop is now telling me that I've got a change I can commit re the submodule for the grambank folder which is changing it to:
Subproject commit 5284cd940faec187d8adf270d3ed80fbbd0ce0f1-dirty
> I'm interpreting that to mean that git doesn't care at all about empty dirs.

Yes, that's the case.

> Github desktop is now telling me that I've got a change I can commit re the submodule for the grambank folder which is changing it to:
> Subproject commit 5284cd940faec187d8adf270d3ed80fbbd0ce0f1-dirty

You could just navigate into the submodule and run git diff to see what the uncommitted changes are.
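A sketch of that check, run from the top-level repo without changing directory. The name inspect_submodule is made up. As far as I know, the -dirty suffix generally means the submodule's working tree has modified or untracked content, so the short status is worth a look alongside the diff.

```shell
# Hypothetical helper: show what is uncommitted inside a submodule.
inspect_submodule() {
  git -C "$1" status --short &&   # modified, untracked, or nested-submodule changes
  git -C "$1" diff                # the actual unstaged changes
}

# Usage:
#   inspect_submodule grambank
```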
Thanks.
Yeah so the CLI version is:
skirgard@lingn06w grambank-analysed % ls
README.md autotyp-data grambank wals
R_grambank glottolog-cldf grambank-analysed.zip
skirgard@lingn06w grambank-analysed % cd grambank
skirgard@lingn06w grambank % git diff
diff --git a/raw/Grambank b/raw/Grambank
index d2ff009..b32afb9 160000
--- a/raw/Grambank
+++ b/raw/Grambank
@@ -1 +1 @@
-Subproject commit d2ff009f761895d1304ab1b60999e12fe1e2a92c
+Subproject commit b32afb9393e1415fda564c757d29d7965b0b3f99
So the -dirty flag is gone?
In the CLI there's no dirty tag, but it's still there in the GUI. The GUI should just be a point-and-click version of CLI git, but now there seems to be a discrepancy... Can I just unstage that change in the GUI?
So the glottobank/Grambank submodule has changed, but you don't want these changes? If so, I'd say you can just
cd raw/Grambank
git checkout .
and be done?
More generally, though, I think that nested submodules are a bit difficult to maintain. Since analysis code is typically written for particular releases of data, it might be simpler to not use submodules here, but instead explicitly fetch released versions of the data from Zenodo or a GitHub release.
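To illustrate the GitHub-release route: GitHub serves a tarball for each tag under a predictable URL, so a script could fetch a specific tagged version instead of relying on a submodule. This is only a sketch. release_tarball_url is a hypothetical helper, and it assumes the repository is reachable and the tag (here v1.0, as mentioned earlier in the thread) has been pushed to GitHub.

```shell
# Hypothetical helper: build the GitHub archive URL for a tagged version.
#   $1 = owner/repo, $2 = tag
release_tarball_url() {
  echo "https://github.com/$1/archive/refs/tags/$2.tar.gz"
}

# Usage (assumes the repo is accessible and the tag exists):
#   curl -L -o grambank-cldf-v1.0.tar.gz \
#     "$(release_tarball_url glottobank/grambank-cldf v1.0)"
```

Note that such a tarball does not include the contents of nested submodules like raw/Grambank.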
I wanted the grambank submodule to change, to switch to a more recent commit. That's what I was doing with the stuff earlier in this thread. I thought all was well and good with this commit b7f67f9 but now I'm a bit confused about this dirty tag that the GUI is showing me.
I thought that git submodules were a good fit for this kind of use case, especially since we don't have a Zenodo release that collaborators can download from.
What I did now is I unstaged the dirty tag change, which is "discard" in the GUI. The CLI git then comes to this:
skirgard@lingn06w grambank % git status
HEAD detached at 5284cd9
nothing to commit, working tree clean
Ok, looks like the state you wanted, right?
Yes... it is. And.. I don't know what happened and I think I'm going to stop trying to find out :D!
Sometimes git is like god, works in mysterious ways ;)
I changed .gitignore recently so that some contents of output get pushed here. This was brought on by my having to pull information from PDF tables from someone else's work. I don't think we should push all of output, but a few select tables is a good idea.
This made us notice in a recent PR (#69) that we had diffs in a table we didn't expect. I highly suspect this has to do with the checked-out versions of the git submodules differing between my clone and Sam's.
I'm sorry @johenglisch to be having this discussion again. I just want to make sure we're doing it right so that it's all good.
We are only really concerned with pulling the right version of glottolog-cldf (v4.4) and grambank-cldf (v1). wals and autotyp are good if they are reproducible so it should be one particular version, but that could be the most recent one right now for example.
Currently the readme for this repo tells people to run this after cloning:
git submodule update --init
but is that going to clone submodules of the same version as remote or the most recent version of those submodules elsewhere?
(I wanted to merge in #69 so I'm opening an issue here instead for a discussion we had there.)