Closed dlangille closed 4 years ago
I created a new branch: single commit. I want git-to-freshports-xml.py to be invoked to do either a range or a single commit.
@skozlov404 You're better at Python than I am. Is my objective clear?
Yes, I think so. I'll take a look at what I can do about it in the next few days.
I created this helper script:
$ cat ~/scripts/helper_scripts/git-commit-single.sh
#!/bin/sh
COMMIT=$1
cd ~freshports/ports-jail/var/db/repos/PORTS-head-git/
echo git checkout master | sudo su -fm freshports
echo git reset --hard $COMMIT | sudo su -fm freshports
cd /usr/local/libexec/freshports/
echo /usr/local/libexec/freshports/git-to-freshports-xml.py --path /var/db/freshports/ports-jail/var/db/repos/PORTS-head-git --single-commit $COMMIT --output /var/db/freshports/message-queues/incoming | sudo su -fm freshports
I had to go to master, then check out the one commit. I was trying to solve this issue:
$ ~/scripts/helper_scripts/git-commit-single.sh 24e0896aa09051022ef1aacc6776bb4f34312a65
Updating files: 100% (48550/48550), done.
HEAD is now at 24e0896aa090 math/maxima: Update to 5.44.0
Traceback (most recent call last):
File "/usr/local/libexec/freshports/git-to-freshports-xml.py", line 163, in <module>
main()
File "/usr/local/libexec/freshports/git-to-freshports-xml.py", line 123, in main
ET.SubElement(update, 'OS', Repo=config['repo'], Id=config['os'], Branch=str(repo.active_branch))
File "/usr/local/lib/python3.7/site-packages/git/repo/base.py", line 696, in active_branch
return self.head.reference
File "/usr/local/lib/python3.7/site-packages/git/refs/symbolic.py", line 275, in _get_reference
raise TypeError("%s is a detached symbolic reference as it points to %r" % (self, sha))
TypeError: HEAD is a detached symbolic reference as it points to '24e0896aa09051022ef1aacc6776bb4f34312a65
Seems to work as is now.
It happened again tonight. There is something else I have to do to reset the repo between each run.
[dan@devgit-ingress01:~/scripts] $ echo /usr/local/libexec/freshports/git-delta.sh | sudo su -fm freshports
2020.07.08 22:45:22 git-delta.sh started
2020.07.08 22:45:22 git-delta.sh repo is /var/db/freshports/ports-jail/var/db/repos/PORTS-head-git
2020.07.08 22:45:22 git-delta.sh XML dir is /var/db/freshports/message-queues/incoming
2020.07.08 22:45:22 git-delta.sh running: /usr/local/bin/git fetch origin
remote: Enumerating objects: 1744, done.
remote: Counting objects: 100% (1744/1744), done.
remote: Compressing objects: 100% (763/763), done.
remote: Total 1882 (delta 992), reused 1723 (delta 971), pack-reused 138
Receiving objects: 100% (1882/1882), 720.94 KiB | 6.93 MiB/s, done.
Resolving deltas: 100% (1001/1001), completed with 312 local objects.
From https://github.com/freebsd/freebsd-ports
3ea9051165d7..5811beec8423 master -> origin/master
41218bd95a62..4df9b768da2b branches/2020Q3 -> origin/branches/2020Q3
4a02e95ca6d8..618ecb46a899 svn_head -> origin/svn_head
2020.07.08 22:45:26 git-delta.sh running: /usr/local/bin/git reset --hard HEAD
HEAD is now at f2bfe60090b8 net-mgmt/unifi5: Update to 5.11.46
2020.07.08 22:45:34 git-delta.sh running: /usr/local/bin/git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 28157 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
2020.07.08 22:45:35 git-delta.sh STARTPOINT = ab24c4bd5dff
2020.07.08 22:45:35 git-delta.sh running; /usr/local/bin/git rebase origin/master
Successfully rebased and updated refs/heads/master.
2020.07.08 22:45:48 git-delta.sh running: /usr/local/bin/git rev-list ab24c4bd5dff..HEAD
... whole bunch of hashes not shown ...
36db40c11698a9 0d5f1c8c72d95bc46329694b2490099765002331
2020.07.08 22:45:49 git-delta.sh /usr/local/libexec/freshports/git-to-freshports-xml.py --path /var/db/freshports/ports-jail/var/db/repos/PORTS-head-git --commit ab24c4bd5dff --output /var/db/freshports/message-queues/incoming
Traceback (most recent call last):
File "/usr/local/libexec/freshports/git-to-freshports-xml.py", line 163, in <module>
main()
File "/usr/local/libexec/freshports/git-to-freshports-xml.py", line 123, in main
ET.SubElement(update, 'OS', Repo=config['repo'], Id=config['os'], Branch=str(repo.active_branch))
File "/usr/local/lib/python3.7/site-packages/git/repo/base.py", line 696, in active_branch
return self.head.reference
File "/usr/local/lib/python3.7/site-packages/git/refs/symbolic.py", line 275, in _get_reference
raise TypeError("%s is a detached symbolic reference as it points to %r" % (self, sha))
TypeError: HEAD is a detached symbolic reference as it points to 'f2bfe60090b840b6d99a3288c0b745843cefcfe1'
2020.07.08 22:46:02 git-delta.sh ending
[dan@devgit-ingress01:~/scripts] $
echo git reset --hard $COMMIT
Why do you do this in your script? It seems to me that can cause the problem you're seeing.
It was an idea from #3 and thought it would help. I'll try without.
Oh, that's in helper_scripts/git-commit-single.sh - I was running git-delta.sh, which does a git reset --hard HEAD, based on the suggestions in #3.
I'd say it doesn't even matter if the tree is dirty or not - the script processes the commits that already happened, so git reset shouldn't be needed at all. The only thing that matters is that you do git fetch origin beforehand so your tree is up to date.
Oh, I get what's been suggested in #3 - since git-to-freshports-xml.py used to process all the commits from the specified one to HEAD, moving HEAD to right above the commit we're specifying would give us the effect of processing a single commit.
Thing is, with the --single-commit and --commit-range flags now implemented, we don't need to jump around the git tree like that anymore - by just using git fetch origin and the proper flags to git-to-freshports-xml.py we're now able to achieve everything required.
This is what I have now:
[dan@devgit-ingress01:~/scripts] $ grep GIT git-delta.sh
${GIT} fetch $REMOTE
${GIT} checkout master
STARTPOINT=$(${GIT} log master..$REMOTE/master --oneline --reverse | head -n 1 | cut -d' ' -f1)
${GIT} rebase $REMOTE/master
I removed logging etc from the above.
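For anyone who would rather do the STARTPOINT step in Python, here is a minimal sketch of what the `head -n 1 | cut -d' ' -f1` part of the pipeline does. The function name `first_new_commit` is made up for illustration; it only parses text that `git log master..$REMOTE/master --oneline --reverse` has already produced, and it returns None when there is nothing new, a case the shell version silently turns into an empty STARTPOINT.

```python
def first_new_commit(oneline_log):
    """Return the abbreviated hash on the first non-empty line of
    `git log --oneline --reverse` output, or None if the output is empty.

    Hypothetical helper: it mirrors `head -n 1 | cut -d' ' -f1`.
    """
    for line in oneline_log.splitlines():
        if line.strip():
            # Each --oneline line is "<abbreviated hash> <subject>".
            return line.split()[0]
    return None
```

Checking for None before calling git-to-freshports-xml.py would avoid passing an empty starting point when the local master is already up to date.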
Hey @sarcasticadmin I thought you might have ideas here since this area of work was originally your suggestion.
One issue I keep thinking about: getting out of sync. We want to make sure the latest commit in our repo is the latest commit in the database. If it is not, we need to process what's in the repo before doing a fetch.
I don't think that'll be difficult. Just a thing to be done.
@dlangille thanks, I'll take a look at this tonight and give you some feedback. At the moment I'm not near my machine.
@skozlov404
I'd say it doesn't even matter if the tree is dirty or not - the script processes the commits that already happened, so git reset shouldn't be needed at all. The only thing that matters is that you do git fetch origin beforehand so your tree is up to date
Oh, I get what's been suggested in #3 - since git-to-freshports-xml.py used to only process all the commits from the specified one to the HEAD - if we move the HEAD to right above the commit we're specifying - this would give us the effect of processing the single commit.
Thing is, with --single-commit and --commit-range flags now implemented - we don't need to jump around the git tree anymore like that - by just using git fetch origin and the proper flags to git-to-freshports-xml.py we're now able to achieve everything required.
The idea of reset --hard HEAD was just to make sure that the local master was clean, whatever state it had been left in, so that we could git rebase, since the rebase needs a clean tree before it can proceed.
git-delta.sh was assuming that git-to-freshports-xml.py just needed a starting point, so I figured we could compare the remote master branch to the local master from when we last synced to get our starting point, and just pass that along to git-to-freshports-xml.py to process the commits from the starting point to HEAD.
We aren't jumping around the tree, just leveraging the lag from the last time we synced, then fetching the latest copy of master, comparing it to local master, and bringing local master up to date.
@dlangille it looks like your error is just a TypeError due to how that git library processes the commits. Pulling from your output above https://github.com/FreshPorts/git_proc_commit/issues/7#issuecomment-655797104:
TypeError: HEAD is a detached symbolic reference as it points to 'f2bfe60090b840b6d99a3288c0b745843cefcfe1'
It would seem that we need to handle HEAD in a special way (the Python type is different) or just get the commit hash that HEAD currently points to, so we can make the ranging in git-to-freshports-xml.py work correctly.
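A rough sketch of the second option, assuming the traceback above is the whole story: `repo.active_branch` raises TypeError when HEAD is detached, so the lookup can fall back to the commit hash that HEAD points to. The helper name `branch_or_commit` is hypothetical, and whether a hash is an acceptable value for the Branch attribute in the XML is also an assumption.

```python
def branch_or_commit(repo):
    """Return the checked-out branch name, or the HEAD commit hash
    when HEAD is detached.

    `repo` is expected to behave like a GitPython Repo object.
    """
    try:
        return str(repo.active_branch)
    except TypeError:
        # GitPython raises TypeError from active_branch on a detached HEAD,
        # as seen in the traceback; use the commit HEAD points to instead.
        return repo.head.commit.hexsha
```

The call in main() would then pass `Branch=branch_or_commit(repo)` instead of `Branch=str(repo.active_branch)`.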
One issue I keep thinking about: getting out of sync. We want to make sure the latest commit in our repo is the latest commit in the database. If it is not, we need to process what's in the repo before doing a fetch.
Regarding this, I think it is a separate issue from what's being discussed here. I would think it's just a matter of checking the database for the last commit stored, then comparing that to the starting point in git-delta.sh. The starting point should be just 1 commit ahead of what's in the database; if not, move the starting point to the commit right after the database entry and let git-to-freshports-xml.py process it all. We are guaranteed that the history will be in the right order as no one in the ports tree is rewriting history on master (this is not the case).
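To make the comparison concrete, here is a small sketch under stated assumptions: `history` is a list of commit hashes on master ordered oldest first (for example from `git rev-list --reverse`), and `last_in_db` is the newest hash already stored in the database. Both names and the function itself are hypothetical; how the hash is read from the database is not shown.

```python
def commits_to_process(history, last_in_db):
    """Return the commits that come after `last_in_db` in `history`.

    `history` must be ordered oldest first. Raises ValueError if the
    database's last commit is not in the history at all, which would mean
    the repo and the database have diverged.
    """
    if last_in_db not in history:
        raise ValueError("commit %s not found in repo history" % last_in_db)
    # Everything after the database entry still needs processing.
    return history[history.index(last_in_db) + 1:]
```

An empty result means the database is already in sync with the repo, so a fetch can proceed.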
An aside: the current approach is processing only commits on master. We have the 2020Q3 branch to consider as well... the code isn't catering to that yet.
TypeError: HEAD is a detached symbolic reference as it points to 'f2bfe60090b840b6d99a3288c0b745843cefcfe1'
It would seem that we need to handle HEAD in a special way (python type is different) or just get the commit hash that HEAD currently points to so we can make the ranging in git-to-freshports-xml.py work correctly.
This came up today when trying to process a single commit. I think #14 will fix this long term.
[freshports@devgit-ingress01 /usr/local/libexec/freshports]$ ./git-single-commit.sh c5f3b87db914aacea80f6e55223c246aa2e14d43
2020.07.15 14:57:57 git-single-commit.sh started
2020.07.15 14:57:57 git-single-commit.sh repo is
2020.07.15 14:57:57 git-single-commit.sh XML dir is /var/db/freshports/message-queues/incoming
2020.07.15 14:57:57 git-single-commit.sh /usr/local/libexec/freshports/git-to-freshports-xml.py --path /var/db/freshports/ports-jail/var/db/repos/PORTS-head-git --single-commit c5f3b87db914aacea80f6e55223c246aa2e14d43 --output /var/db/freshports/message-queues/incoming
Traceback (most recent call last):
File "/usr/local/libexec/freshports/git-to-freshports-xml.py", line 172, in <module>
main()
File "/usr/local/libexec/freshports/git-to-freshports-xml.py", line 132, in main
ET.SubElement(update, 'OS', Repo=config['repo'], Id=config['os'], Branch=str(repo.active_branch))
File "/usr/local/lib/python3.7/site-packages/git/repo/base.py", line 696, in active_branch
return self.head.reference
File "/usr/local/lib/python3.7/site-packages/git/refs/symbolic.py", line 275, in _get_reference
raise TypeError("%s is a detached symbolic reference as it points to %r" % (self, sha))
TypeError: HEAD is a detached symbolic reference as it points to '44d4d38cf77e4718e2666128077516c05403e214'
2020.07.15 14:57:58 git-single-commit.sh ending
Looking at the ports tree:
[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git status
HEAD detached at 44d4d38cf77e
nothing to commit, working tree clean
This is the most recent commit processed by FreshPorts.
When FreshPorts processes a commit, it must do a git checkout. It needs the working copy of the repo to be as it was after that commit occurred. This script achieves that goal:
$ cat git-checkout.sh
#!/bin/sh
#
# $Id: svn-up-file.sh,v 1.1 2012-08-15 11:49:10 dan Exp $
#
# Copyright (c) 1999-2019 Dan Langille
#
# This script used to checkout a given commit via a git working copy
echo "num of params = $#"
if [ $# -ne 2 ]; then
  echo "error invoking script $0 : usage $0 GITDIR REVISION (e.g. $0 /usr/ports 1234)"
  exit 1
else
  GITDIR=$1
  REVISION=$2
  # we may not need this cd...
  # stop here if the directory is missing rather than checking out elsewhere
  cd "${GITDIR}" || exit 1
  echo "git checkout ${REVISION}"
  git checkout "${REVISION}"
  exit $?
fi
[freshports@devgit-ingress01 /usr/local/libexec/freshports]$
In the short term, a git checkout master solved the issue:
[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
[freshports@devgit-ingress01 /usr/local/libexec/freshports]$ ./git-single-commit.sh c5f3b87db914aacea80f6e55223c246aa2e14d43
2020.07.15 15:20:51 git-single-commit.sh started
2020.07.15 15:20:51 git-single-commit.sh repo is
2020.07.15 15:20:51 git-single-commit.sh XML dir is /var/db/freshports/message-queues/incoming
2020.07.15 15:20:51 git-single-commit.sh /usr/local/libexec/freshports/git-to-freshports-xml.py --path /var/db/freshports/ports-jail/var/db/repos/PORTS-head-git --single-commit c5f3b87db914aacea80f6e55223c246aa2e14d43 --output /var/db/freshports/message-queues/incoming
2020.07.15 15:20:51 git-single-commit.sh ending
[freshports@devgit-ingress01 /usr/local/libexec/freshports]$
@dlangille saw your tweet: https://twitter.com/DLangille/status/1285750793711292416 sounds like you've made some good progress!
[freshports@devgit-ingress01 ~/ports-jail/var/db/repos/PORTS-head-git]$ git status
HEAD detached at 44d4d38cf77e
nothing to commit, working tree clean
This seems to be because git-to-freshports-xml.py, when it exits with an error, leaves the repo in a less than ideal state, in this case a detached HEAD. I'm not sure why git-to-freshports-xml.py actually needs to check out each of these commits vs just leveraging something like git show <SHA>, but it still looks like it's having issues with the type of HEAD. Again, this seems to be due to a mismatch of the actual type being used in Python.
Anyway, it sounds like you've got a path forward and that's good 😄
Im not sure why git-to-freshports-xml.py actually needs to try to checkout each of these commits vs just leverage something like git show <SHA>
FreshPorts needs both.
git show <SHA> is used to create the XML, which is then used to populate the database. This data only contains the facts of the commit.
After putting the XML into the database, the database is then refreshed with data from the repo. There is more information in FreshPorts than that obtained from the commit. A number of make -V commands are run to obtain various data, such as:
This information cannot be obtained from the commit log. It must be extracted from the files. Thus, we do a git checkout <HASH>.
With subversion, there was no need to extract the data from the commit log. It was extracted from an email.
With git, there is no git-specific email list yet. Therefore, we took the commit log approach.
This change in the XML generation step eventually led to the creation of a second working copy of the repo, one for XML creation (git log) and one for running make -V (git checkout).
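For setups that still share one working copy, a possible way to keep the checkout step from leaving HEAD detached is to restore the previous branch afterwards. This is only a sketch: `run_git` and `checked_out` are hypothetical helpers, not part of FreshPorts, and they assume a git binary on the PATH and Python 3.7 or later.

```python
import subprocess
from contextlib import contextmanager


def run_git(repo_dir, *args):
    """Run git inside repo_dir and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


@contextmanager
def checked_out(repo_dir, commit, git=run_git):
    """Check out `commit` for the duration of the with-block, then
    switch back to whatever HEAD pointed at before."""
    # Prints the branch name, or the literal "HEAD" when already detached.
    original = git(repo_dir, "rev-parse", "--abbrev-ref", "HEAD")
    if original == "HEAD":
        # Already detached: remember the exact commit instead.
        original = git(repo_dir, "rev-parse", "HEAD")
    git(repo_dir, "checkout", "--quiet", commit)
    try:
        yield
    finally:
        # Restore even if the work inside the block raised an error.
        git(repo_dir, "checkout", "--quiet", original)
```

The make -V calls would go inside `with checked_out(repo_dir, commit):`, so a failure there still puts the working copy back on master.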
Running a single commit can be done via:
[dan@devgit-ingress01:~/message-queues/retry] $ ~/scripts/helper_scripts/git-commit-single.sh 4e0c57e6b4740e970be8e2ff640bc7cd560d1b24
2020.07.24 12:20:20 git-single-commit.sh started
2020.07.24 12:20:20 git-single-commit.sh repo is /var/db/ingress/repos/freebsd-ports
2020.07.24 12:20:20 git-single-commit.sh XML dir is /var/db/ingress/message-queues/incoming
2020.07.24 12:20:20 git-single-commit.sh /usr/local/libexec/freshports/git-to-freshports-xml.py --path /var/db/ingress/repos/freebsd-ports --single-commit 4e0c57e6b4740e970be8e2ff640bc7cd560d1b24 --output /var/db/ingress/message-queues/incoming
2020.07.24 12:20:20 git-single-commit.sh ending
[dan@devgit-ingress01:~/message-queues/retry] $
Given 24e0896aa09051022ef1aacc6776bb4f34312a65, I want to regenerate the XML file and resubmit it for processing.