Open Tookmund opened 9 years ago
Looks like a good idea to me.
As for a better way, I'd keep a patch file per directory (limited to the files in that directory), mirroring the complete (or the subset requiring patching) NetBSD source tree under releasetools/patches/. This directory shouldn't be stored inside the Git repository since it can be regenerated as needed with the patches_* tools.
To handle this, I'd make a couple of shell scripts to :
Or the way around. Not sure what would be the best.
To update Minix 3's tree, the procedure would go like this :
Making good use of the copied .git/ directory is recommended to keep sanity during step 3.
That's a much better idea! I'll begin working through the patching right now. Based on my earlier work in #79, I'm going to use a whitelist instead of a blacklist because there are fewer special cases that way and I've already built the whitelists.
I'm working on this in https://github.com/Tookmund/minix/tree/updateminix
Patches generated with releasetools/genpatches.sh
Now to figure out which are necessary and which aren't
Hi, before you spend too much time on this, a couple of questions:
### Setup
git clone git://git.minix3.org/minix minix
cd minix
git remote add netbsd git://git.minix3.org/netbsd
git fetch netbsd
### Compare two directories
git diff netbsd/master -- <some_path>
### Checkout a not yet imported directory
git checkout netbsd/master -- <some_dir_or_file>
I am asking, because we used to have a tool which based on a file would checkout the netbsd CVS sources, generate a patch for registered files & directories.
We moved out of this, as the list was poorly maintained, and the end result was the usefulness of the tool drastically dropped, while still generating work.
Now, this is me doing it, and I arguably know the whole tree, its quirks and its status very well.
I am saying this, as the current process allows for the whole tree to be resynchronized in a matter of days now, while not generating any overhead for the projects when we need to import or patch specific parts of the NetBSD sources. This is extremely important, as we do not want to slow down the day to day work, while keeping the method working.
From past experience, any kind of lists requires work to keep them up-to-date, and if something is not used as part of the day-to-day workflow, it usually breaks down simply because we introduce a change and forget to do the required updates as well.
While I would welcome a way to further improve the current situation, I am skeptical that the list-based approach will succeed as we already tried it. If you want to get a closer look at it, you should checkout the following branch: https://github.com/Stichting-MINIX-Research-Foundation/minix/tree/R3.2.0 and take a look at the following files:
What I think would be an improvement over the way I currently do it is to do something along the lines (commands from the top of my head, needs to be checked):
This brings down the step 5. to removing files which are no longer required (because they were removed/ renamed in NetBSD) and actually taking a look at files which we have patched.
Regards,
Lionel
I can see how that would be a lot of work to maintain some of this. What I was thinking of was more along the lines of a set of patches and a set of lists of directories known to work (a whitelist).
The initial set of patches will take a while to set up but should only need to be changed if a program changes significantly or a new program is added. @boricj suggested that it not be stored in git, but it probably should be, because I cannot reliably regenerate patches containing only the minix-specific changes. This would also reduce the maintenance cost.
I have already generated the whitelists of all programs that work on minix, and those should only need to be updated if a new program is imported.
Since he will be the one maintaining it in the long term and there is a lot of work to be done upfront, I will await @sambuc's approval before continuing to work on this.
Jacob
On Jun 18, 2015, at 2:45 AM, Lionel Sambuc notifications@github.com wrote:
Hi, before you spend too much time on this, a couple of questions:
What does this do that is not already available by doing the following:
Setup
git clone git://git.minix3.org/minix minix
cd minix
git remote add netbsd git://git.minix3.org/netbsd
git fetch netbsd
Compare two directories
git diff netbsd/master -- <some_path>
Checkout a not yet imported directory
git checkout netbsd/master -- <some_dir_or_file>
What kind of manual labor does your new method generate (in terms of maintaining the lists, etc.)? I am asking because we used to have a tool which, based on a file, would check out the NetBSD CVS sources and generate a patch for registered files & directories. We moved away from it, as the list was poorly maintained, and the end result was that the usefulness of the tool dropped drastically while it still generated work.
The first time I resynchronized with NetBSD it took me literally months (full time) to do it; the second time (84d9c62) it took me a couple of weeks; and I have done 90% of the job on the last extended Easter weekend, so a couple of days. I think I need about one more week full time to finish that work. Now, this is me doing it, and I arguably know the whole tree, its quirks and its status very well.
I am saying this, as the current process allows for the whole tree to be resynchronized in a matter of days now, while not generating any overhead for the projects when we need to import or patch specific parts of the NetBSD sources. This is extremely important, as we do not want to slow down the day to day work, while keeping the method working.
From past experience, any kind of lists requires work to keep them up-to-date, and if something is not used as part of the day-to-day workflow, it usually breaks down simply because we introduce a change and forget to do the required updates as well.
While I would welcome a way to further improve the current situation, I am skeptical that the list-based approach will succeed as we already tried it. If you want to get a closer look at it, you should checkout the following branch: https://github.com/Stichting-MINIX-Research-Foundation/minix/tree/R3.2.0 and take a look at the following files:
tools/nbsd_diff.sh
tools/nbsd_ports
What I think would be an improvement over the way I currently do it is to do something along the lines (commands from the top of my head, needs to be checked):
1. checkout the minix sources
2. git grep -ni minix | cut -d: -f1 | sort -u >modified_files
3. series of git checkout from netbsd / overwrites from the new netbsd sources for the relevant directories
4. for f in $(cat modified_files); do git checkout $f; done
5. compare the tree with the new netbsd tree using meld, and resolve the conflicts as required.
6. check the results works for all configuration.
This brings down the step 5. to removing files which are no longer required (because they were removed/ renamed in NetBSD) and actually taking a look at files which we have patched.
Regards,
Lionel
Hi, @Tookmund,
During the last rsync, I implemented some of the steps we spoke about as a small script, which dramatically lowered the overhead for files which are unpatched. In the long run it will be the vast majority, so this is a nice gain, as it allows me to keep my efforts for the ones which need it.
It is in the source tree as releasetools/netbsd-resync.sh:
#!/bin/sh
: ${BUILDSH=build.sh}
if [ ! -f ${BUILDSH} ]
then
	echo "Please invoke me from the root source dir, where ${BUILDSH} is."
	exit 1
fi
if [ -z "${NETBSD_BRANCH}" ]
then
	echo "NETBSD_BRANCH is undefined."
	exit 1
fi
find . -type f | cut -c 3- | grep -v '\.git' | grep -v '\./minix' | sort -u > files.all
git grep -i minix | cut -d: -f1 | grep -v '\.git' | grep -v '\./minix' | sort -u > files.minix
diff files.all files.minix |grep '^<'| cut -c 3- > files.netbsd
while read file
do
	git checkout ${NETBSD_BRANCH} ${file}
done < files.netbsd
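The step that derives files.netbsd is a set difference: every path in files.all that does not also appear in files.minix. A minimal sketch of the same idea using comm(1) is shown here, assuming two plain-text lists with one path per line; list_unpatched is a hypothetical helper name and is not part of the MINIX tree.

```shell
#!/bin/sh
# Sketch only: print the paths listed in the first file but not in the second.
# list_unpatched is a hypothetical name, not an existing MINIX tool.
list_unpatched() {
    all_list=$1       # every path in the tree, one per line
    patched_list=$2   # paths that mention "minix", one per line

    # comm(1) requires both inputs sorted in the same collation order.
    LC_ALL=C sort -u "$all_list" > "$all_list.sorted"
    LC_ALL=C sort -u "$patched_list" > "$patched_list.sorted"

    # -2 hides lines unique to the second file, -3 hides lines common to both,
    # leaving only the lines unique to the first file.
    LC_ALL=C comm -23 "$all_list.sorted" "$patched_list.sorted"

    rm -f "$all_list.sorted" "$patched_list.sorted"
}
```

Called as `list_unpatched files.all files.minix > files.netbsd`, this should yield the same kind of list as the diff/grep/cut pipeline, without depending on the layout of diff output.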
This does not yet manage files from the NetBSD tree which were moved, removed or added. This is a use-case which is not so common, so I don't see any problem to review those as part of the patched (by us) files. The actions to take are also rather simple, so it doesn't take too much energy.
That said, if you can come up with a way of finding files which were moved, this would help, although I have no idea on how to do this. Keep in mind that even if moved, a file with patches from our side should not be replaced by the NetBSD one, as manual review is required in that case.
Regards,
Lionel
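One possible approach to the moved-files question, offered as an untested idea rather than an existing tool: compare two checked-out NetBSD trees (old and new) and pair up paths that vanished with paths that appeared when their content is byte-identical. The sketch below uses only find, cksum, comm and join; find_moved and sum_list are hypothetical names, paths are assumed to contain no whitespace, and files that were moved and edited at the same time are not detected. Git's own rename detection (git diff -M) may be another route.

```shell
#!/bin/sh
# Sketch only: list candidate moves between two source trees by content.
# find_moved and sum_list are hypothetical helper names.

# sum_list TREE LIST: print "crc-size path" for each path in LIST, sorted by key.
sum_list() {
    (cd "$1" && while IFS= read -r f; do cksum "$f"; done) < "$2" |
        awk '{ print $1 "-" $2, $3 }' | LC_ALL=C sort -k1,1
}

# find_moved OLD_TREE NEW_TREE: print "old/path -> new/path" candidates.
find_moved() {
    old_tree=$1
    new_tree=$2
    fm_tmp=$(mktemp -d) || return 1

    # Relative paths of every regular file in each tree.
    (cd "$old_tree" && find . -type f | sed 's|^\./||' | LC_ALL=C sort) > "$fm_tmp/old"
    (cd "$new_tree" && find . -type f | sed 's|^\./||' | LC_ALL=C sort) > "$fm_tmp/new"

    # Paths that disappeared, and paths that appeared.
    LC_ALL=C comm -23 "$fm_tmp/old" "$fm_tmp/new" > "$fm_tmp/gone"
    LC_ALL=C comm -13 "$fm_tmp/old" "$fm_tmp/new" > "$fm_tmp/added"

    sum_list "$old_tree" "$fm_tmp/gone" > "$fm_tmp/gone.sum"
    sum_list "$new_tree" "$fm_tmp/added" > "$fm_tmp/added.sum"

    # Same checksum and size on both sides: likely the same file, moved.
    # Identical small files (e.g. empty ones) can produce false pairs.
    LC_ALL=C join "$fm_tmp/gone.sum" "$fm_tmp/added.sum" |
        awk '{ print $2 " -> " $3 }'

    rm -rf "$fm_tmp"
}
```

The output is only a list of candidates for manual review; a moved file that carries MINIX patches would still need to be handled by hand, as described above.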
Sorry it's been so long; school and other stuff got in the way.
This script is awesome! I tested it out with an automated netbsd git repo I found ( https://github.com/jsonn/src ) and it seems to work great! (I really didn't want to take the time to set up the cvs stuff)
That should really reduce the manpower required to resync the two and so maybe that could be done before 3.4.0? Looks like it was last synced in October 2015 which is kind of a long time if we want to follow the main branch of netbsd, which it looks like we are currently doing.
We could also sync to the stable or security branches and just patch stuff if there is a vulnerability. There hasn't been one since we last synced but there's certainly been a lot of work done in netbsd since then. Following release branches instead would require much less resyncing than if we wanted to constantly follow main.
I'm just concerned that we use basically all netbsd for userland but don't seem to sync up often. This could lead to vulnerabilities, bugs, and other general badness as time wears on and the projects get out of sync. I realize this is mostly a time thing, because the project doesn't have many people, but that's why I opened this in the first place.
There still certainly is a non-negligible amount of work involved in resyncing and I don't know enough yet to try the whole process myself. Is there any other major bottleneck to resyncing regularly or with stable branches?
I will try to look into finding moved files but I don't have a lot of time recently as can be seen from how long it took me to get back to this.
Let me add my experience with the NetBSD issue. tl;dr: use Fossil. Long story: for a while I intended to follow the NetBSD source tree. CVS was not an option because I wanted to have (some) access to history; also, I was not available enough, so I could not afford to synchronize it every few hours, as it is supposed to be used. So I found Joerg's work on the git repository linked above and started to use it. It was great for a while, but after a couple of months of irregular activity, fetching stopped working; the underlying reason was synchronisation issues. After investigation, I understood that the git repository was a by-product of the conversion from CVS to the Fossil source code manager, which is a tool more targeted at the purpose. So I switched my NetBSD online reference repository from Git to Fossil (it even worked on my Windows machine, which is a net gain for me), and the synchronisation issues disappeared. Obtaining on-the-fly git copies of the Fossil repository is not cheap, but it is acceptable on modern hardware and connectivity if you do it from time to time. Alternatively, the couple of scripts which have been produced, most of them referenced in this thread, could certainly be adapted to use Fossil instead of Git.
I did not investigate producing a MINIX3 port of the Fossil client, but I do not believe it to be a big problem; a starting point is already there: http://pkgsrc.se/devel/fossil
P.S.: regarding the script just above: there is an "unsupported" feature of Fossil which could perhaps match git grep: test-grep
The way we import the NetBSD userland now makes it very difficult to update, as the minix-related changes are not stored in patch files anywhere.
My initial idea is that we should have a minix/patches directory where a subdirectory of the entire tree is stored and a patch file is created for each program. So for example:
This would be a huge amount of effort, so I would like to get some feedback on this design before it is implemented. Is this a good idea? Is there a better way? Any ideas on how these patches could be kept in sync and applied?
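To make the per-program patch idea concrete, here is a rough sketch of how one such patch file might be generated, assuming a pristine NetBSD checkout and the MINIX tree sit side by side on disk. The function name gen_patch, the `<program>.patch` naming and the directory arguments are all hypothetical; nothing here reflects an existing MINIX tool or layout. It relies on `diff -ruN`, which GNU and BSD diff both accept, although `-N` is not required by POSIX.

```shell
#!/bin/sh
# Sketch only: write one unified patch per program directory.
# gen_patch and the "<program>.patch" layout are hypothetical.
#
# gen_patch PRISTINE_TREE MINIX_TREE PATCH_DIR PROGRAM_PATH
gen_patch() {
    pristine=$1
    modified=$2
    patchdir=$3
    prog=$4          # e.g. bin/cat, relative to both trees

    out="$patchdir/$prog.patch"
    mkdir -p "$(dirname "$out")" || return 1

    # diff exits 0 when identical, 1 when different, >1 on trouble.
    diff -ruN "$pristine/$prog" "$modified/$prog" > "$out" && status=0 || status=$?

    case $status in
        0) rm -f "$out" ;;                      # no local changes: no patch file
        1) ;;                                   # differences recorded in $out
        *) rm -f "$out"; return "$status" ;;    # diff itself failed
    esac
}
```

Running it over a whitelist of program directories would leave patch files only for the programs that actually carry local changes, which also shows at a glance which parts of the tree are unmodified.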