Closed dirkmueller closed 5 months ago
Rather than parsing repomd in a ruby module that is maybe less well maintained, have you thought about using libsolv instead? libsolv has support for parsing this highly efficiently and wouldn't have such compatibility issues.
Running repomd2solv path/to/repo | dumpsolv and parsing the output might be significantly more reliable.
Thanks for letting us know Dirk.
As I understand it, both libsolv and nokogiri use libxml2 as their backend library. But performance for parsing file lists is not really an issue in RMT anyway, since most of the time is spent downloading packages.
I looked into porting RMT to libsolv and came to the conclusion that, given the current state of the libsolv Ruby bindings, the lack of documentation on libsolv itself (at least I failed to find API documentation and had to browse the source), and the generally different focus of the project (SAT dependency solving vs. brainless mirroring of files), Ivan's repomd-parser is simply easier to use. Not to mention that the SWIG generation is far from optimal.
Why would you think that parsing the meta information first into .solv dumps is more reliable than parsing the XML directly via libxml2/nokogiri, given RMT's use case of just iterating over all referenced files? Maybe I'm missing the point :)
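To illustrate what "iterating over all referenced files" amounts to, here is a minimal sketch that prints the href of every location element in a repomd.xml. The function name list_repomd_locations is hypothetical, and the grep/sed matching is only an assumption-laden shortcut for illustration; a real XML parser such as nokogiri is more robust.

```shell
#!/bin/bash
# Hypothetical helper (not part of RMT or repomd-parser):
# print the href attribute of every <location> element in the given file.
list_repomd_locations() {
    # grep -o emits each matching <location ... href="..."> fragment on its own line,
    # sed then strips everything except the attribute value.
    grep -o '<location[^>]*href="[^"]*"' "$1" | sed 's/.*href="\([^"]*\)"/\1/'
}
```

Each printed path is relative to the repository root, so a mirroring tool would prepend the repository URL before downloading.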
The conclusion might change if libsolv shipped with Debian repository support enabled, but I could not find any information on the maintenance status of these code paths in libsolv. Do you have insights?
Sadly it is disabled in current openSUSE/SUSE distributions.
From a maintenance perspective, Ivan (former SUSE employee and author of RMT) is usually pretty fast in reacting, and if the time comes that the project is abandoned, I see no problem with forking and maintaining the project within the SCC realms in the future.
+1 for adding zstd support to RMT. I have a customer that is using RMT to mirror Tumbleweed as a custom repository for their developers, and it stopped working last week. I just found out about this issue here. Should I open a formal case for the customer?
In case someone else needs this, this is the script I'm using on the RMT server itself to mirror the Tumbleweed repositories until this is solved.
#!/bin/bash
# Workaround script to download Tumbleweed OSS repository
# Basically, RMT does not support the recent move to ZSTD repodata, so it doesn't download the packages.
# https://github.com/SUSE/rmt/issues/1050
#
# Erico Mendonca <erico.mendonca@suse.com>
#
trap cleanup SIGINT SIGTERM
cleanup() {
    echo "killing all wget instances, please wait..."
    killall -9 wget
    echo "Download stopped."
    exit 1
}
MAINURL="https://download.opensuse.org/tumbleweed/repo/oss"
ARCHS="repodata i586 i686 x86_64 noarch"
REPODIR="/var/lib/rmt/public/repo/tumbleweed/repo/oss"
# stop running wgets, if any
killall wget
# note: I'm including repodata just for the sake of completion. RMT (still) downloads the zstd repodata correctly, just doesn't parse it.
for f in ${ARCHS}; do
    echo "---> Mirroring ${f}..."
    mkdir -p "${REPODIR}/${f}"
    # abort instead of downloading into the wrong directory if cd fails
    cd "${REPODIR}/${f}" || exit 1
    screen -dmS "download-${f}" wget -c -m -np -nH --cut-dirs=4 --reject '*.mirrorlist' "${MAINURL}/${f}/"
    cd ..
done
# wait for everything to finish...
while [ "$(screen -list | grep -c download-)" -gt 0 ]; do
    echo "Waiting for downloads to finish... (next try: $( date -d "+10 min"))"
    screen -list | grep download-
    sleep 600
done
# cleanup the indexes
echo "Changing permissions on ${REPODIR}..."
chown -R _rmt:nginx "${REPODIR}"
find "${REPODIR}" -name index.html -delete
find "${REPODIR}" -name robots.txt -delete
cd -
echo "---> Done."
Just place it into /etc/cron.daily and it should do the job. It's not the fastest way to do this (RMT is way faster), but at least it's downloading the directories in parallel.
Any news on this issue, @dirkmueller ?
@doccaz We're working on this at the moment https://github.com/ikapelyukhin/repomd-parser/pull/13
I've released repomd-parser v0.1.6 to Rubygems. Please make sure that RMT's RPM package has the zstd library as a dependency.
This will be released in the upcoming RMT release 2.15.
As part of SUSE Hack Week 23, openSUSE Tumbleweed switched to zstd for metadata compression. However, the underlying repomd-parser cannot handle those files, as it only expects .gz. As a result, the mirroring of packages fails. See https://github.com/ikapelyukhin/repomd-parser/issues/12 for more information.
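For anyone wanting to check which compression a given metadata file actually uses, here is a minimal sketch that looks at the file's leading magic bytes instead of trusting the extension. The function names detect_compression and decompress_metadata are hypothetical and are not how repomd-parser implements this; the sketch assumes the gzip and zstd command-line tools are installed.

```shell
#!/bin/bash
# Hypothetical helper: report the compression format of a file
# by inspecting its first four bytes (gzip: 1f 8b, zstd: 28 b5 2f fd).
detect_compression() {
    local file="$1" magic
    # Read the first four bytes as lowercase hex without separators.
    magic=$(head -c 4 -- "$file" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        1f8b*)    echo "gzip" ;;
        28b52ffd) echo "zstd" ;;
        *)        echo "unknown" ;;
    esac
}

# Hypothetical helper: write the decompressed content to stdout,
# passing unrecognized files through unchanged.
decompress_metadata() {
    local file="$1"
    case "$(detect_compression "$file")" in
        gzip) gzip -dc -- "$file" ;;
        zstd) zstd -dc -- "$file" ;;
        *)    cat -- "$file" ;;
    esac
}
```

Checking the magic bytes means a renamed or mislabeled file is still handled correctly, whereas a check on the .gz extension alone is what breaks when a repository switches formats.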