fedora-infra / mirrormanager2

Rewrite of the MirrorManager application in Flask and SQLAlchemy
https://mirrormanager.fedoraproject.org
GNU General Public License v2.0

umdl: delete make_file_details_from_checksums #205

Closed adrianreber closed 7 years ago

adrianreber commented 7 years ago

The function make_file_details_from_checksums() looks for certain files which contain the checksums for other files. This is (or rather, was) used to store the checksums of the ISOs in the database. Looking at the file_details table right now, there are about 3000 entries, of which 900 are checksums for files other than repomd.xml. The checksums for files which are not repomd.xml are not used anywhere and therefore do not need to be in the database at all.

Additionally, the function make_file_details_from_checksums() has been broken since Fedora 21, as the format of the checksum file (Fedora-Server-22-x86_64-CHECKSUM) has changed to:

SHA256 (Fedora-Server-DVD-x86_64-22.iso) = b2acfa7c7c6b5d2f51d3337600c2e52eeaa1a1084991181c28ca30343e52e0df

This format is not understood by the function make_file_details_from_checksums(). Since Fedora 22 no new checksums have been added to the database and nobody seems to have missed them. Therefore this patch removes all the make_file_details_from_checksums() code.
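For illustration, here is a minimal sketch of how a line in the newer format shown above could be parsed. It is not the MirrorManager code; the function name parse_bsd_checksums and the regular expression are assumptions made for this example, and it only handles lines of the form `ALGO (filename) = digest`, skipping everything else.

```python
import re

# Matches lines such as "SHA256 (name.iso) = <hex digest>".
# Hypothetical helper for illustration; not part of MirrorManager.
_CHECKSUM_LINE = re.compile(
    r"^(?P<algo>[A-Za-z0-9-]+) \((?P<name>.+)\) = (?P<digest>[0-9a-fA-F]+)$"
)


def parse_bsd_checksums(text):
    """Return {filename: (algorithm, digest)} for each matching line."""
    result = {}
    for line in text.splitlines():
        match = _CHECKSUM_LINE.match(line.strip())
        if match is None:
            # Anything that is not a checksum line is ignored.
            continue
        result[match.group("name")] = (
            match.group("algo").lower(),
            match.group("digest").lower(),
        )
    return result


sample = (
    "this line is not a checksum\n"
    "SHA256 (Fedora-Server-DVD-x86_64-22.iso) = "
    "b2acfa7c7c6b5d2f51d3337600c2e52eeaa1a1084991181c28ca30343e52e0df\n"
)
parsed = parse_bsd_checksums(sample)
```

With the sample above, parsed maps the ISO file name to a tuple of the lowercased algorithm name and the digest.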

Signed-off-by: Adrian Reber <adrian@lisas.de>

adrianreber commented 7 years ago

This PR only deletes unnecessary and broken code.

@mdomsch, do you remember if the checksums of files other than repomd.xml were ever used anywhere?

mdomsch commented 7 years ago

They were used to get the checksums for the ISOs, which are put into the metalink. For example:

http://mirrors.fedoraproject.org/metalink?path=pub/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso

returns a metalink which includes the hash.

<?xml version="1.0" encoding="utf-8"?> <metalink version="3.0" xmlns="http://www.metalinker.org/" type="dynamic" pubdate="Fri, 24 Mar 2017 19:59:41 GMT" generator="mirrormanager" xmlns:mm0="http://fedorahosted.org/mirrormanager">

1437065992
999292928
cc0333be93c7ff2fb3148cb29360d2453f78913cc8aa6c6289ae6823372a77d2
rsync://mirrors.rit.edu/fedora-buffet/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
http://mirrors.rit.edu/fedora/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
rsync://mirror.math.princeton.edu/pub/fedora-archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
http://mirror.math.princeton.edu/pub/fedora-archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
http://pubmirror1.math.uh.edu/fedora-buffet/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
https://pubmirror1.math.uh.edu/fedora-buffet/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
rsync://pubmirror1.math.uh.edu/fedora-archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
rsync://pubmirror2.math.uh.edu/fedora-archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
http://pubmirror2.math.uh.edu/fedora-buffet/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
https://pubmirror2.math.uh.edu/fedora-buffet/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
http://kdeforge2.unl.edu/mirrors/fedora-archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso
https://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso

This was preferred over reading the ISO and calculating the hash directly in MM (it is faster because the ISO is not read from disk, and the hash is then guaranteed to match the signed checksum file too).
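To make the trade-off concrete, here is a rough sketch of what calculating the hash directly would involve: every byte of the ISO has to be read from disk. The helper name sha256_of_file and the chunk size are assumptions for this example and are not taken from MirrorManager.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file by streaming it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For an image of roughly 1 GB, like the one in the example above, this means about 1 GB of disk reads per file, whereas taking the value from the checksum file only requires reading a few lines of text.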

mdomsch commented 7 years ago

The better question then is, does anything use the metalinks to get the ISOs? That I can't answer. :-( If we don't publish them anywhere, likely no. The apache logs could tell you though...

adrianreber commented 7 years ago

@mdomsch, thanks again for the insights.

@nirik, do you know if we ever used the metalink checksums for anything? Fedora 21 is the last release with working ISO metalink checksums.

nirik commented 7 years ago

I think all the links on getfedora.org use download.fedoraproject.org which just uses mm to redirect to a mirror. I don't think we have anything offering metalinks of iso images. Although that might be something that mediawriter could take advantage of...

adrianreber commented 7 years ago

I am in favor of removing the code, as until now we did not use it, and for something like mediawriter there are the -CHECKSUM files, which are GPG signed.

Just had a look at mediawriter and it seems they are hardcoding sha256 in their releases.json:

https://github.com/MartinBriza/MediaWriter/blob/master/app/assets/releases.json

Using metalinks for the ISO checksums sounds like a good alternative. But the GPG signed CHECKSUM files are probably also a good alternative.

I will open a mediawriter issue to see what they think about it.

adrianreber commented 7 years ago

https://github.com/MartinBriza/MediaWriter/issues/78

adrianreber commented 7 years ago

As mentioned in MartinBriza/MediaWriter#78, MediaWriter would be willing to use the metalink information. I am closing this PR and will fix the checksum code instead of deleting it.

juhp commented 4 years ago

If /metalink?path=... is no longer used - wouldn't it be simpler just to remove it completely, or are there still envisaged use-cases?

Web API documentation is quite sparse btw - is there any easy way to see the full API?