Closed Zlopez closed 3 years ago
It looks much like #142, but I don't know why it pops up again. Unlike requests
, aiohttp
does check content-length by default.
There are multiple other apps hit by this at the same time, and seems like all affected sources are hosted on SourceForge.
https://github.com/flathub/org.octave.Octave/pull/142
https://github.com/flathub/org.gnome.Evince/pull/47
https://github.com/flathub/org.gnome.Chess/pull/28
https://github.com/flathub/org.gnome.Books/pull/26
https://github.com/flathub/org.freedesktop.Sdk.Extension.texlive/pull/45
https://github.com/flathub/org.flightgear.FlightGear/pull/39
https://github.com/flathub/net.gwyddion.Gwyddion/pull/23
https://github.com/flathub/me.kozec.syncthingtk/pull/34
https://github.com/flathub/io.github.peazip.PeaZip/pull/5
https://github.com/flathub/im.riot.Riot/pull/175
https://github.com/flathub/com.sweethome3d.Sweethome3d/pull/22
https://github.com/flathub/com.chez.GrafX2/pull/5
And the invalid new checksum is the same one everywhere: 251a85b3bac687974f360d3796048c20ded3bf0bd69e0d1cfd1db23d013f89ed
Thanks @gasinvein for mentioning the affected MRs here :)
Any idea what has happened? Probably some SourceForge issue? Looking at flathub download cache, it seems like this has occurred before.
Source forge seems to have no permalinks to source code and/or employing some sort of blocking which results in the external data checker getting confused.
Given that download-sources
job with the incorrect checksum has completed successfully for Element, libsoup/libbcurl didn't detect any http error either. Which means that SourceForge did actually sent invalid response without any errors (no incomplete read or anything). How can we catch this case? Is it possible at all?
Iirc they send a valid page that is just HTML. At least that's what I got last time. So it's more of a figuring out how to use source forge than external a data checker issue.
Looks like this has happened again. https://github.com/flathub/io.github.peazip.PeaZip/pull/6 https://github.com/flathub/me.kozec.syncthingtk/pull/40 https://github.com/flathub/net.gwyddion.Gwyddion/pull/25 https://github.com/flathub/org.ardour.Ardour/pull/25 https://github.com/flathub/org.freedesktop.Sdk.Extension.texlive/pull/48 https://github.com/flathub/org.gnome.Books/pull/31 https://github.com/flathub/org.gnome.Evince/pull/50 https://github.com/flathub/org.octave.Octave/pull/156 https://github.com/flathub/org.srb2.SRB2Kart/pull/23 https://github.com/flathub/uk.co.ibboard.cawbird/pull/9
This hit me for the first time today :) https://github.com/flathub/org.tuxpaint.Tuxpaint/pull/15 – I wish we had the contents of the offending page, given that we could add a hack…
And the invalid new checksum is the same one everywhere:
251a85b3bac687974f360d3796048c20ded3bf0bd69e0d1cfd1db23d013f89ed
Oh, I missed this point before.
I know it's an awful kludge but maybe we should denylist this checksum.
Another idea:
https://github.com/flathub/flatpak-external-data-checker/blob/master/src/lib/utils.py#L116-L129
There is no check that the response code is 2xx
. I wonder if the response code would turn out to be 5xx
in this failure case.
There is no check that the response code is
2xx
. I wonder if the response code would turn out to be5xx
in this failure case.
aiohttp raises an exception by default when it receives status 4xx or 5xx.
We ask aiohttp to raise an exception on receiving response with error status here:
https://github.com/flathub/flatpak-external-data-checker/blob/435d2d5de3d72833d7300c0b0ccf2ff2bb637b64/src/lib/externaldata.py#L362-L363
I believe we have tests covering that.
I know it's an awful kludge but maybe we should denylist this checksum.
I'm considering this as the last resort. Maybe we could check Content-Type
header? Clearly it shouldn't be text/html
for an archive. Although would be good if we could check what's the Content-Type
of the erroneous sourceforge response.
We've captured the erroneous page contents. As of time of whiting this post, it's available in flathub download cache. The contents are:
<html><head>
<title>SourceForge</title>
<!-- <script src="/js/jquery.com/jquery-1.11.0.min.js"></script> -->
<script src="https://code.jquery.com/jquery-1.11.0.min.js"></script>
<script src="https://sourceforge.net/js/mirrors.js"></script>
<script src="/js/sf.js"></script>
<script>
var DR_loc = DR_parse_hash_url();
if (DR_loc) {
DR_sf_main(DR_loc);
} else {
window.location.href = 'https://sourceforge.net/home.html';
}
</script>
<meta name="description" content="Free, secure and fast downloads from the largest Open Source applications and software directory - SourceForge.net"/>
</head><body>
<noscript>
We're sorry -- the Sourceforge site is currently in Disaster Recovery mode. Please check back later.
</noscript>
</body></html>
Hi,
today the flathubbot opened a strange PR on org.gnome.Chess. PR is opened for dependency which doesn't have
x-data-checker
options set up and it just changed the hash without changing the URL. I looked at the URL and the file had last update in 2016.I will close the PR for now, because the build is failing anyway.