Open net-wayfarer opened 3 years ago
+1 for checksums -- but FWIW, the issue with that some-builds tarball isn't that it's corrupt as much as that it's empty. All I get is ~200MB of null bytes!
$ hexdump -C ath10k-9984-10-4b/bisect/some_builds-9984-Q.tar.gz
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0cb690e0 00 00 00 00 00 00 |......|
0cb690e6
You're right, there appears to be no actual data in them, just null bytes. There are quite a number of files like that:
./ath10k-9984-10-4b/bisect/all_builds-9984-H-feb-2-2020.tar.gz
./ath10k-9984-10-4b/bisect/all_builds-9984-H-jan-28-2020.tar.gz
./ath10k-9984-10-4b/bisect/all_builds-9984-H-march-5-2020.tar.gz
./ath10k-9984-10-4b/bisect/all_builds-9984-may-14-2020-full-community.tar.gz
./ath10k-9984-10-4b/bisect/all_builds-9984-qcache.tar.gz
./ath10k-9984-10-4b/bisect/some_builds-9984-Q.tar.gz
./ath10k-10-4/all-images/all_builds.9980-full-community.tar.gz
./ath10k-9888-fw.tar.gz
./ath10k-9984-10-4b/ath10k-fw-beta/all_builds-9984b-xt-5-900.tar.gz
./ath10k-9984-10-4b/ath10k-fw-beta/all_builds-9984b-full-community-5-862.tar.gz
./ath10k-9984-10-4b/ath10k-fw-beta/all_builds-9984b-H-feb-6-2019.tar.gz
./ath10k-9984-10-4b/ath10k-fw-beta/all_builds-9984b-march-21-2019.tar.gz
./all_builds.tar.gz
./ath10k-4019-10-4b/bisect/all_builds-4019-xH.tar.gz
./ath10k-4019-10-4b/bisect/all_builds-partial-tH-4019.tar.gz
./all_builds.9980.tar.gz
./ath10k-10-4b/all_builds/all_builds-9980-wmi-commit-900+jan-11-2019.tar.gz
./ath10k-10-4b/all_builds/all_builds-9980-wmi-jan-3-2019.tar.gz
./ath10k-10-4b/all_builds/all_builds-9980-htt-dec-21-2018.tar.gz
./all_builds-4019.tar.gz
Could this be intentional, to make the files look like they are taking up server space? Or does it have something to do with a lack of permissions, so that one receives null data when trying to fetch them? Update: These appear to be what are known as sparse files, and a possible simple way to find potential sparse files is through this method.
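For anyone who wants to check their own downloads for the same symptom, here is a minimal sketch (not necessarily the method referred to above). It assumes `tr`, `head -c`, and `wc` are available, and the function names `is_all_zero` and `list_all_zero` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any existing tool.

# Succeeds when the file is non-empty and contains only NUL bytes.
is_all_zero() {
    [ -s "$1" ] || return 1
    # Strip every NUL byte; if even one byte survives, the file has real data.
    remaining=$(tr -d '\000' < "$1" | head -c 1 | wc -c)
    [ "$remaining" -eq 0 ]
}

# Print every *.tar.gz under the given directory that is nothing but NUL bytes.
list_all_zero() {
    find "$1" -type f -name '*.tar.gz' | while IFS= read -r f; do
        if is_all_zero "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running `list_all_zero .` from the top of a mirrored copy would list the affected tarballs. This checks the downloaded content rather than on-disk sparseness, so it works on the client side regardless of how the server stores the files.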
EDIT: ./all_builds-9984b.tar.gz appears to contain data but is apparently incomplete:
$ tar tvf ./all_builds-9984b.tar.gz
drwxrwxr-x greearb/greearb 0 2018-08-14 09:00 all_builds/
-rw-rw-r-- greearb/greearb 566840 2018-08-14 06:23 all_builds/firmware-5-htt-mgt-commit-711-8f30dfd.bin
-rw-rw-r-- greearb/greearb 544704 2018-08-14 08:15 all_builds/firmware-5-htt-mgt-commit-785-664fd58.bin
-rw-rw-r-- greearb/greearb 566504 2018-08-14 06:05 all_builds/firmware-5-htt-mgt-commit-685-512cbbb.bin
-rw-rw-r-- greearb/greearb 566848 2018-08-14 06:40 all_builds/firmware-5-htt-mgt-commit-730-2d8714a.bin
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
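A truncated archive like this one can be detected without listing or extracting it. The following is a minimal sketch, assuming `gzip` is installed; `check_gz` is a hypothetical helper name. Note that `gzip -t` only validates the compressed stream, so it cannot tell whether a cleanly compressed tarball holds every file it was meant to.

```shell
#!/bin/sh
# Hypothetical helper: report whether a .gz file decompresses cleanly.
# `gzip -t` tests the compressed stream (including its CRC) without writing output.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        printf 'OK       %s\n' "$1"
    else
        printf 'DAMAGED  %s\n' "$1"
        return 1
    fi
}
```

Something like `for f in *.tar.gz; do check_gz "$f"; done` would then flag both the truncated archives and the all-null ones, since neither is a valid gzip stream.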
We lost a raid disk and had a bad time trying to recover some time back, so some files were lost. The 'all builds' things I can just delete and re-create as needed. Others are more difficult to repair, though evidently most of the firmware images should be somewhere on owrt cache servers. If someone wants to find those that matter and send them to me, I'll repopulate my web server.
Sorry to hear that. I went ahead and grabbed the firmware files from the OpenWrt downloads server to try to help with the process. It turns out that mirroring the releases with a very specific --accept pattern, so that only ath* files are mirrored, produced lots of duplicates. This is to be expected, since the board and firmware files are unlikely to be compiled differently across the multiple (platform) targets of a release.
Long story short, I have decided to share my findings (temporarily) in my own repo, since it is unlikely that they are what is missing. In my repo, ath10k-ct-openwrt-firmware.tar.xz is simply an XZ-compressed tarball containing only the firmware files themselves, inside the candeltech_fw directory, along with a SHA512SUMS file. The openwrt_packages are all the ath* ipk files that contained the firmware binaries. These ipk files also contain board.bin, which I did not bother extracting.
For those interested in reproducing how I went about this, the steps are loosely as follows:
wget -m --no-parent --no-host-directories --no-directories "https://downloads.openwrt.org/releases/" --accept "ath*.ipk"
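# Set aside the ath6k and ath9k packages
mkdir unwanted ; mv ath[69]k* unwanted
# Set aside every remaining package whose name does not contain "ct"
mkdir non_ct ; find . -maxdepth 1 -type f ! -name "*ct*" -exec mv -t non_ct '{}' \;
# Extract the firmware .bin from the data.tar.gz inside each ipk; the file name is passed as $1 and the suffix swap is plain POSIX sh
find . -maxdepth 1 -type f -name "*.ipk" -exec sh -c 'tar zxf "$1" ./data.tar.gz -O | tar zxf - --wildcards "*firmware*.bin" --to-stdout > "${1%.ipk}.bin"' sh '{}' \;
# Collect the extracted firmware files (the test directory has to exist first)
mkdir test ; find . -maxdepth 1 -type f -name "*.bin" -exec mv -t test '{}' \;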
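Since mirroring every release produces many byte-identical copies, one way to reduce the extracted files to the distinct ones is to group them by checksum. This is only a sketch, assuming GNU coreutils `sha512sum` and file names without newlines; `unique_by_sha512` is a hypothetical name, not something from the repo mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print one "hash  path" line per distinct file content
# among the *.bin files in the given directory, in SHA512SUMS-style format.
unique_by_sha512() {
    # sha512sum prints "<hash>  <path>"; sorting groups identical hashes
    # together, and awk keeps only the first path seen for each hash.
    sha512sum "$1"/*.bin | sort | awk '!seen[$1]++'
}
```

For example, `unique_by_sha512 test` would list one representative file per distinct firmware image collected in the test directory.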
Not entirely sure where else I could possibly look.
[SUGGESTION] Please provide checksums for at least the firmware files offered on website.
First and foremost, I would like to thank the project owner for providing public access to the project and for continually hosting a commit history covering virtually every change to the firmware files.
The files hosted on the Candelatech website lack checksums, even though they carry timestamps of when they were last modified. A last-modified date does not tell a user who downloads a file whether its integrity is intact; checksums would, and they may also help explain other inconsistencies. Firmware files in particular can be very susceptible to corruption when the mechanism for transparency is lacking because of an NDA, which is understandable but a different matter.
As an example, take one specific compressed tarball downloaded from the website, in this case some_builds-9984-Q.tar.gz. Its contents cannot be read, and invoking file on the exact same file reports that it is not a valid gzip file.
However, this is not the case with other compressed tarballs. Take for example all_builds-9984-H-dec-7-2020.tar.gz, whose contents can be read and thus extracted.
The failure is reproducible 10 out of 10 times when trying to list the contents of the some_builds-9984-Q.tar.gz compressed tarball after repeatedly downloading it from the website. This data corruption could have a cascading effect on firmware files that are stored as-is on the website.
An example of checksums being provided can be found on a similar project, an old website hosting (Intersil/Conexant) Prism54 firmware (there is an alternative/archived link in case the website is down or unreachable).
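What is being requested could look roughly like the following. This is a sketch assuming GNU coreutils `sha256sum`; the file name SHA256SUMS and the function names are illustrative only, not something the website currently provides.

```shell
#!/bin/sh
# Illustrative only: SHA256SUMS is an assumed file name, not an existing one.

# Server side: record a checksum for every tarball in the directory.
make_sums() {
    ( cd "$1" && sha256sum -- *.tar.gz > SHA256SUMS )
}

# Client side: re-hash the downloaded files and compare against the list.
# Returns non-zero if any file is missing or does not match.
verify_sums() {
    ( cd "$1" && sha256sum -c SHA256SUMS )
}
```

With a list like this published next to the tarballs, a user could tell immediately whether a bad download happened on their side or whether the hosted file itself had changed.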