microsoft / winget-pkgs

The Microsoft community Windows Package Manager manifest repository
MIT License

Bot behavior - 404 handling #69497

Open Trenly opened 2 years ago

Trenly commented 2 years ago

Background

One of the activities of the bots is to scan the repo to ensure that installer hashes match and that URLs do not return 404. This is fairly complex behavior, so to level-set and confirm that my understanding is correct, please let me know if I am missing any part of the intended functionality here, or if it has changed since the original implementation.

Expected Behavior

1) Scan the installer URLs for the latest version of a package

With the above behavior, the latest version of a package will never be removed automatically, and community members will be notified that an update is required, decreasing the downtime for packages which are 404 and not able to be updated. Additionally, older versions of packages which are no longer working would be removed, ensuring the most accurate version list is available.

Discovery

I have a custom function I run which gets a URL response. This is the same function which is used in the YamlCreate script to determine if a URL is valid before accepting it. With this, I decided to do a test scan of the packages just out of curiosity, and found several 404s on versions that are not the latest version.
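For readers who do not have the gist at hand, the following is a minimal sketch of what such a URL check could look like. It is not the function from the gist or from YamlCreate: the name `Get-UrlStatusSketch` is hypothetical, and the returned `Url` and `ResponseCode` properties are chosen only to mirror the two properties the scan loop in the reproduction steps reads. It assumes a HEAD request is enough to detect a 404, which will not hold for servers that reject HEAD.

```powershell
# Hypothetical stand-in, NOT the gist's Get-UrlResponse.
function Get-UrlStatusSketch {
    param(
        [Parameter(Mandatory = $true)]
        [string] $Url
    )

    # Windows PowerShell 5.1 may not enable TLS 1.2 by default
    [System.Net.ServicePointManager]::SecurityProtocol =
        [System.Net.ServicePointManager]::SecurityProtocol -bor [System.Net.SecurityProtocolType]::Tls12

    $code = $null
    try {
        $request = [System.Net.WebRequest]::Create($Url)
        $request.Method = 'HEAD'      # ask for headers only, do not download the installer
        $request.Timeout = 15000      # milliseconds
        $response = $request.GetResponse()
        $code = [int]$response.StatusCode
        $response.Close()
    }
    catch {
        # 4xx/5xx answers surface as a WebException, possibly wrapped by PowerShell.
        # Walk the exception chain and read the status code from the attached response.
        $ex = $_.Exception
        while ($ex -and $ex -isnot [System.Net.WebException]) { $ex = $ex.InnerException }
        if ($ex -and $ex.Response) { $code = [int]$ex.Response.StatusCode }
        # Anything else (malformed URL, DNS failure, timeout) leaves ResponseCode as $null
    }

    [PSCustomObject]@{
        Url          = $Url
        ResponseCode = $code
    }
}
```

A call such as `Get-UrlStatusSketch -Url $installer.InstallerUrl` returns an object whose `ResponseCode` can be compared against 404. A `$null` code means no HTTP status was obtained at all, which is a different condition from a 404 and should not be counted as one.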

Reproduction

These reproduction steps were created on PowerShell 5.1.19041.1682 and have not been tested for newer versions.

1) Open PowerShell and install the powershell-yaml module: Install-Module -Name powershell-yaml
2) Clone the winget-pkgs repository and navigate to the directory in PowerShell
3) Collect all the installer files: $files = gci -recurse -File -Filter '*.installer.yaml'
4) Set up the Get-UrlResponse function by copying and pasting the code from the gist into the PowerShell session and running it
5) Loop across all the installers in all the manifests, testing the URL for a 404

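# Scan every installer manifest and record the ones that contain a 404 URL
$output = @()
$files | ForEach-Object {
    Write-Progress -Activity 'Scanning Manifests' -CurrentOperation $_.FullName -Id 1001
    # -Raw passes the file to ConvertFrom-Yaml as one string; UTF8 avoids the ANSI default of PowerShell 5.1
    $yaml = Get-Content -Path $_.FullName -Raw -Encoding UTF8 | ConvertFrom-Yaml
    foreach ($installer in $yaml.Installers) {
        Write-Progress -Activity 'Scanning Installer URL' -CurrentOperation $installer.InstallerUrl -Id 1002 -ParentId 1001
        $resp = Get-UrlResponse $installer.InstallerUrl
        if ($resp.ResponseCode -eq 404) {
            # $_ is still the manifest file here; the path is added once per failing installer
            $output += $_.FullName
            Write-Host "404 at $($resp.Url)"
        }
    }
}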

When doing this, I found that packages like Alibaba.Yuque had many versions which were 404. The latest version was not 404, but a majority of the older versions were.

A small portion of the output

```
404 at https://github.com/TDesktop-x64/tdesktop/releases/download/v1.0.38/64Gram-setup-x64-1.0.38.exe
404 at https://github.com/TDesktop-x64/tdesktop/releases/download/v1.0.38/64Gram-setup.1.0.38.exe
404 at https://download.effie.co/effie/effie_setup_2.1.5.exe
404 at https://download.effie.co/effie/effie_setup_2.1.7.exe
404 at https://8gadgetpack.net/dl_340/8GadgetPackSetup.msi
404 at https://aimp.ru/files/windows/builds/aimp_5.03.2394_w32.exe
404 at https://yunpan.aliyun.com/downloads/apps/desktop/aDrive-2.2.7.exe
404 at https://yunpan.aliyun.com/downloads/apps/desktop/aDrive-2.2.8.exe
404 at https://yunpan.aliyun.com/downloads/apps/desktop/aDrive-2.3.3.exe
404 at https://yunpan.aliyun.com/downloads/apps/desktop/aDrive-2.3.4.exe
404 at https://yunpan.aliyun.com/downloads/apps/desktop/aDrive-2.3.5.exe
404 at https://yunpan.aliyun.com/downloads/apps/desktop/aDrive-2.3.6.exe
404 at https://dtapp-pub.dingtalk.com/dingtalk-desktop/win_installer/Release/DingTalk_v6.3.25.1209106.exe
404 at https://dtapp-pub.dingtalk.com/dingtalk-desktop/win_installer/Release/DingTalk_v6.3.25.1219101.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-0.10.2.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-0.10.2.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-0.8.13.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-0.8.13.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-0.9.31.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-0.9.31.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-1.0.3.exe
404 at https://app.nlark.com/yuque-desktop/Yuque-1.0.3.exe
404 at https://download.ascension-patch.gg/update/ascension-setup-79.exe
404 at https://cdn.axis.com/ftp/pub_soft/cam_srv/cam_station/cam_station_preview/0_2130_58/AXISCameraStationSetup.exe
404 at https://issuepcdn.baidupcs.com/issue/netdisk/yunguanjia/BaiduNetdisk_7.17.5.19.exe
404 at https://issuepcdn.baidupcs.com/issue/netdisk/yunguanjia/BaiduNetdisk_7.17.6.2.exe
404 at https://issuepcdn.baidupcs.com/issue/netdisk/yunguanjia/BaiduNetdisk_7.18.0.15.exe
```
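To see at a glance which packages and versions are affected, the manifest paths collected by the scan could be grouped by package. The sketch below assumes each path ends in `<Version>/<PackageIdentifier>.installer.yaml`. The sample paths are made up for illustration and stand in for the real `$output` array; they are not taken from the actual scan.

```powershell
# Illustrative sample only; in a real session $output comes from the scan loop.
$output = @(
    'manifests/a/Alibaba/Yuque/0.8.13/Alibaba.Yuque.installer.yaml',
    'manifests/a/Alibaba/Yuque/0.8.13/Alibaba.Yuque.installer.yaml',   # same manifest, second failing installer
    'manifests/a/Alibaba/Yuque/0.9.31/Alibaba.Yuque.installer.yaml',
    'manifests/e/Example/App/1.2.3/Example.App.installer.yaml'         # hypothetical package
)

$summary = $output |
    Sort-Object -Unique |                      # a manifest is listed once per failing installer
    ForEach-Object {
        $parts = $_ -split '[\\/]'             # accept both path separators
        [PSCustomObject]@{
            Package = $parts[-1] -replace '\.installer\.yaml$', ''
            Version = $parts[-2]               # the folder that holds the manifest
        }
    } |
    Group-Object -Property Package |
    ForEach-Object {
        [PSCustomObject]@{
            Package  = $_.Name
            Versions = @($_.Group.Version)     # every version with at least one 404 installer
        }
    }

$summary | Format-Table -AutoSize
```

Comparing each package's `Versions` list against the version folders that exist for it in the repository would then show whether the latest version is among the failures. That comparison is left out here because ordering version strings correctly needs more care than a plain sort.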

Related Items

denelon commented 2 years ago

We look at all the installers daily in each package version:

  1. If the package contains installers that fail with download failures or hash mismatch:
    • Create an issue
  2. If the package contains installers where their only failure is hash mismatch:
    • Create a PR to fix it, either with the new version or with an updated hash
  3. If the package contains installers where their only failure is download failed:
    • If some installers are failing for fewer than 5 consecutive days -> do nothing
    • If all the installers are failing for 5 or more consecutive days:
      • If the version is the latest:
        • Create issue
      • If the version is not the latest:
        • Create PR to remove faulty installers or the entire version if all of them fail
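The rules above can be sketched as a small decision function. This is one reading of the list, not the bot's actual code: the function name, the parameter names, the property names on each installer object, and the returned action strings are all hypothetical. Two interpretations are assumed: rule 1 is taken to mean a package version with both download failures and hash mismatches, and the five-day rule is taken to apply to every failing installer.

```powershell
# Hypothetical sketch of the decision list; not the bot's implementation.
function Get-BotActionSketch {
    param(
        # Each element is assumed to expose DownloadFailed (bool), HashMismatch (bool), FailingDays (int)
        [Parameter(Mandatory = $true)]
        [object[]] $Installers,

        [Parameter(Mandatory = $true)]
        [bool] $IsLatestVersion,

        [int] $GraceDays = 5
    )

    $downloadFailures = @($Installers | Where-Object { $_.DownloadFailed })
    $hashMismatches   = @($Installers | Where-Object { $_.HashMismatch })

    # Nothing is failing
    if ($downloadFailures.Count -eq 0 -and $hashMismatches.Count -eq 0) { return 'None' }

    # Rule 1 (assumed reading): both kinds of failure are present
    if ($downloadFailures.Count -gt 0 -and $hashMismatches.Count -gt 0) { return 'CreateIssue' }

    # Rule 2: hash mismatch is the only failure
    if ($hashMismatches.Count -gt 0) { return 'CreateHashFixPR' }

    # Rule 3: download failure is the only failure
    $stillInGrace = @($downloadFailures | Where-Object { $_.FailingDays -lt $GraceDays })
    if ($stillInGrace.Count -gt 0) { return 'None' }          # possible temporary outage, wait
    if ($IsLatestVersion) { return 'CreateIssue' }            # never remove the latest version automatically
    if ($downloadFailures.Count -eq $Installers.Count) { return 'CreateRemoveVersionPR' }
    return 'CreateRemoveInstallersPR'
}

# Example: an older version whose single installer has been unreachable for a week
$oldVersion = @(
    [PSCustomObject]@{ DownloadFailed = $true; HashMismatch = $false; FailingDays = 7 }
)
Get-BotActionSketch -Installers $oldVersion -IsLatestVersion $false
```

Written this way, the 404s reported in this issue on non-latest versions would be expected to end in a removal PR once the grace period has passed, which is the behavior the scan results call into question.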

denelon commented 2 years ago

We're going to investigate if there is a bug in our logic. The five-day period is to allow the ISV/Publisher/CDN to recover from a temporary outage.