KSP-CKAN / CKAN

The Comprehensive Kerbal Archive Network
https://forum.kerbalspaceprogram.com/index.php?/topic/197082-*

Spec extension proposal: "downloads" #487

Closed AlexanderDzhoganov closed 1 year ago

AlexanderDzhoganov commented 9 years ago

I propose that we implement an extension called "downloads":

"downloads" : {
    "description" : "A list of alternative URLs where a mod can be downloaded by tools",
    "type"        : "array",
    "items"       : { "type" : "string", "format" : "uri" }
}
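
As a rough illustration of what the schema fragment above is meant to enforce, here is a minimal Python sketch that checks a metadata dictionary by hand. The function name `validate_downloads` is hypothetical and not part of CKAN. The "absolute URI" test (scheme and host both present) is a simplifying assumption rather than full URI validation.

```python
from typing import List
from urllib.parse import urlparse


def validate_downloads(metadata: dict) -> List[str]:
    """Return a list of problems found in the proposed "downloads" field.

    Hypothetical helper: an empty list means the field looks acceptable.
    """
    downloads = metadata.get("downloads", [])
    if not isinstance(downloads, list):
        return ['"downloads" must be an array']

    problems = []
    for index, item in enumerate(downloads):
        if not isinstance(item, str):
            problems.append(f"item {index} is not a string")
            continue
        parsed = urlparse(item)
        # Simplifying assumption: a usable entry has both a scheme and a host.
        if not parsed.scheme or not parsed.netloc:
            problems.append(f"item {index} is not an absolute URI: {item!r}")
    return problems
```

A tool could run this before trying any of the alternative URLs and skip or report entries that fail.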

For now, this will mostly allow mirrored CKAN metadata to continue to function if the mirror goes down, among other things.

Ippo343 commented 9 years ago

This issue is now particularly important at times like this when Kerbalstuff is down, making most mods in the index unavailable for download.

TeddyDD commented 9 years ago

Maybe it's better to change the type of the "download" field to an array in the next spec version?

Ippo343 commented 9 years ago

I agree with TeddyDD: imho, "download" should take an array of download locations to try.

pjf commented 9 years ago

My brain hasn't properly spun up today, but things to consider with regards to multiple download locations are:

  1. What if the files are different in different locations?
  2. Internally, how do we cache these? Right now we cache by URL, but if we have multiple URLs for the same file, we need a way to check if we've already got the file from an alternate URL.
  3. What happens with mods which share a download path, and those have ended up with different download lists? This is especially common when a mod has been split into assets and config.

For 1, implementing checksums (#62) would help verify that what we downloaded is what we expect. For 2 and 3, one could always consider the first URL to be the canonical URL, and files are cached as if they were downloaded from there, even if they were not.
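
To make the "first URL is canonical" idea concrete, here is a minimal Python sketch, not CKAN's actual implementation. It tries each location in order, rejects a download whose SHA-256 does not match (assuming checksums from #62 exist), and always stores the result under the first URL. The names `fetch_with_fallback`, `fetch`, and `cache` are hypothetical. The `fetch` callable stands in for whatever HTTP client is used and is assumed to raise `OSError` on failure.

```python
import hashlib
from typing import Callable, Dict, List, Optional


def fetch_with_fallback(
    urls: List[str],
    fetch: Callable[[str], bytes],
    cache: Dict[str, bytes],
    expected_sha256: Optional[str] = None,
) -> bytes:
    """Try each URL in order; cache the file under the first (canonical) URL."""
    if not urls:
        raise ValueError("no download URLs given")

    canonical = urls[0]
    if canonical in cache:
        # Already downloaded, possibly from an alternate location.
        return cache[canonical]

    errors = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:
            errors.append(f"{url}: {exc}")
            continue
        if expected_sha256 is not None:
            actual = hashlib.sha256(data).hexdigest()
            if actual != expected_sha256.lower():
                # Point 1: this location serves a different file, so skip it.
                errors.append(f"{url}: checksum mismatch")
                continue
        # Points 2 and 3: cache as if the file came from the canonical URL.
        cache[canonical] = data
        return data

    raise OSError("all download locations failed: " + "; ".join(errors))
```

Without a checksum, the sketch simply trusts the first location that responds, which is exactly the weakness point 1 describes.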

Point 3 could potentially be relieved by allowing mods to reference another mod's download section, rather than having to provide their own.

Although as mentioned in #489, the spec and CKAN core itself will never read, use, or validate any key starting with "x_", as they're specifically reserved for extensions which are not in the core, so this ticket could do with a rename (or a new ticket created) if we want to discuss extending the download field itself.

TeddyDD commented 9 years ago

  1. CKAN could recognize cached files by checksum if there is more than one URL in downloads (I assume checksums would be mandatory).

pjf commented 9 years ago

> I assume checksums would be mandatory.

That would be nice, but also a barrier to humans writing the metadata themselves.

Having said that, caching by checksum if it exists is most definitely a good idea, and should be implemented if we add checksums to the spec.
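
A small Python sketch of "cache by checksum if it exists", falling back to the first (canonical) URL otherwise. The function name `cache_key` and the key format are assumptions made for illustration, not CKAN's real cache layout.

```python
import hashlib
from typing import List, Optional


def cache_key(urls: List[str], sha256: Optional[str] = None) -> str:
    """Pick a cache key: the checksum when known, else a hash of the first URL."""
    if sha256:
        # Same file from any mirror maps to the same key.
        return "sha256-" + sha256.lower()
    if not urls:
        raise ValueError("need at least one URL when no checksum is given")
    # No checksum in the metadata: treat the first URL as canonical.
    return "url-" + hashlib.sha1(urls[0].encode("utf-8")).hexdigest()
```

With this scheme, hand-written metadata without checksums still works, and metadata that does carry a checksum shares cache entries across mods and mirrors.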

Ref #62 .

pjf commented 9 years ago

#935, while not superseding this, may certainly remove much of the need. :)

TeddyDD commented 9 years ago

> That would be nice, but also a barrier to humans writing the metadata themselves.

This is simple to solve. We just need to add a checksum calculation tool to netkan.exe (something like netkan.exe -checksum file.zip). Also, Linux users already have sha1sum etc.
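
For illustration, such a tool needs very little code. The Python sketch below is not the netkan.exe option suggested above (that flag does not exist yet). It only shows how a file's SHA-1 and SHA-256 can be computed in one pass. The function name `file_checksums` is hypothetical.

```python
import hashlib
from typing import Dict


def file_checksums(path: str, chunk_size: int = 1 << 20) -> Dict[str, str]:
    """Compute SHA-1 and SHA-256 hex digests of a file without loading it whole."""
    sha1 = hashlib.sha1()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large mod archives do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha1.update(chunk)
            sha256.update(chunk)
    return {"sha1": sha1.hexdigest(), "sha256": sha256.hexdigest()}
```

The `sha1` value should match the output of sha1sum for the same file, so authors on any platform could paste the result into their metadata.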

netkan-bot commented 9 years ago

Hey there! I'm a fun-loving automated bot who's responsible for making sure old support tickets get closed out. As we haven't seen any activity on this ticket for a while, we're hoping the problem has been resolved and I'm closing out the ticket automatically. If I'm doing this in error, please add a comment to this ticket to let us know, and we'll re-open it!

dbent commented 8 years ago

May as well re-open this as it's getting particularly relevant.

dbent commented 8 years ago

My thoughts:

dbent commented 8 years ago

Also, deprecate $kref so that a single mod can use multiple sources to automatically generate multiple download URLs.

Right now we have x_netkan_jenkins and will soon have x_netkan_github. The mere presence of these properties in a .netkan should indicate we want to use these sources.

For example:

{
    "x_netkan_jenkins": {
        "url": "https://ksp.sarbian.com/jenkins/job/ModuleManager/"
    },
    "x_netkan_github": {
        "user": "sarbian",
        "repo": "ModuleManager"
    }
}

Would generate multiple download URLs, one from Jenkins and one from GitHub. This would be trivial to do with how NetKAN's transformers work.

EDIT: Dealing with priorities may be a bit tricky though.

EDIT2: Thoughts on handling priorities:

Each source would specify a priority number, e.g.:

{
    "x_netkan_jenkins": {
        "url": "https://ksp.sarbian.com/jenkins/job/ModuleManager/",
        "priority": 1
    },
    "x_netkan_github": {
        "user": "sarbian",
        "repo": "ModuleManager",
        "priority": 2
    }
}

During transformation, each source transformer would add its download URL, size, and checksum to a temporary array:

{
    "x_netkan_downloads": [
        {
            "url": "https://jenkins.example/ModuleManager.zip",
            "checksum": {
                "sha256": "0123456789abcdef"
            },
            "size": 123456,
            "priority": 1
        },
        {
            "url": "https://github.example/ModuleManager.zip",
            "checksum": {
                "sha256": "0123456789abcdef"
            },
            "size": 123456,
            "priority": 2
        }
    ]
}

Then, at the end, there would be a transformer that takes the x_netkan_downloads array and produces an ordered downloads array from it, sorted by the priority given in each entry. Multiple sources having the same priority would be logged as a warning. This download transformer would also check that every source has the same checksum and size. If they don't agree, it would be an error and generation of the .ckan would be aborted. If they do agree, it would write out the download_size and checksum, which would be identical for every source.
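
The final transformer described here could look roughly like the following Python sketch. NetKAN itself is not written in Python, so this is only an illustration of the logic. The function name `finalize_downloads` and the logger name are hypothetical, and the input is assumed to have the x_netkan_downloads shape shown in the JSON above.

```python
import logging
from typing import Dict

log = logging.getLogger("netkan.sketch")  # hypothetical logger name


def finalize_downloads(metadata: Dict) -> Dict:
    """Turn the temporary x_netkan_downloads array into final download fields."""
    entries = metadata.get("x_netkan_downloads", [])
    if not entries:
        raise ValueError("no download sources were produced")

    priorities = [entry["priority"] for entry in entries]
    if len(set(priorities)) != len(priorities):
        # Duplicate priorities are only a warning; sorted() keeps input order for ties.
        log.warning("multiple sources share the same priority: %s", priorities)

    first = entries[0]
    for entry in entries[1:]:
        # Every source must describe the same file, otherwise abort generation.
        if entry["checksum"] != first["checksum"] or entry["size"] != first["size"]:
            raise ValueError(
                f"sources disagree: {first['url']} vs {entry['url']}"
            )

    ordered = sorted(entries, key=lambda entry: entry["priority"])
    result = {k: v for k, v in metadata.items() if k != "x_netkan_downloads"}
    result["downloads"] = [entry["url"] for entry in ordered]
    result["download_size"] = first["size"]
    result["checksum"] = first["checksum"]
    return result
```

Raising an exception on disagreement corresponds to aborting generation of the .ckan, and the temporary array is dropped from the output so it never reaches the index.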

EDIT3: An open question is what we should do if one download source fails during .ckan generation: abort the whole thing?

TeddyDD commented 8 years ago

@dbent :+1: