pf4j / pf4j-update

Update mechanism for PF4J
Apache License 2.0
69 stars 39 forks source link

Validate md5sum if given in PluginRelease #19

Closed janhoy closed 6 years ago

janhoy commented 7 years ago

After adding #17 I am thinking about another improvement.

What if plugins.json for each release, in addition to the url property could also support a new md5sum property. If that is not set, it will work as today, but if the md5sum is set then the framework would check that the downloaded file matches.

One benefit is to validate that the download was complete and not corrupt. Another is that when you trust the repo site hosting plugins.json file but not necessarily each download host where the artifacts reside. Imagine a central plugins.json listing a bunch of 3rd party plugins, but only as a reference, not hosting the binaries. A group of trusted committers would accept PRs against the plugins.json file, test the plugin and then merge the PR. If the contribution contains an md5sum, the community will know that they can trust that binary stays the same.

Ideally such a feature would be integrated inside the FileDownloader but currently that interface only takes a URL:

public Path downloadFile(URL fileUrl)

A solution could be to add a second argument DownloadOptions:

public Path downloadFile(URL fileUrl, DownloadOptions options)

where DownloadOptions could be a bean with a builder pattern, e.g.

file = downloader.downloadFile(myUrl, DownloadOptions.create().md5Validation(md5sum).basicAuth("john:pass"));

If null there would be no options, and it can be extended over time without changing the interface. Then the UpdateManager's installPlugin() and updatePlugin() methods could have a variant that includes md5 and pass it down.

Do you see the security benefit for this, and is it an acceptable design? Alternative ways? I could cook up a PR.

decebals commented 7 years ago

Definitely, the idea is good. In my mind the implementation is based on Decorator/Wrapper pattern without to change the actual signature of FileDownloader (it contains only a method Path downloadFile(URL fileUrl)). The FileDownloader implementation knows about how to validate the downloaded file.

For example if I want to use checksum on each downloaded file, I can create a class ChecksumFileDownloader similar with:

public class ChecksumFileDownloader implements FileDownloader {

    private FileDownloader fileDownloader;

    public ChecksumFileDownloader() {
        this(new SimpleFileDownloader());
    }

    public ChecksumFileDownloader(FileDownloader fileDownloader) {
        this.fileDownloader = fileDownloader;
    }

    @Override
    public Path downloadFile(URL fileUrl) throws PluginException, IOException {
        Path dowloadedFile = fileDownloader.downloadFile(fileUrl);
        // call "verifyChecksum" method

        return dowloadedFile;
    }

}
janhoy commented 7 years ago

How do you then convey the md5sum property from plugins.json to the verifyChecksum method? The checksum is different for each url, so it cannot be passed in ChecksumFileDownloader constructor in the way I did for my ApacheChecksumVerifyingDownloader, which finds the checksum file to download based on the fileUrl.

Since the FileDownloader interface is still new and unreleased it figured the least intrusive would be to add a generic DownloadOptions param to the signature. Or we could add a more generic Map<String,String> options argument but that is a bit too open ended and error prone to my taste.

Of course an alternative would be to place the md5 validation inside UpdateManager.downloadPlugin() and then validate checksum for the file that was just downloaded. It would still let users override the behavior, but I think it's not as nice as keeping download validation in one place, the FileDownloader

decebals commented 7 years ago

How do you then convey the md5sum property from plugins.json to the verifyChecksum method?

For each plugin file (zip or jar) I have a file with the same name but with suffix .md5 or .sha1 that contains the checksum (text file). To get this file I use fileDownloader.downloadFile(fileUrl + ".md5").

For variant with DownloadOptions how I could scale (add new custom options or remove the options that don't make sense for my Downloader). Can your add a link to your ApacheChecksumVerifyingDownloader to take a look?

janhoy commented 7 years ago

The ApacheChecksumVerifyingDownloader (link) does a variant of what you describe, only it downloads the checksum from a trusted site, while the binary artifact is downloaded from a 3rd party mirror site close to the user (to limit bandwidth on Apache servers). So that use case is easy to fulfill with current FileDownloader.

The usecase I describe in this issue is not to download a separate md5 file from the download site (that defeats the purpose if you don't trust that server), but to commit a static md5sum string into the plugins.json PluginInfo itself, so that once it is in there it is a guarantee that the binary file, wherever it may be hosted, is the same as it was when a committer accepted the PR for the plugins.json entry.

decebals commented 7 years ago

I think that I understand what you say. I forget to add the question sign to the end of sentence so I will ask you again: For variant with DownloadOptions how I could scale (add new custom options or remove the options that don't make sense for my FileDownloader) ?

janhoy commented 7 years ago

For variant with DownloadOptions how I could scale (add new custom options or remove the options that don't make sense for my FileDownloader) ?

I see your concern, that class would perhaps be too closed down for future use cases such as SCP download needing a ssh-key etc. Perhaps a Java Properties object is better, and document what properties that the framework uses such as md5sum.

Aside: Of course, to be 100% sure that the binaries are from a certain author you should verify signatures. For all Apache artifacts there's an .asc file as well with a PGP signature. I plan to integrate a signature check as well into my ApacheMirrorsUpdateRepository.

janhoy commented 6 years ago

See #29 for a better solution. I chose sha512 instead of md5 since md5 is now commonly viewed as insecure as a means of validating that the content of a file has not changed.