nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0
2.61k stars 606 forks source link

Inspect command improvement for Singularity #4999

Open pditommaso opened 1 month ago

pditommaso commented 1 month ago

The inspect command reports the container used during a pipeline execution, for example:

» nextflow inspect . -profile wave
{
    "processes": [
        {
            "name": "RNASEQ:INDEX",
            "container": "community.wave.seqera.io/library/salmon:1.10.2--fd38303983e26319"
        },
        {
            "name": "RNASEQ:QUANT",
            "container": "community.wave.seqera.io/library/salmon:1.10.2--fd38303983e26319"
        },
        {
            "name": "MULTIQC",
            "container": "community.wave.seqera.io/library/multiqc:1.17--aa7f4f58d1a63e37"
        },
        {
            "name": "RNASEQ:FASTQC",
            "container": "community.wave.seqera.io/library/fastqc:0.12.1--0d48a4f85934912a"
        }
    ]
}

The inspect command is enable to detected the container engine and reflect the corresponding container names. For example using Sngularity, the images SIF files are uploaded to the registry and the oras:// scheme is used

{
    "processes": [
        {
            "name": "RNASEQ:INDEX",
            "container": "docker://community.wave.seqera.io/library/salmon:1.10.2--fd38303983e26319"
        },
        {
            "name": "RNASEQ:QUANT",
            "container": "docker://community.wave.seqera.io/library/salmon:1.10.2--fd38303983e26319"
        },
        {
            "name": "MULTIQC",
            "container": "docker://community.wave.seqera.io/library/multiqc:1.17--aa7f4f58d1a63e37"
        },
        {
            "name": "RNASEQ:FASTQC",
            "container": "docker://community.wave.seqera.io/library/fastqc:0.12.1--0d48a4f85934912a"
        }
    ]
}

However, there are use cases in which users prefer to use plain HTTP protocol to download the SIF image instead of the oras pseudo protocol.

The goal of this issue is to add at a new option to the inspect command, so the the HTTP URL of the SIF tarbal is retuned in place of oras URL.

Implementation

This capability can be implemented using the Wave inspect API for each container images returned by the current implementation. Once acquired the inspect payload it should be inferred the HTTP URL of the layer tarbal (note a singularity image only contained *one* tarbal that corresponds to the SIF image)

An example can be found here and here

bentsherman commented 1 month ago

cc @marcodelapierre

pditommaso commented 1 month ago

I gave a try to this in this PR https://github.com/nextflow-io/nextflow/pull/5000, however I;m not sure anymore this should be a inspect specific capability, if it should be a general Nextflow setting to force the use of http prefixed image names instead of oras ones

marcodelapierre commented 1 month ago

I had a look today.

Just as naïve reference, here is what ORAS vs HTTPS look like for a Wave managed Ubuntu 22.04 image:

oras://quay.io/marcodelapierre/wavebuild:790c664642d24006

https://quay.io/v2/marcodelapierre/wavebuild/blobs/sha256:0d8a2ea4480852e210692320069bbbfb84fbc8c75e15be3256d491d7a04def9e

One important question is, why do pipeline devs/users would need this info, and which one is then more suitable? Some thoughts of mine.

  1. For use to statically reference the image inside a pipeline, I don't see a big difference. I mean, HTTPS is more concise and also more familiar, but in the end Nextflow behaves the same way with both, and users would copy/paste the string.
  2. For use to automatically pre-fetch the images for pipeline runs, I would say it depends on how those fetched images are then used. If the fetched images are only used for pipeline runs and handled by Nextflow, no big difference. If the fetched images are also exposed to end users for their own purposes, then HTTPS seems more useful (see manual fetching below).
  3. For use to manually pre-fetch the images, HTTPS seems more useful, because it is familiar to users, who know (or can quickly google-learn) that wget is enough to download them. (only downside: sha256 is more verbose than the tag). Here I am arguing that the ORAS format is somehow more obscure to the average pipeline dev/user, and as such we are helping them out by sparing them from learning how to handle it. (Side question: is there community request for this case?)
  4. For use in manual test/dev of the pipeline: HTTPS seems more useful, for the same arguments as for manual pre-fetching above.

In practice, I would call for a simple way for devs/users to get the HTTPS images for manual fetching/testing/dev of pipelines. As such,

I opened a bit the conversation, to gauge a bit your thoughts @pditommaso @bentsherman @ewels

pditommaso commented 1 month ago

I believe the support for plain http can be useful to keep the compatibility with current behavior in which the SIF images is downloaded and stored into a shared cached.

Worth mentioning the ORAS is a merely hosting the same file in a OCI-registry, however when pulling via oras:// the singularity runtime pull it directly and store in the compute node.

When using the https protocol the pull is made by nextflow and stored in the path defined by the singularity cache dir variable.

I've made a tentative (draft) implementation here, adding a new setting named useHttpOverOras = true|false.

The problem that remains to address is the inspect command. Idally when enabling the useHttpOverOras=true as a user I'd like to see the http uris of the images, however, the inspect command submits a dry-run request, therefore the SIF images is not created and therefore the HTTP URI cannot be retrieved

marcodelapierre commented 1 month ago

So, the challenging case is the one of a pipeline profile with Wave containers enabled.

I understand the key missing info in the dryrun within inspect is the sha256 digest of the SIF layer within the container, is it correct?

So, the Wave client call from the inspect command should invoke the Packer class to get the digest of the layer, and then make up the full http-compatible URI .. would this be a suitable strategy?

Note: also, we should probably make it explicit that the container URIs printed by nextflow inspect are not checked for existence. This is always the case, e.g. if a process.container directive contains a typo, this will be reflected in the inpsect output.

marcodelapierre commented 1 month ago

@pditommaso while running tests for this issue and related PR, I have noticed that a Nextflow run requesting Wave Singularity containers triggers a Docker-native, not Singularity-native image.

Does this ring a bell?

This contrasts with both Wave CLI behaviour and what we detailed in the recent Singularity blog post.

As such, I have opened an issue to keep track: https://github.com/nextflow-io/nextflow/issues/5015

pditommaso commented 1 month ago

Likely the best solution is to require the concretise option to run this

marcodelapierre commented 1 month ago

Getting the following error from the call to jsonToInspectResponse:

ERROR ~ Failed making field 'java.time.Instant#seconds' accessible; either increase its visibility or write a custom TypeAdapter for its declaring type.

Going to look for a fix to this one to proceed.

bentsherman commented 1 month ago

I'm guessing it comes from this property in the container inspect response: https://github.com/seqeralabs/libseqera/blob/master/wave-api/src/main/java/io/seqera/wave/core/spec/ConfigSpec.java#L138

Gson uses reflection to deserialize the object from JSON and for some reason it's trying to reflect on a private field. ConfigSpec looks relatively new, not sure beyond that

marcodelapierre commented 1 month ago

Thanks Ben.

I am trying to understand how this InstantAdapter class might be useful to the cause.

marcodelapierre commented 1 month ago

Also been studying how the setup of the classes ContainerInspectResponse/ContainerSpec/ConfigSpec compare with respect to other similar classes with Instant attributes, i.e. SubmitContainerTokenResponse and BuildStatusResponse.

In the former trio of classes, are there attributes/methods that should be public?
Testing this hypothesis with no definitive answers so far.

@pditommaso any hints?

pditommaso commented 1 month ago

I think it's needed a specific type adapter for handling the Instant type. This may help

marcodelapierre commented 1 month ago

Thanks Paolo. I have fixed the issue with the Instant type, latest commit.

Shall the goal be to print a warning for those non yet built images , e.g. prompting to inspect with --concretize to build the images?

marcodelapierre commented 1 month ago

Let me articulate better.

If inspect is equipped to trigger the builds and wait for them, it will imply wait times of minutes to tens of minutes (depending on how many containers the pipeline uses). I don't think this is a good user experience.

A better experience would be for inspect to flag out those images that have not been built yet, so that users are informed promptly and can choose to launch the builds, before re-inspecting to get the HTTP URI.

How should the information be conveyed?

  1. current test implementation in latest commit (not convinced): WARNING inside the container field

        {
            "name": "RNASEQ:QUANT",
            "container": "https://community.wave.seqera.io/v2/library/salmon/blobs/sha256:77ad611127a53e757a5a65bdf1c0baa17f185a8d978503ac63d62347f234dbe7"
        },
        {
            "name": "MULTIQC",
            "container": "WARNING: image not found - oras://community.wave.seqera.io/library/multiqc:1.17--790fff013c3915f7"
        },
  2. should we add a built or available boolean field?

  3. others?

@pditommaso @ewels what do you think?

ewels commented 1 month ago

I don't think that we should mess with the container string. Additional fields such as status and/or build date could be useful 👍🏻

bentsherman commented 1 month ago

You could print the warning to stderr so that the user sees it without clobbering the JSON output

marcodelapierre commented 1 month ago

Good progress, inspect -concretize now builds the images.

One unit test for WaveClient yet to be fixed. Leaves me wondering if I should re-adapt the implementation.

To have concretize working, I am using a WaveConfig dryrun option that I then use in Container request (e.g. here):

-                dryRun: ContainerInspectMode.active()
+                dryRun: config.dryRun()

However, this is set correctly only when CmdInspect:checkWaveConfig is executed.

I am wondering whether I should instead use a modified ContainerInspectMode.active class to store the dryRun information, to retain the isolation of contexts as it was before this PR.

marcodelapierre commented 1 month ago

@pditommaso thoughts on the above? 👆

marcodelapierre commented 1 month ago

I have done a naive update to WaveClientTest, see https://github.com/nextflow-io/nextflow/pull/5000/commits/947908e80424292e61ba3c054917be2b36690182.

Happy to take alternate routes.

marcodelapierre commented 1 month ago

Ok, done a better implementation, by using a new ContainerInspectMode.waveDryRun attribute, see current status of PR https://github.com/nextflow-io/nextflow/pull/5000.

I believe it is ready for a review.

Maybe we can still improve on messages/behaviour when image is not built, but otherwise the key implementation changes are there.