nextflow-io / nf-schema

Functionality for working with pipeline and sample sheet schema files in Nextflow pipelines
https://nextflow-io.github.io/nf-schema/
Apache License 2.0

validateParameters failed on schema_input.json when running nf-core pipelines in offline-mode #18

Open asp8200 opened 1 year ago

asp8200 commented 1 year ago

At DNGC, we encountered the following problem when trying to run nf-core pipelines from nf-tower on an HPC with no internet:

Sep-26 12:44:13.682 [main] ERROR nextflow.cli.Launcher - @unknown
org.everit.json.schema.SchemaException: #: could not determine version
        at org.everit.json.schema.loader.SchemaLoader.<init>(SchemaLoader.java:293)
        at org.everit.json.schema.loader.SchemaLoader$SchemaLoaderBuilder.build(SchemaLoader.java:143)
        at org.everit.json.schema.loader.SchemaLoader.load(SchemaLoader.java:262)
        at org.everit.json.schema.loader.SchemaLoader.load(SchemaLoader.java:246)
        at org.everit.json.schema.loader.SchemaLoader$load.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
        at NfcoreSchema.validateParameters(NfcoreSchema.groovy:146)
        at NfcoreSchema.validateParameters(NfcoreSchema.groovy)
        at NfcoreSchema$validateParameters.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:157)
        at WorkflowMain.initialise(WorkflowMain.groovy:57)
        at WorkflowMain$initialise.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:157)
        at Script_943fdc95.runScript(Script_943fdc95:30)
        at nextflow.script.BaseScript.runDsl2(BaseScript.groovy:170)
        at nextflow.script.BaseScript.run(BaseScript.groovy:217)
        at nextflow.script.ScriptParser.runScript(ScriptParser.groovy:230)
        at nextflow.script.ScriptRunner.run(ScriptRunner.groovy:225)
        at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:131)
        at nextflow.cli.CmdRun.run(CmdRun.groovy:354)
        at nextflow.cli.Launcher.run(Launcher.groovy:487)
        at nextflow.cli.Launcher.main(Launcher.groovy:646)

(TO-DO: Get nf version number from Martin.)

When disabling validateParameters() (for instance, by setting params.validate_params=false) there is no problem.

This issue might be related to https://github.com/nextflow-io/nf-schema/issues/19.

@maxulysse was informed about this issue.

This issue was uncovered during our work on https://support.seqera.io/support/tickets/3811

nvnieuwk commented 1 year ago

Hi, can you show me what the schema looks like? I may have a vague idea but need to confirm it first. (Also I'm pretty swamped ATM so I probably won't solve it before the hackathon, sorry for that)

asp8200 commented 1 year ago

I understand, Nicolas. No worries. Solving this issue doesn't seem super urgent or critical to me. Martin showed me the problem for both sarek and hlatyping. (I suspect that it is a problem with the nf-core pipelines that use nf-validation.) I believe Martin was using unaltered versions of assets/schema_input.json.

In addition, I believe that Martin was able to reproduce the err msg on our dev-HPC (with internet) by setting "$schema": "" in assets/schema_input.json. I'll try to get Martin to supply further details on this issue.

martinfthomsen commented 1 year ago

Hi,

Here is the history of our setup and tests:

  1. We have the pipelines running on our development environment with internet. The pipelines run fine using standard command line calls on the compute environment. And since there is internet, the pipelines also run fine from our NF-Tower server.
  2. We have the pipelines running on our production HPC system where there is no internet connection. The pipelines run fine using standard command line calls on the compute environment.
  3. The pipeline source code is identical on our dev env and prod env.
  4. The "could not determine version" error occurs when trying to run the pipeline from our Tower server on the production HPC environment.
  5. I have been able to reproduce the issue by modifying the second line in nextflow_schema.json ("$schema": "http://json-schema.org/draft-07/schema",) to point to something else.
  6. With that change, the error is reproduced both through Tower on the dev env and through command line calls on the HPC.
  7. It does not matter what the "$schema" value is changed to. I have tried an empty string, pointing to a local copy of the draft-07/schema file on the compute environment, and pointing to a copy of the draft-07/schema file hosted on an internal http service. None of these attempts worked.
  8. The log message Anders has included above is from running the pipeline from the command line on the HPC env, where the "$schema" value was changed.
  9. Setting the param validate_params to false fixes the issue, and the pipeline runs fine from Tower and in all environments.
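
Points 5 to 7 suggest the error depends on the literal "$schema" string rather than on whether the URL can be fetched. The following is a minimal sketch of a pre-flight check that flags schema files whose "$schema" value is not a known draft URI. The set of accepted URIs, the normalisation rules, and all function names are assumptions made for illustration. They are not taken from the validation library, and the check is deliberately stricter than the library may be (it also flags a missing "$schema" key).

```python
import json
import sys
from pathlib import Path

# Assumption: illustrative set of draft-04/06/07 meta-schema URIs.
# The validator's real accepted set may differ.
KNOWN_META_SCHEMAS = {
    "http://json-schema.org/draft-04/schema",
    "http://json-schema.org/draft-06/schema",
    "http://json-schema.org/draft-07/schema",
}


def normalise(uri):
    """Drop a trailing '#' and compare https URIs as if they were http."""
    uri = uri.strip().rstrip("#")
    if uri.startswith("https://"):
        uri = "http://" + uri[len("https://"):]
    return uri


def check_schema_text(text):
    """Return (ok, message) for a JSON schema document given as a string."""
    doc = json.loads(text)
    value = doc.get("$schema") if isinstance(doc, dict) else None
    if not isinstance(value, str):
        # Stricter than a validator needs to be: a missing key is flagged too.
        return False, "no string-valued $schema key"
    if normalise(value) not in KNOWN_META_SCHEMAS:
        return False, f"unrecognised $schema value: {value!r}"
    return True, f"recognised $schema value: {value!r}"


def check_schema_file(path):
    """Read a schema file from disk and run the same check on its contents."""
    return check_schema_text(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Usage: python check_schema.py nextflow_schema.json assets/schema_input.json
    for p in sys.argv[1:]:
        ok, msg = check_schema_file(p)
        print(f"{'OK  ' if ok else 'FAIL'} {p}: {msg}")
```

Running this against nextflow_schema.json and assets/schema_input.json on both environments would show whether the two copies really carry the same "$schema" value before validation is switched back on.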

As we now have a workaround that makes it possible to run the pipelines from Tower in our production environment, this is not a critical issue. But we assume that validating the parameters is useful to the users, so it would be nice if the issue could be fixed at some point, so that we may reintroduce the params validation feature for our users. 🙂

nvnieuwk commented 1 year ago

Hi @martinfthomsen thanks for this full description, you confirmed my suspicions. The problem lies with the library we use for the JSON schema. I'll see if I can find a fix from their side.

nvnieuwk commented 9 months ago

Hi, we moved the JSON schema parsing to another library in the last update. Can you retry with version 2.0.0? (Mind that this version contains breaking changes that require some updates to the JSON schema! See here.)

nvnieuwk commented 7 months ago

Hi can you try this again using nf-schema? :)

nvnieuwk commented 1 month ago

Any news on this? :)

asp8200 commented 1 month ago

Hi @nvnieuwk 👋 I'm afraid not. I don't use nf-Tower, and @martinfthomsen is on leave.

nvnieuwk commented 1 month ago

Hi, no problem I'll leave this open for a bit longer then :)