Closed TBradley27 closed 1 week ago
Hi @TBradley27 thank you for spotting this, indeed the model is missing in the schema. I will provide a patch in the upcoming release. Thanks for notifying this.
In the meanwhile, you can make this work by adding the missing model to the schema at line 175: https://github.com/epi2me-labs/wf-somatic-variation/blob/master/nextflow_schema.json#L175
Hi @RenzoTale88,
Many thanks for this. I tried implementing the change you suggested. I manually downloaded the schema, edited it, and then tried to reconfigure the pipeline to use this updated schema using -params-file nextflow_schema.json
However, this didn't seem to fix the problem - and I returned the same error. Perhaps I am not implementing the new schema correctly?
Silly me.
I created a fork and added the changes you suggested, and with this strategy I got the modified pipeline running
I think it would be easiest for me to add this here rather than open a separate issue:
It seems that release v0.1.7 of ClairS had a bug which meant that it was not yet fully compatible with HAC models. See: https://github.com/HKU-BAL/ClairS/commit/c464c98c61f594489e385b8bb67c32518dce59f8
As wf-somatic-variation does not run the latest version of ClairS correcting this bug, this bug also affects wf-somatic-variation, and means in it's current form the pipeline is not compatible with newer HAC models:
@TBradley27 thank you for finding this out!
Hello,
I noticed in the latest version of the pipeline that wf-somatic-variation is still using an out-of-date version of clairS.
Isn't resolving this problem simply just a case of upgrading the clairS step in the pipeline to use a more up-to-date version of clairS?
The problem is that we can't run our HAC basecalled data on this release of the pipeline (nor could we with the previous two releases)
Hi @TBradley27, the updated version included few new options for both ClairS and the underlying Clair3 calls, which in turn required more testings to ensure that the results are consistent with the official release of ClairS. It will be included in the next release of the workflow, apologies for the delays.
Hi @RenzoTale88,
Thank you for taking your time to look into this.
However, I think one or two things are being confused here:
I acknowledge that the newest release of ClairS (v0.2.0) may require some additional testing. In my mind, this is a separate problem, and is not related to this GitHub issue I have created here.
This issue is specifically related to the problem of making wf-somatic-variation compatible with version v0.1.7 of ClairS (and subsequent patch commit). When v0.1.7 was first released, there was a bug which meant that it only worked with data basecalled with SUP models. The developer(s) of ClairS recognised this problem and changed 2 lines of code (https://github.com/HKU-BAL/ClairS/commit/c464c98c61f594489e385b8bb67c32518dce59f8) in order to make their software compatible again with HAC basecalled data. This was the first commit of the ClairS project since the v0.1.7 release.
The current problem is that wf-somatic-variation is still stuck on the v0.1.7 release of ClairS - whereas the subsequent commit in the ClairS repo after v0.1.7 would fix our problem, and therefore allow us to run the snv module of wf-somatic-variation on our HAC basecalled data.
Trying to make wf-somatic-variation fully compatible with v0.2.0 of ClairS sounds like a lengthy process, so would it not be easier to fix the ClairS dependency to the ClairS commit immediately after v0.1.7 (i.e. https://github.com/HKU-BAL/ClairS/commit/c464c98c61f594489e385b8bb67c32518dce59f8)?
@TBradley27 can you try with v1.2.2? That comes with ClairS v0.2.0, and should support he HAC models without issues.
@RenzoTale88 - Thank you very much! I will take a look either today or tomorrow
Hi @RenzoTale88,
I managed to run the snv module and mod modules successfully for the update. I had to rerun the pipeline twice due to bus errors . I had to manually respecify memory resources for a number of processes to get the pipeline to work.
I won't go into detail about these problems here - as a new issue would probably be more appropriate for that
A quick note: A lot of sys admins will (reasonably) configure SLURM to terminate jobs which uses more memory than what is requested. So in SLURM executor mode, the pipeline will fail quite a few times until users manually configure the pipeline so that each process has enough memory
Otherwise, thank you very much for the new release!
Operating System
Other Linux (please specify below)
Other Linux
Rocky Linux v8.6
Workflow Version
v1.1.0-g5851e21
Workflow Execution
EPI2ME command line
EPI2ME Version
No response
CLI command run
No response
Workflow Execution - CLI Execution Profile
singularity
What happened?
The workflow rejected use of our HAC model (dna_r10.4.1_e8.2_400bps_sup@v4.2.0).
This is confusing as the following release notes indicate that HAC baseballing models are acceptable:
https://github.com/epi2me-labs/wf-somatic-variation/releases/tag/v1.1.0
https://github.com/HKU-BAL/ClairS/releases/tag/v0.1.7
Relevant log output
Application activity log entry
No response
Were you able to successfully run the latest version of the workflow with the demo data?
other (please describe below)
Other demo data information
No response