Closed: hoelzer closed this issue 1 year ago
A new container for medaka is already up, but we are currently testing how the new flow cell and new kit perform with ARTIC.
Ah nice, good to know. Is there already a branch that has the new container? We have had fresh R10.4.1 runs since the end of last week and would like to process them at the start of next week. We can also run with an old model, because I would like to see the difference and impact of the models.
@DataSpott ?
Yeah @DataSpott, any news? :) In case there is already a new container in some branch, we could also test it here w/ the R10.4.1 data that was recently produced.
Sorry, but I'm still working with the old R9 flow cells and therefore the old Medaka model. I haven't dived into the next generation yet ;)
OK, I think people are already switching to R10.4.1, also because R9 will be discontinued. What we would need to do is update medaka within the ARTIC container first (https://github.com/nanozoo/bx_artic) @replikation - maybe it's simple, possibly crashing ;)
And once we have updated medaka in that container, it should be relatively simple to test it w/ some runs and different models. What do you think?
And maybe we should also switch to a stable release of the ARTIC pipeline (https://github.com/artic-network/fieldbioinformatics/releases/tag/v1.2.3) instead of the 1.3.0-dev branch? But I remember that you had a reason to use the dev branch...
Ahh, and in v1.2.3 the conda env defines:
- medaka >=1.6.1
so this might already solve the problem.
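To make the constraint concrete, here is a minimal sketch of checking a reported medaka version against the `>=1.6.1` pin. The helper names `parse_version` and `satisfies_minimum` are made up for illustration, and the comparison is a plain numeric one, not conda's full version-matching logic.

```python
import re

def parse_version(text):
    """Turn '1.7.2' or 'medaka 1.7.2' into a tuple of ints such as (1, 7, 2)."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))

def satisfies_minimum(installed, minimum):
    """True if the installed version is at least the minimum (numeric compare only)."""
    return parse_version(installed) >= parse_version(minimum)

# Versions mentioned in this issue, checked against the v1.2.3 conda pin.
for reported in ("medaka 1.5.0", "medaka 1.6.1", "medaka 1.7.2"):
    print(reported, satisfies_minimum(reported, "1.6.1"))
```

The input strings mimic what `medaka --version` prints, so the output of that command could be passed in directly.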
Just to keep you posted: I generated a new container for the ARTIC pipeline using v1.2.3 of their pipeline, which installs medaka v1.6.1. Installing an even newer medaka version (1.7.2) does not work because it demands tensorflow 2.7.x, a dependency conflict I was not able to resolve.
But: medaka 1.6.1 at least has the new r1041 models.
docker pull nanozoo/artic:v1.2.3--5d4390f
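To see which R10.4.1 models a given container actually ships, one option is to filter its model list by prefix. This is only a sketch: it assumes the model names have been captured as text (for example from `medaka tools list_models`, whose exact output layout is not shown in this thread) and that R10.4.1 models start with `r1041_`, like `r1041_e82_400bps_sup_g615`. The function name and the sample string are illustrative.

```python
import re

def models_for_flowcell(model_listing, prefix="r1041"):
    """Return the sorted, unique model names in the text that start with '<prefix>_'."""
    # Model names consist of letters, digits and underscores; everything else separates them.
    names = re.findall(r"[A-Za-z0-9_]+", model_listing)
    return sorted({name for name in names if name.startswith(prefix + "_")})

# Illustrative listing, not real container output.
sample = "Available: r941_min_high_g360, r1041_e82_400bps_sup_g615"
print(models_for_flowcell(sample))
```

Running the same filter against the old and the new container would show directly whether the rebuild added the models needed for the new flow cells.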
Next I would do some further testing: add the new container to a branch in poreCov and see if it runs through... because there must have been a reason you used the 1.3.0-dev branch of the ARTIC pipeline... but maybe that's now solved in their release v1.2.3.
Maybe it's also possible to have a v1.3.0-dev container w/ medaka 1.6.1 - I will test that as well.
Okay, I can run the pipeline w/ nanozoo/artic:v1.2.3--5d4390f but then the medaka step fails bc/ v1.2.3 does not have the --min-depth parameter we use here: https://github.com/replikation/poreCov/blob/master/workflows/process/artic.nf#L21
The 1.3.0-dev branch of ARTIC has that.
Now I will try: still using 1.3.0-dev but installing the environment from v1.2.3 w/ medaka v1.6.1. This failed in the RUN cd fieldbioinformatics && python setup.py install step.
Christian also hinted at a missing /opt/conda/bin in the container's PATH. I will also try using as a template the container I once successfully built w/ ARTIC v1.3.0-dev and medaka 1.7.2.
Template is nanozoo/artic:1.3.0-dev--9bca1ff
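The PATH hint can be checked mechanically. A minimal sketch, assuming the container's PATH has been captured as a string (for instance with `docker run --rm IMAGE printenv PATH`, provided the image has `printenv`); the helper name and the example PATH values are made up.

```python
def path_contains(path_value, directory):
    """True if directory is one of the colon-separated entries of a PATH string."""
    # Ignore empty entries and trailing slashes so '/opt/conda/bin/' also counts.
    entries = [entry.rstrip("/") for entry in path_value.split(":") if entry]
    return directory.rstrip("/") in entries

# Hypothetical PATH values: one without and one with the conda bin directory.
print(path_contains("/usr/local/bin:/usr/bin:/bin", "/opt/conda/bin"))
print(path_contains("/opt/conda/bin:/usr/local/bin:/usr/bin", "/opt/conda/bin"))
```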
Alright, this was a pain in the butt. But now I have
docker pull nanozoo/artic:1.3.0-dev--a15e2ee
which has ARTIC pipeline 1.3.0-dev (bc/ we use the --min-depth param...) and medaka 1.7.2 w/ all currently available models.
I think it looks fine, will try the whole pipeline tomorrow.
❯ docker run --rm nanozoo/artic:1.3.0-dev--a15e2ee artic minion --help | grep min_depth
[--max-haplotypes max_haplotypes] [--min-depth min_depth]
--min-depth min_depth
❯ docker run --rm nanozoo/artic:1.3.0-dev--a15e2ee medaka --version
medaka 1.7.2
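One caveat on the check above: grepping for min_depth also matches the metavar in the help text, so it would succeed no matter how the option itself is spelled. A stricter check looks for the exact option string. This is a sketch only; `has_option` is a made-up helper and the excerpt is the help output quoted above.

```python
import re

def has_option(help_text, option):
    """True if the exact option string (e.g. '--min-depth') occurs in the help text."""
    # The lookarounds stop '--max' from matching inside '--max-haplotypes'.
    pattern = r"(?<![\w-])" + re.escape(option) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None

help_excerpt = """[--max-haplotypes max_haplotypes] [--min-depth min_depth]
--min-depth min_depth"""

print(has_option(help_excerpt, "--min-depth"))  # hyphenated spelling
print(has_option(help_excerpt, "--min_depth"))  # underscore spelling
```

According to the quoted help output, the hyphenated `--min-depth` is the spelling that `artic minion` accepts in this container.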
Currently we use the container nanozoo-artic-1.3.0-dev--2c5b6a9, which has Medaka v1.5.0 installed. Unfortunately, the new models for the R10.4.1 flow cells that people are starting to use are not part of that, e.g. r1041_e82_400bps_sup_g615.
Can we easily update medaka as part of the ARTIC workflow? Maybe it's also enough to download the most recent models by re-building the container (I wrote a script for that, which is already part of the containers repository). However, the last time I tried to update Medaka as part of the ARTIC workflow, I failed. Maybe you can find a way to update the container, @replikation?