nanoporetech / medaka

Sequence correction provided by ONT Research
https://nanoporetech.com

model for Guppy 6.3.8 #407

Closed Lobna-H closed 1 year ago

Lobna-H commented 1 year ago

Could you please advise which medaka model should be used to polish a long-read genome sequenced on a FLO-MIN114 flow cell and basecalled with Guppy 6.3.8 in super-accurate mode? Is it r103_min_high_g360?

I ran the command medaka tools list_models

r103_min_high_g345, r103_min_high_g360, r103_prom_high_g360, r103_prom_snp_g3210, r103_prom_variant_g3210, r10_min_high_g303, r10_min_high_g340, r941_min_fast_g303, r941_min_high_g303, r941_min_high_g330, r941_min_high_g340_rle, r941_min_high_g344, r941_min_high_g351, r941_min_high_g360, r941_prom_fast_g303, r941_prom_high_g303, r941_prom_high_g330, r941_prom_high_g344, r941_prom_high_g360, r941_prom_snp_g303, r941_prom_snp_g322, r941_prom_snp_g360, r941_prom_variant_g303, r941_prom_variant_g322, r941_prom_variant_g360
Default consensus: r941_min_high_g360
Default snp: r941_prom_snp_g360
Default variant: r941_prom_variant_g360
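A listing like the one above can be scanned programmatically rather than by eye. The sketch below is only an illustration: it parses an abridged copy of the text quoted in this comment, and it assumes (from the names themselves, not from medaka documentation) that the first underscore-separated field identifies the pore type. The helper names `split_listing` and `models_for_pore` are hypothetical and are not part of medaka.

```python
# Abridged copy of the `medaka tools list_models` text quoted in this thread.
LISTING = (
    "r103_min_high_g345, r103_min_high_g360, r103_prom_high_g360, "
    "r10_min_high_g303, r941_min_high_g360 "
    "Default consensus: r941_min_high_g360"
)


def split_listing(text):
    """Return (model_names, default_consensus) from the flattened listing."""
    names_part, _, default_part = text.partition("Default consensus:")
    names = [n.strip() for n in names_part.split(",") if n.strip()]
    fields = default_part.split()
    default = fields[0] if fields else None
    return names, default


def models_for_pore(names, pore):
    """Keep models whose first underscore-separated field equals `pore`.

    Assumption: that first field encodes the pore type (e.g. r103, r941).
    """
    return [n for n in names if n.split("_")[0] == pore]


if __name__ == "__main__":
    names, default = split_listing(LISTING)
    print("default consensus model:", default)
    print("r103 models:", models_for_pore(names, "r103"))
```

Note that no name in this listing starts with r1041, which is consistent with the installed medaka predating the chemistry in question.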

cjw85 commented 1 year ago

The current release of medaka (v1.7.2) includes the model r1041_e82_400bps_sup_g615.

It looks to me like you are using an older version of medaka.
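One way to confirm this is to compare the installed version against the release mentioned above. The following is a minimal sketch using only the Python standard library; it assumes medaka was installed as a Python distribution named "medaka" and that its version string is plain dotted integers (pre-release suffixes are not handled). The helper names are hypothetical.

```python
from importlib import metadata


def version_tuple(version):
    """Turn a dotted version such as '1.7.2' into (1, 7, 2)."""
    return tuple(int(part) for part in version.split(".")[:3])


def installed_version(package="medaka"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_older(installed, required):
    """True when `installed` sorts before `required`."""
    return version_tuple(installed) < version_tuple(required)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("medaka is not installed in this environment")
    elif is_older(found, "1.7.2"):
        print(f"medaka {found} is older than 1.7.2")
    else:
        print(f"medaka {found} is at least 1.7.2")
```

Running `medaka --version` on the command line gives the same information without any Python.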

Lobna-H commented 1 year ago

Hi again, and thanks. I tried to update medaka using conda on a Mac and found that the latest version available is v1.0.3. I then tried to install it using pip, without success. Could you please help with the command I should use to install the latest version?

pip install medaka

Collecting medaka
  Using cached medaka-1.7.2.tar.gz (40.8 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [4 lines of output]
      Cannot import parasail, some features may not be available.
      Cannot import spoa, some features may not be available.
      Bundling models: ['r103_fast_g507', 'r103_fast_snp_g507', 'r103_fast_variant_g507', 'r103_hac_g507', 'r103_hac_snp_g507', 'r103_hac_variant_g507', 'r103_min_high_g345', 'r103_min_high_g360', 'r103_prom_high_g360', 'r103_prom_snp_g3210', 'r103_prom_variant_g3210', 'r103_sup_g507', 'r103_sup_snp_g507', 'r103_sup_variant_g507', 'r1041_e82_260bps_fast_g632', 'r1041_e82_260bps_fast_variant_g632', 'r1041_e82_260bps_hac_g632', 'r1041_e82_260bps_hac_variant_g632', 'r1041_e82_260bps_sup_g632', 'r1041_e82_260bps_sup_variant_g632', 'r1041_e82_400bps_fast_g615', 'r1041_e82_400bps_fast_g632', 'r1041_e82_400bps_fast_variant_g615', 'r1041_e82_400bps_fast_variant_g632', 'r1041_e82_400bps_hac_g615', 'r1041_e82_400bps_hac_g632', 'r1041_e82_400bps_hac_variant_g615', 'r1041_e82_400bps_hac_variant_g632', 'r1041_e82_400bps_sup_g615', 'r1041_e82_400bps_sup_variant_g615', 'r104_e81_fast_g5015', 'r104_e81_fast_variant_g5015', 'r104_e81_hac_g5015', 'r104_e81_hac_variant_g5015', 'r104_e81_sup_g5015', 'r104_e81_sup_g610', 'r104_e81_sup_variant_g610', 'r10_min_high_g303', 'r10_min_high_g340', 'r941_e81_fast_g514', 'r941_e81_fast_variant_g514', 'r941_e81_hac_g514', 'r941_e81_hac_variant_g514', 'r941_e81_sup_g514', 'r941_e81_sup_variant_g514', 'r941_min_fast_g303', 'r941_min_fast_g507', 'r941_min_fast_snp_g507', 'r941_min_fast_variant_g507', 'r941_min_hac_g507', 'r941_min_hac_snp_g507', 'r941_min_hac_variant_g507', 'r941_min_high_g303', 'r941_min_high_g330', 'r941_min_high_g340_rle', 'r941_min_high_g344', 'r941_min_high_g351', 'r941_min_high_g360', 'r941_min_sup_g507', 'r941_min_sup_snp_g507', 'r941_min_sup_variant_g507', 'r941_prom_fast_g303', 'r941_prom_fast_g507', 'r941_prom_fast_snp_g507', 'r941_prom_fast_variant_g507', 'r941_prom_hac_g507', 'r941_prom_hac_snp_g507', 'r941_prom_hac_variant_g507', 'r941_prom_high_g303', 'r941_prom_high_g330', 'r941_prom_high_g344', 'r941_prom_high_g360', 'r941_prom_high_g4011', 'r941_prom_snp_g303', 'r941_prom_snp_g322', 'r941_prom_snp_g360', 'r941_prom_sup_g507', 'r941_prom_sup_snp_g507', 'r941_prom_sup_variant_g507', 'r941_prom_variant_g303', 'r941_prom_variant_g322', 'r941_prom_variant_g360', 'r941_sup_plant_g610', 'r941_sup_plant_variant_g610']
      error in medaka setup command: 'python_requires' must be a string containing valid version specifiers; Invalid specifier: '>3.5.*'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
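The key line in the log above is the rejected specifier '>3.5.*'. Under PEP 440, a trailing `.*` wildcard is only permitted with the `==` and `!=` operators, so pairing it with `>` is invalid, and tooling that enforces the rule strictly refuses to build the package. The sketch below illustrates just that one rule with a deliberately simplified, standard-library check; it is not a full PEP 440 parser, and the function name `wildcard_use_is_valid` is hypothetical.

```python
import re

# Operator followed by a version; "===" is listed first so it is not
# mistaken for "==" plus a stray "=".
SPEC = re.compile(r"^\s*(===|~=|==|!=|<=|>=|<|>)\s*(\S+)\s*$")


def wildcard_use_is_valid(specifier):
    """Check only the PEP 440 rule that '.*' requires '==' or '!='.

    Returns False for strings that do not look like 'OP VERSION' at all.
    Specifiers without a wildcard pass this particular check.
    """
    match = SPEC.match(specifier)
    if match is None:
        return False
    operator, version = match.groups()
    if version.endswith(".*"):
        return operator in ("==", "!=")
    return True


if __name__ == "__main__":
    for spec in (">3.5.*", ">=3.6", "==3.5.*"):
        print(spec, "->", wildcard_use_is_valid(spec))
```

The string from the log, '>3.5.*', fails this check, while a plain '>=3.6' passes. Which replacement specifier the medaka maintainers actually chose is not stated in this thread.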

cjw85 commented 1 year ago

Hi @Lobna-H,

The issue you are seeing is the same as this one: https://github.com/nanoporetech/medaka/issues/412

Medaka is in need of a bit of love. We will look to make a new release in the coming weeks.

cjw85 commented 1 year ago

This should now be fixed in v1.7.3.

aistBMRG commented 1 year ago

Small question.

I am basecalling using guppy/6.4.6 with dna_r10.4.1_e8.2_400bps_sup.cfg.

Would it be possible to clarify the appropriate medaka model for polishing Flye bacterial genome assemblies?

It seems that medaka/1.7.2 has r1041_e82_400bps_sup_g615_model.tar.gz, while medaka/1.7.3 has r1041_e82_400bps_sup_v4.0.0_model.tar.gz.

Thanks for the input.

Regards,

Dieter

cjw85 commented 1 year ago

This is the correct model. See the README.
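For readers matching a basecaller config to a model name, the names quoted in this thread suggest a simple correspondence: drop the leading "dna" and the ".cfg" suffix from the config name, remove the dots, and look for models that start with the result. The sketch below encodes that reading. It is an assumption inferred from the names in this thread, not a rule taken from the medaka README, and both helper names are hypothetical.

```python
def model_prefix_from_config(config_name):
    """Derive a candidate model prefix from a Guppy config file name.

    Assumption: 'dna_r10.4.1_e8.2_400bps_sup.cfg' corresponds to models
    whose names start with 'r1041_e82_400bps_sup'.
    """
    stem = config_name[:-4] if config_name.endswith(".cfg") else config_name
    fields = stem.split("_")
    if fields and fields[0] == "dna":
        fields = fields[1:]
    return "_".join(field.replace(".", "") for field in fields)


def matching_models(config_name, models):
    """Return the models that start with the derived prefix."""
    prefix = model_prefix_from_config(config_name) + "_"
    return [m for m in models if m.startswith(prefix)]


if __name__ == "__main__":
    # Model names mentioned in this thread.
    known = [
        "r1041_e82_400bps_sup_g615",
        "r1041_e82_400bps_sup_v4.0.0",
        "r1041_e82_400bps_hac_g615",
        "r941_min_high_g360",
    ]
    print(matching_models("dna_r10.4.1_e8.2_400bps_sup.cfg", known))
```

The prefix match narrows the choice to the sup models for this chemistry but does not pick between model versions; for that, the README referred to above remains the authority.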