AzlanNI closed this issue 2 years ago
I've not encountered this error on installation. Is this the entire error message? If not would you mind sending the complete error message? Could you also share your OS and python versions and the full command submitted?
To venture a guess, though, this does appear to be a system compiler issue. icc is the Intel compiler, which Python is identifying as the C compiler configured for use on this system. Remora requires portions of its code to be compiled and thus requires a valid compiler. You may have some luck setting the compiler at Remora install time (e.g. CC=/path/to/gcc pip install ont-remora).
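As a rough sketch of the suggestion above, the following Python snippet reports which C compiler the interpreter was built with and prepares an environment in which CC points at gcc. The helper name build_env_with_compiler is hypothetical, and whether overriding CC is enough to build ont-remora on a given cluster is an assumption, not something confirmed in this thread.

```python
import os
import shutil
import sysconfig


def build_env_with_compiler(compiler_path, base_env=None):
    """Return a copy of the environment with CC set to compiler_path.

    Hypothetical helper for illustration; it does not run pip itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = compiler_path
    return env


if __name__ == "__main__":
    # Compiler recorded when this interpreter was built (may be None on some platforms).
    print("Python build-time CC:", sysconfig.get_config_var("CC"))

    gcc = shutil.which("gcc")  # None if gcc is not on PATH
    print("gcc on PATH:", gcc)

    if gcc is not None:
        env = build_env_with_compiler(gcc)
        # To attempt the install with this environment, one could run:
        #   subprocess.run([sys.executable, "-m", "pip", "install", "ont-remora"],
        #                  env=env, check=True)
        print("Would install with CC =", env["CC"])
```

If gcc is only available through an environment module on the cluster, that module would need to be loaded first so that it appears on PATH.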
Hello @marcus1487
The OS on our HPC cluster is Linux and I am using Python 3.8.3.
Since we don't have access to the internet from the HPC, I've used a PyPI mirror to download packages with pip, using the following command:
PIP_CONFIG_FILE=/software/python/pip.conf pip install --user ont-bonito
So one solution could be to load a newer compiler such as intel/xe2020.4 and try the pip install command again?
Thanks for your help!
Hi @AzlanNI
Is there a particular reason you want to use bonito for basecalling? You may wish to look at the production Guppy basecaller which implements a near identical algorithm to that used in bonito (a slightly earlier version of the remora algorithm).
I am using these basecallers for the detection of modified DNA bases in a CpG context in cfDNA. I just saw a presentation from ONT in which they showed that the Remora models are better at detecting modified bases, since they don't sacrifice basecalling accuracy for canonical bases. This is the main reason I wanted to use the Megalodon or Bonito basecaller with the Remora models.
Can you provide details of that presentation? It sounds like it needs updating. It is no longer the case with Guppy that asking it to perform modified base calling of CpG will lead to lower canonical base accuracy: Guppy has used the Remora algorithm since v6.1.1, https://community.nanoporetech.com/downloads/guppy/release_notes.
I watched the London Calling 2022: Update from Oxford Nanopore Technologies, from which I understood that using the Remora models increases the accuracy of modified base calls. But maybe I misunderstood that Remora would be the best option for modified basecalling. If both are equal in strength and accuracy, then Guppy would be the better choice, since we are currently using Guppy 5.0.7 on the HPC. That version is rather outdated, though, so maybe we should update to the newest version.
Did I mix things up with Guppy and the Remora models? I ask because my Bonito basecaller is sadly still not working on the HPC cluster.
I also tried the Megalodon basecaller, but there I always get the error: ERROR: Guppy version string does not match expected pattern: "b'Intel MKL FATAL ERROR: Cannot load /software/guppy/5.0.7/cpu/bin/guppy_basecall_server.\n'"
I think this could also be because I am trying to use the newest version of Megalodon (2.5) with Guppy version 5.0.7.
The Remora algorithms are now the backend for all modified base calling across the different basecaller implementations (megalodon/bonito/guppy). Megalodon and Bonito directly use the implementations from the Remora Python package, but these may be less stable as they are research demonstrators. The implementation in Guppy is the recommendation, but newer features may lag behind the research basecallers. The next version of Guppy will add support for version 1 Remora models (higher accuracy with a signal re-scaling stage).
Note that Guppy > 6.1 is required for running Remora models within Guppy.
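The version requirement can be checked with a few lines of Python. This is only a sketch: the threshold (6, 1, 1) is an assumption taken from the statement earlier in this thread that Guppy has used the Remora algorithm since v6.1.1, and parse_version and supports_remora are hypothetical helper names, not part of Guppy or Megalodon.

```python
import re

# Assumption: threshold taken from the thread ("Guppy has used the Remora
# algorithm since v6.1.1"); adjust if the release notes say otherwise.
MIN_REMORA_GUPPY = (6, 1, 1)


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_remora(version_text):
    """Return True if the given Guppy version meets the assumed minimum."""
    return parse_version(version_text) >= MIN_REMORA_GUPPY


for version in ("5.0.7", "6.1.7"):
    print(version, "->", supports_remora(version))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings directly, where "6.10.0" would sort before "6.2.0".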
Alright, I got it! If Guppy is the recommendation for modified basecalling, then maybe we should just update the Guppy version on the HPC. As already said, the premise of using Bonito and Megalodon was to use the Remora models, since we thought that Guppy was sacrificing canonical base accuracy. But can you currently use Remora models in Guppy version > 6.1?
Thanks a lot for the information and help!
But can you currently use Remora models in Guppy version > 6.1?
Correct. If you update Guppy and then run with the configuration dna_r9.4.1_450bps_modbases_5mc_cg_sup_prom.cfg (or similar) and use the --bam_out and --align_ref options, Guppy will output BAM data of aligned reads annotated with tags for modified bases as defined in the SAM specification.
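For illustration, the command described above could be assembled in Python as follows. This is a sketch only: the input, output, and reference paths are hypothetical placeholders, guppy_modbase_command is a made-up helper name, and the snippet prints the command rather than running it.

```python
import shlex


def guppy_modbase_command(input_dir, save_dir, config, reference):
    """Build the argument list for a modified-base Guppy run with aligned BAM output."""
    return [
        "guppy_basecaller",
        "--input_path", input_dir,   # directory containing the raw reads
        "--save_path", save_dir,     # directory for the basecalling output
        "--config", config,          # a modbases configuration file
        "--bam_out",                 # write BAM output
        "--align_ref", reference,    # reference to align the reads against
    ]


# Hypothetical paths, for illustration only.
cmd = guppy_modbase_command(
    "fast5/",
    "calls/",
    "dna_r9.4.1_450bps_modbases_5mc_cg_sup_prom.cfg",
    "reference.fasta",
)
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to subprocess.run later without worrying about shell quoting.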
Alright, I will try using the Remora models in Guppy ASAP. Is there a command to see the Remora models that are accessible to Guppy 6.1.7?
We now have Guppy 6.1.7 installed on the HPC and I wanted to test some Remora models for modified base detection. Is there a listing of the custom tags for the models, or a list in which I could see which model would be the best match? Using guppy_basecaller --print_workflows, I don't see any modbases models.
I found the reference to dna_r9.4.1_450bps_modbases_5mc_cg_sup_prom.cfg simply by digging around in the data directory of the Guppy installation. I'm not sure how one is supposed to do this, but here is a listing of all the configuration files:
dna_r10.4_e8.1_modbases_5hmc_5mc_cg_fast.cfg
dna_r10.4_e8.1_modbases_5hmc_5mc_cg_fast_prom.cfg
dna_r10.4_e8.1_modbases_5hmc_5mc_cg_hac.cfg
dna_r10.4_e8.1_modbases_5hmc_5mc_cg_hac_prom.cfg
dna_r10.4_e8.1_modbases_5hmc_5mc_cg_sup.cfg
dna_r10.4_e8.1_modbases_5mc_cg_fast.cfg
dna_r10.4_e8.1_modbases_5mc_cg_fast_prom.cfg
dna_r10.4_e8.1_modbases_5mc_cg_hac.cfg
dna_r10.4_e8.1_modbases_5mc_cg_hac_prom.cfg
dna_r10.4_e8.1_modbases_5mc_cg_sup.cfg
dna_r9.4.1_450bps_modbases_5hmc_5mc_cg_fast.cfg
dna_r9.4.1_450bps_modbases_5hmc_5mc_cg_fast_prom.cfg
dna_r9.4.1_450bps_modbases_5hmc_5mc_cg_hac.cfg
dna_r9.4.1_450bps_modbases_5hmc_5mc_cg_hac_prom.cfg
dna_r9.4.1_450bps_modbases_5hmc_5mc_cg_sup.cfg
dna_r9.4.1_450bps_modbases_5hmc_5mc_cg_sup_prom.cfg
dna_r9.4.1_450bps_modbases_5mc_cg_fast.cfg
dna_r9.4.1_450bps_modbases_5mc_cg_fast_prom.cfg
dna_r9.4.1_450bps_modbases_5mc_cg_hac.cfg
dna_r9.4.1_450bps_modbases_5mc_cg_hac_prom.cfg
dna_r9.4.1_450bps_modbases_5mc_cg_sup.cfg
dna_r9.4.1_450bps_modbases_5mc_cg_sup_prom.cfg
dna_r9.4.1_e8.1_modbases_5mc_cg_fast.cfg
dna_r9.4.1_e8.1_modbases_5mc_cg_fast_prom.cfg
dna_r9.4.1_e8.1_modbases_5mc_cg_hac.cfg
dna_r9.4.1_e8.1_modbases_5mc_cg_hac_prom.cfg
dna_r9.4.1_e8.1_modbases_5mc_cg_sup.cfg
The only ones likely of interest to you are the dna_r9.4.1... ones. The others are not widely released chemistries.
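A listing like the one above can be reproduced with a short Python sketch that searches a directory for modified-base configuration files. The location of the data directory depends on the installation, so it is passed in as a parameter; list_modbase_configs is a hypothetical helper name and the path in the usage line is a placeholder.

```python
from pathlib import Path


def list_modbase_configs(data_dir):
    """Return the sorted names of all *modbases*.cfg files directly inside data_dir."""
    return sorted(path.name for path in Path(data_dir).glob("*modbases*.cfg"))


if __name__ == "__main__":
    # Placeholder path: point this at the data directory of your Guppy installation.
    for name in list_modbase_configs("path/to/guppy/data"):
        print(name)
```

If the directory does not exist or contains no matching files, the function simply returns an empty list.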
Great, thanks! I was just trying to find a way to list them. Can you tell me if there is documentation that explains what the custom tags mean? For example, hac means high accuracy, so what do prom and sup mean? Thanks for your help!
fast: fast basecaller
hac: high accuracy basecaller
sup: super accuracy basecaller
prom: PromethION
(lack of) prom: MinION/GridION
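The tag meanings above can be applied to the configuration file names with a small parser. This is a sketch that assumes the naming pattern seen in the listing earlier in this thread (dna_<chemistry>_modbases_<modifications>_cg_<model>[_prom].cfg); it is not an official naming specification, and describe_config is a hypothetical helper name.

```python
import re

# Assumed pattern, inferred from the file names listed in this thread.
CONFIG_PATTERN = re.compile(
    r"^dna_(?P<chemistry>.+)_modbases_(?P<mods>.+)_cg_"
    r"(?P<model>fast|hac|sup)(?P<prom>_prom)?\.cfg$"
)

MODEL_NAMES = {
    "fast": "fast",
    "hac": "high accuracy",
    "sup": "super accuracy",
}


def describe_config(name):
    """Split a modbases configuration file name into its labeled parts."""
    match = CONFIG_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized configuration name: {name!r}")
    return {
        "chemistry": match.group("chemistry"),
        "modifications": match.group("mods").split("_"),
        "model": MODEL_NAMES[match.group("model")],
        # Per the tag meanings: a _prom suffix means PromethION, none means MinION/GridION.
        "device": "PromethION" if match.group("prom") else "MinION/GridION",
    }


print(describe_config("dna_r9.4.1_450bps_modbases_5mc_cg_sup_prom.cfg"))
```

Names that do not follow the assumed pattern raise a ValueError rather than being silently misread.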
The Guppy user guide can be found in the Nanopore community: https://community.nanoporetech.com/docs/prepare/library_prep_protocols/Guppy-protocol/v/gpb_2003_v1_revae_14dec2018
Hello everyone,
I am currently trying to get Remora and the Bonito basecaller running on our HPC. I am using the pip install command, but I always get this error:
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for ont-remora
Failed to build ont-remora
ERROR: Could not build wheels for ont-remora, which is required to install pyproject.toml-based projects
Maybe this is a known issue, or someone can help me out. I am currently using a PyPI mirror since the HPC has no internet connection.
I would appreciate any help!
Kind regards,
Azlan