Open hd2326 opened 2 years ago
Hello @hd2326, you can give bonito your custom trained remora onnx model with bonito basecaller --modified-base-model custom.onnx (it doesn't need converting).
@iiSeymour Thank you very much for the quick reply!
So mod-calling will be a two-step process: 1) using the bonito tar+toml model for basecalling, and then 2) using the remora onnx model for mod-calling on those basecalls, right?
You need both models (a bonito basecalling model [tar+toml] and a remora modbase [onnx] model) but it's one command -
bonito basecaller dna_r10.4.1_e8.2_hac@v3.5.1 /data/reads --modified-base-model custom.onnx > calls.bam
Got it! Thank you so much for the explanation!
Greetings!
As I am running bonito as @iiSeymour suggested:
bonito basecaller $bonito_models/dna_r9.4.1_e8_hac@v3.3/ $rawdata --modified-base-model $remora_model/model_best.onnx --modified-bases $mod --reference $genome
I got the following error:
remora.RemoraError: No trained Remora models for /bonito_models/dna_r9.4.1_e8. Options: dna_r9.4.1_e8, dna_r9.4.1_e8.1, dna_r10.4_e8.1
It seems that the remora model I provided cannot be recognized. Any insights on the issue? Thank you very much!
The --modified-bases argument triggers bonito to look up the corresponding remora model. But it appears that you have also specified the path to a remora model with the --modified-base-model argument, which already specifies the modified bases to call. Removing the --modified-bases argument from the call should work.
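The flag interaction described above can be sketched as a small Python helper that assembles the command line. This is only an illustration under stated assumptions: build_bonito_command is a hypothetical helper (not part of bonito), the paths are placeholders, and it uses only the flags quoted in this thread. Treating --modified-bases as a list of values is also an assumption.

```python
import shlex


def build_bonito_command(basecall_model, reads_dir, custom_remora_model=None,
                         modified_bases=None, reference=None):
    """Assemble a `bonito basecaller` argument list (hypothetical helper).

    Per the explanation in this thread, --modified-bases makes bonito look up
    a pretrained remora model, so it is not combined here with a custom model
    passed via --modified-base-model.
    """
    if custom_remora_model and modified_bases:
        raise ValueError(
            "pass either a custom model (--modified-base-model) or "
            "--modified-bases, not both"
        )
    cmd = ["bonito", "basecaller", basecall_model, reads_dir]
    if custom_remora_model:
        cmd += ["--modified-base-model", custom_remora_model]
    if modified_bases:
        cmd += ["--modified-bases", *modified_bases]
    if reference:
        cmd += ["--reference", reference]
    return cmd


# Placeholder paths, for illustration only.
example = build_bonito_command(
    "dna_r9.4.1_e8_hac@v3.3",
    "/data/reads",
    custom_remora_model="model_best.onnx",
    reference="genome.fa",
)
print(shlex.join(example))
```

The printed line is the call with --modified-bases left out; redirect its output to a bam file as in the earlier command.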
Awesome! As @marcus1487 suggested, removing --modified-bases solves the problem!
But another problem came...
It seems that the bonito bam files are not compatible with samtools mpileup for modification analysis. I got the error "samtools mpileup: error reading from input file", but I don't have this problem with guppy bam files.
Specifically, the modification I am trying to analyze is uracil (I named it x and 5xT) in DNA, and I trained the bonito model using the following workflow:
1. taiyaki prepare_mapped_reads.py with --mod x T 5xT to generate the hdf5 file.
2. remora dataset prepare with --motif T 0 to convert the hdf5 file to the npz file.
3. remora model train with the provided ConvLSTM_w_ref.py model template.
The MM tag I got in bonito bam files looks like MM:Z:['T']+x,-1,...,-1;. For guppy bam files I got something like Mm:Z:C+m,0,...,0;. I am not sure what the negative MM values mean; maybe they cause the incompatibility? Any insights on the issue? Thank you very much!
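One way to compare the two tag shapes is to check them against the MM value grammar. The sketch below is a minimal check, assuming the pattern as I read it from the SAM optional fields specification (a single base letter, a strand sign, a modification code, then non-negative skip counts); the helper name is hypothetical and the sample values are short made-up strings in the same shape as the tags above, not data from the actual files.

```python
import re

# MM value grammar, as I read it from the SAM optional-fields specification:
# base, strand, modification code(s), optional '.' or '?' flag, then a
# comma-separated list of non-negative skip counts, terminated by ';'.
MM_VALUE = re.compile(r"(?:[ACGTUN][-+](?:[a-z]+|[0-9]+)[.?]?(?:,[0-9]+)*;)*")


def mm_value_is_well_formed(value):
    """Return True if `value` (the text after 'MM:Z:') matches the grammar."""
    return MM_VALUE.fullmatch(value) is not None


# Made-up values shaped like the tags quoted in this thread.
samples = ["C+m,0,0;", "T+x,3,0;", "T+x,-1,-1;", "['T']+x,-1,-1;"]
for value in samples:
    print(value, mm_value_is_well_formed(value))
```

Under this pattern, both the ['T'] prefix and the -1 counts fall outside the grammar, since the counts are numbers of skipped bases. Whether that is what makes samtools mpileup fail is not confirmed in this thread.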
Hi @hd2326,
I was wondering if you ended up getting bonito to basecall Us?
best, S
@najohink Actually no. Still the same error...
Greetings!
I have a trained onnx remora model, and I am wondering whether it would be possible to convert it to the tar+toml format for bonito mod-calling. Thank you very much in advance for your help!