mgils4 closed this issue 1 month ago
At first I thought it might be because I did not run Racon before Medaka, but since the process is almost finished before the error occurs, I assumed it might still work. As far as I understand, the error comes from trying to process a string as an integer, and I would assume that preprocessing with Racon does not change that. Any clue as to what I might be doing wrong?
Is this issue being worked on, @cjw85?
Medaka is tripping up over the fact that your input scaffold sequences have names that look a bit like the intermediate names it creates for itself for chunks of data:
barcode_05_q10_SSI_RL_09_(NZ_CP039266.1:164815-166270)
The workaround here would be to remove the NZ_CP039266.1:164815-166270 part from the sequence names in the FASTA file you are providing to medaka.
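The renaming described above could be scripted. Below is a minimal Python sketch, assuming the offending part always appears as a parenthesised suffix of the form (name:start-end), as in the example header. The function names and the regular expression are hypothetical and not part of medaka. The script also refuses to write duplicate names, since stripping the suffix could make two headers identical.

```python
import re

# Assumption: the region-like part is a parenthesised "name:start-end" suffix,
# optionally preceded by an underscore, as in the header quoted in this thread.
REGION_SUFFIX = re.compile(r"_?\([^()\s]*:\d+-\d+\)")


def clean_name(header: str) -> str:
    """Remove any parenthesised region-like suffix from a FASTA header."""
    return REGION_SUFFIX.sub("", header)


def clean_fasta_lines(lines):
    """Yield FASTA lines with cleaned headers; sequence lines pass through."""
    seen = set()
    for line in lines:
        if line.startswith(">"):
            header = clean_name(line[1:].rstrip("\r\n"))
            fields = header.split()
            name = fields[0] if fields else ""
            if name in seen:
                # Stripping the suffix must not create ambiguous names.
                raise ValueError(f"duplicate sequence name after cleaning: {name!r}")
            seen.add(name)
            yield ">" + header + "\n"
        else:
            yield line


def clean_fasta_file(src: str, dst: str) -> None:
    """Write a copy of the FASTA file at src to dst with cleaned headers."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(clean_fasta_lines(fin))
```

For example, clean_name applied to the header above returns barcode_05_q10_SSI_RL_09. The cleaned copy, not the original file, would then be passed to medaka as the draft.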
Hi @cjw85, thanks for your response, though I do have a quick question: the removal of part of the header would be for the scaffold sequences, correct?
Correct
Hi @cjw85, that was indeed the issue, thanks for the solution. I do have a small follow-up question: if I want to polish multiple assemblies, would it be better to do this as separate runs per assembly file, or is it possible to use a concatenated FASTA file with multiple assemblies?
It would be best done as separate runs.
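One way to organise separate runs is to build one command per assembly and execute them in turn. The sketch below assumes the medaka_consensus wrapper script is used with its -i (reads), -d (draft), -o (output directory), -t (threads) and -m (model) options. The helper name, file names, and output layout are hypothetical, and the command actually used in this issue may differ.

```python
import subprocess
from pathlib import Path


def medaka_commands(pairs, out_root, model=None, threads=4):
    """Build one medaka_consensus command per (reads, draft) pair.

    Each draft gets its own output directory under out_root, named after
    the draft file, so results from separate runs never overwrite each other.
    """
    commands = []
    for reads, draft in pairs:
        outdir = Path(out_root) / Path(draft).stem
        cmd = ["medaka_consensus", "-i", str(reads), "-d", str(draft),
               "-o", str(outdir), "-t", str(threads)]
        if model is not None:
            cmd += ["-m", model]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical file names; replace with the real reads and drafts.
    pairs = [("barcode_05.fastq", "barcode_05_draft.fasta"),
             ("barcode_06.fastq", "barcode_06_draft.fasta")]
    for cmd in medaka_commands(pairs, "medaka_out",
                               model="r1041_e82_400bps_fast_g632"):
        subprocess.run(cmd, check=True)  # stop at the first failing run
```

Keeping each assembly in its own output directory also makes it easier to see which input triggered an error.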
Hi, I am encountering a bug when using Medaka, but I can't figure out how to solve it. Would you happen to have an idea as to what I might be doing wrong? Thank you in advance.
Describe the bug: I am trying to polish a consensus sequence created with Samtools consensus (from a Minimap2 alignment), but I keep getting errors whenever it reaches the stitching of the consensus chunks.
The command used:
Logging: Please attach any relevant logging messages. (Use ``` before and after code blocks).
Environment (if you do not have a GPU, write No GPU):
Additional context: I am working with demultiplexed data as a starting point, with sequences of about 1100-1200 bp, with the end goal of creating a pipeline that produces viable consensus sequences from ONT data. I have also tried a couple of different models (the default model, the r1041_e82_400bps_fast_g615 model, and the r1041_e82_400bps_fast_g632 model as shown in the command above), but every time I get the same error.
(PS: I am still quite new to bioinformatics, so apologies if I did something wrong/incorrect.)