Open varun8476 opened 1 year ago
This hasn't been tried. At the time ARIBA was made, long read assemblies were too low quality (in particular, indel errors), which would have led to too many errors. ARIBA is made to be quite conservative, which is fine for Illumina but not for data with a higher error rate.
But I'm not sure it's worth trying because these days long reads and their assemblies are significantly better now (although I'd still be wary of indel errors). If it was me, I would assemble all the reads (using flye/unicycler/whatever works) and then use arbitamr for the amr predictions: https://github.com/MDU-PHL/abritamr
Sorry if that sounds too negative, but realistically I expect that would be the best method. Happy to be proven wrong! That said, if you really want to do it then this is what I can think of that will need changing, and there's probably more that I haven't thought of. Basically, there's a bunch of places where read pairs are assumed, and it'll be a fair bit of work to deal with going from paired to unpaired:
Hi, Can we make ARIBA compatible with long reads by changing the mapping and assembly approach? I am planning to do this as my masters thesis project. I am a bioinformatics student and my first hunch is to use minimap2 for mapping the reads to the cluster and using any long read assembler such as Flye or Miniasm for assembling the reads.
Any leads as to whether this approach is feasible or pointing out any research done related to this would be helpful. Thanks in advance.