cbg-ethz / shorah

Repo for the software suite ShoRAH (Short Reads Assembly into Haplotypes)
GNU General Public License v3.0

Is it possible to use the error correction step of ShoRAH as a standalone process? #24

greenstick closed this issue 7 years ago

greenstick commented 7 years ago

I'm interested in using only the error correction procedures in ShoRAH for an analysis. To do this, I would need to pass either FASTQ or BAM files to ShoRAH's error correction algorithm and receive corrected reads in a similar format. If a simple transformation is required, I could code one up.

Looking at the source and documentation, it doesn't look like this is possible without forking and writing a use-case-specific implementation. Before going down that road, I thought it best to ask: is there an established way to run ShoRAH's error correction as a standalone process?

Thanks

ozagordi commented 7 years ago

Hi. I would say you need to look into amplian.py for directions. It does some preprocessing and then calls diri_sampler for the error correction. diri_sampler takes a multiple alignment in FASTA format and runs the clustering. The output is split over several files, but you are probably most interested in the file *cor.fas, which contains the haplotypes in FASTA format. Does this help?
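
For picking up the result afterwards, here is a minimal sketch of reading a FASTA file such as the *cor.fas output into Python. It is not part of ShoRAH: the function name read_fasta and the example records are hypothetical, and it assumes only the standard FASTA layout (a ">" header line followed by one or more sequence lines), nothing about what ShoRAH writes in the headers.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dict.

    Sequences wrapped over several lines are joined. If two records
    share the same header, the later one overwrites the earlier one.
    """
    records: Dict[str, str] = {}
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: store the previous one, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Store the last record in the file.
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical in-memory example; the header names are made up.
example = [">hap_0\n", "ACGT\n", "AC-T\n", ">hap_1\n", "TTTT\n"]
for name, seq in read_fasta(example).items():
    print(name, len(seq))
```

To use it on a real file, pass an open file handle instead of the list, for example `with open(path) as handle: records = read_fasta(handle)`, where path points at whichever *cor.fas file your run produced.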

greenstick commented 7 years ago

I think so, I'll take a close look. Thank you!