Koeng101 / dnadesign

A Go package for designing DNA.
Other
23 stars 0 forks source link

init minimap2 #46

Closed Koeng101 closed 8 months ago

Koeng101 commented 8 months ago

This PR adds minimap2 through os/exec (python equivalent of subprocess)

CamelCaseCam commented 8 months ago

Okay, so here are my thoughts:

CamelCaseCam commented 8 months ago

It looks totally clear, it would just frustrate me personally not to get the output as a string. Alternatively, it'd be helpful to write a sam parser. Also, I might totally be misunderstanding the language features here since I don't know go

Koeng101 commented 8 months ago

Okay, so here are my thoughts:

* The code was clear to me. I'm not 100% sure how documentation is generated, but I'm assuming it's generated from the comments in the files. If this is the case, the documentation is also clear.

* Design-wise, I don't love the function exclusively writing to an `io.writer` (I'm assuming this is similar to a C++ `std::stringstream`). You don't seem to have a sam file parser, so the fact that you can use it with `bio` parsers doesn't seem relevant because there aren't any that you'd use it with. I would add another alias for that function that returns the output of minimap2 as a string (just realized that go doesn't have overloads)

Autogenerated from code (see what docs look like here, vs the code)

It looks totally clear, it would just frustrate me personally not to get the output as a string. Alternatively, it'd be helpful to write a sam parser. Also, I might totally be misunderstanding the language features here since I don't know go

Just merged the sam parser into this tree :)

CamelCaseCam commented 8 months ago

Perfect - that solves it!

Koeng101 commented 8 months ago

It looks totally clear, it would just frustrate me personally not to get the output as a string

Yea that is a Go thing. The expectation is there that you will just write 2 more lines of code to get a string. Originally when we were writing parsers back in the day, we did everything as strings, but then moved onto the more idiomatic Go ways, and it unlocks a lot of efficiency.

// Create output buffer
var buf bytes.Buffer

// Execute the Minimap2 function
_ = minimap2.Minimap2(templateFile, fastqFile, &buf)
output := buf.String()

The expectation in the standard library for this kind of thing is that you work with io.Writer / io.Reader (for stream parsing and such), rather than writing everything into memory. IMO these kinds of interfaces are one of my favorite parts of Go, because pretty much everything implements those two from the standard library. Want to read from a torrent and write to an http connection? It's literally same types as reading and writing from files.

Koeng101 commented 8 months ago

Also, just merging in samtools now... Adding docs, then merging into main! This basically implements the pipeline of reads + plasmid sequence -> pileup files (which are the end file format for telling if something is sequence correct or not)

Koeng101 commented 8 months ago

The expectation in the standard library for this kind of thing is that you work with io.Writer / io.Reader

Clarifying this: this is specifically for things that will be written / read that are rather large. Sam files are definitely... this (can be gb to tb in size).