markschl / seq_io

FASTA and FASTQ parsing in Rust
MIT License

Open to a PR for impl Display (or Debug) etc.? #11

Open tfenne opened 1 year ago

tfenne commented 1 year ago

Hi - I'm wondering if you'd be open to a PR that added a few functions and impls that I'd find really useful when writing tests using seq_io.

In short I'd like to add a few functions to the Record traits (and/or impls) to return String or str versions of the fields, and then implement either Debug or Display to show the String versions.

It's not a ton of work, but I want to make sure you'd be open to a PR before creating one. I can see how it might be confusing to anyone using the library for the first time, but it would make writing tests so much more pleasurable. Right now my tests are littered with calls to String::from_utf8(...) and similar, and I currently have custom assert_eq() functions for the record types so that, when they don't match, the String forms are printed instead of Vecs of u8s.
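To illustrate the idea, here is a minimal sketch of a Debug impl that prints byte fields as text. `TestRecord` and its field names are hypothetical stand-ins, not seq_io types, and the sketch assumes lossy UTF-8 conversion is acceptable for debug output:

```rust
use std::fmt;

/// Hypothetical owned FASTQ-like record (NOT a seq_io type), used only
/// to show the Debug idea.
#[derive(PartialEq)]
pub struct TestRecord {
    pub head: Vec<u8>,
    pub seq: Vec<u8>,
    pub qual: Vec<u8>,
}

impl fmt::Debug for TestRecord {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Show each field as text instead of a list of u8 values.
        // Invalid UTF-8 is replaced with U+FFFD rather than causing an error.
        f.debug_struct("TestRecord")
            .field("head", &String::from_utf8_lossy(&self.head))
            .field("seq", &String::from_utf8_lossy(&self.seq))
            .field("qual", &String::from_utf8_lossy(&self.qual))
            .finish()
    }
}
```

With an impl like this, a failing plain assert_eq!() on two records would print the readable string forms, so the custom assertion helpers would no longer be needed.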

Happy to do the work if you'd review and ultimately accept a PR.

markschl commented 1 year ago

Thanks for your interest in contributing. I would be happy about such a PR. The header (ID and description) is already available as strings through the Record::id(), desc() and id_desc() methods, so I assume that you are also thinking of methods returning the sequence, and possibly even the quality scores, as &str? Apart from seq(), there's also fasta::RefRecord::full_seq(), which returns Cow<[u8]> (unfortunately it is not in the Record trait, even though it could be, and it will be in the next version).
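A rough sketch of what such string accessors could look like follows. The trait `RecordStrExt`, its method names, and `DemoRecord` are made up for illustration and are not part of seq_io; the real methods would presumably live on the Record traits:

```rust
use std::borrow::Cow;
use std::str::Utf8Error;

/// Hypothetical extension trait (NOT part of seq_io).
pub trait RecordStrExt {
    /// Raw sequence bytes; stands in for an existing byte accessor.
    fn seq_bytes(&self) -> &[u8];

    /// Strict variant: fails if the sequence is not valid UTF-8.
    fn seq_str(&self) -> Result<&str, Utf8Error> {
        std::str::from_utf8(self.seq_bytes())
    }

    /// Lossy variant: invalid bytes become U+FFFD; borrows when the
    /// input is already valid UTF-8.
    fn seq_lossy(&self) -> Cow<'_, str> {
        String::from_utf8_lossy(self.seq_bytes())
    }
}

/// Hypothetical record type used only to exercise the trait.
pub struct DemoRecord {
    pub seq: Vec<u8>,
}

impl RecordStrExt for DemoRecord {
    fn seq_bytes(&self) -> &[u8] {
        &self.seq
    }
}
```

The strict variant suits tests that should fail loudly on bad input, while the lossy one is convenient for debug output.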

A Debug implementation sounds like a very good idea. Regarding Display, I wonder what the best representation would be, since it is supposed to produce output that is useful to the user. One option would be to just write the records in their own format (FASTA records as FASTA, and FASTQ records as FASTQ) with some default settings (line wrap in FASTA?). Implementing Display also provides to_string(), and I would find it intuitive if that returned a formatted record.
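As a sketch of that option, a Display impl could emit the record as FASTA text. `FastaRecord` is a hypothetical owned type, not a seq_io type, and the 60-column line wrap is an assumed default rather than anything the library defines:

```rust
use std::fmt;

/// Hypothetical owned FASTA record (NOT a seq_io type).
pub struct FastaRecord {
    pub head: Vec<u8>,
    pub seq: Vec<u8>,
}

/// Assumed default line width for wrapped sequence output.
const LINE_WIDTH: usize = 60;

impl fmt::Display for FastaRecord {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Header line: '>' followed by ID and description.
        writeln!(f, ">{}", String::from_utf8_lossy(&self.head))?;
        // Sequence wrapped at LINE_WIDTH bytes per line. Chunking by bytes
        // assumes an ASCII sequence, which is the usual case for FASTA.
        for line in self.seq.chunks(LINE_WIDTH) {
            writeln!(f, "{}", String::from_utf8_lossy(line))?;
        }
        Ok(())
    }
}
```

Because ToString is implemented for every Display type, calling to_string() on such a record would then return the formatted FASTA text.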

Note: I would prefer a PR to be based on the v0.3 version (in a separate branch with that name), because in the meantime I've made a ton of changes to the 0.4-alpha version (not yet pushed, still very rough), and I would rather import the changes from v0.3 once it's ready.