zaeleus / noodles

Bioinformatics I/O libraries in Rust
MIT License

question: how to convert cigar from bytes. #233

Closed cauliyang closed 6 months ago

cauliyang commented 6 months ago

The library now includes two kinds of Cigar struct, but it seems that neither of them can be constructed from bytes.

This Cigar, declared as pub struct Cigar<'a>(&'a [u8]);, provides a new method, but the method is not public. https://github.com/zaeleus/noodles/blob/af84e95376b2b757951aca734f27e2f4c98c38bf/noodles-sam/src/record/cigar.rs#L10

Could we export the new method or parse_cigar, so that we can construct a Cigar from bytes like this:

let cigar = Cigar::new(b"11M120N");

If needed, I can open a new PR.

The libs I use now:


noodles-bam = "0.55.0"
noodles-bgzf = "0.26.0"
noodles-csi = "0.30.0"
noodles-fasta = "0.32.0"
noodles-fastq = "0.10.0"
noodles-sam = "0.52.0"
noodles-core = "0.14"

Thanks for your great help!

zaeleus commented 6 months ago

sam::record::Cigar wasn't actually meant to be user-constructed, but I can see it potentially being used as a non-allocating parser. The visibility of sam::record::Cigar::new is now increased to be public, e.g., Cigar::new(b"8M13N"). This is added in noodles 0.64.0 / noodles-sam 0.53.0.
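
To illustrate what "non-allocating parser" means here, the following is a minimal sketch using only the standard library. It is not the noodles implementation: `RawCigar`, `Ops`, and the `(usize, u8)` item type are hypothetical names chosen for this example, and the real `sam::record::Cigar` API may differ. The wrapper borrows the input bytes and decodes one operation at a time, so nothing is copied to the heap.

```rust
/// Hypothetical stand-in for a borrowed CIGAR wrapper (not the noodles type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RawCigar<'a>(&'a [u8]);

impl<'a> RawCigar<'a> {
    /// Wraps the raw bytes without validating or copying them.
    pub fn new(src: &'a [u8]) -> Self {
        Self(src)
    }

    /// Returns a lazy iterator over `(length, kind)` pairs.
    pub fn iter(&self) -> Ops<'a> {
        Ops { rest: self.0 }
    }
}

/// Iterator state: the bytes that have not been parsed yet.
pub struct Ops<'a> {
    rest: &'a [u8],
}

impl<'a> Ops<'a> {
    /// Stops iteration and reports an error.
    fn fail(&mut self, msg: &'static str) -> Option<Result<(usize, u8), &'static str>> {
        self.rest = &[];
        Some(Err(msg))
    }
}

impl<'a> Iterator for Ops<'a> {
    type Item = Result<(usize, u8), &'static str>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.rest.is_empty() {
            return None;
        }

        // Count the leading ASCII digits that make up the operation length.
        let digits = self.rest.iter().take_while(|b| b.is_ascii_digit()).count();
        if digits == 0 {
            return self.fail("missing operation length");
        }

        // The byte after the digits is the operation kind.
        let Some(&kind) = self.rest.get(digits) else {
            return self.fail("missing operation kind");
        };
        if !b"MIDNSHP=X".contains(&kind) {
            return self.fail("invalid operation kind");
        }

        // Accumulate the length with overflow checks.
        let mut len: usize = 0;
        for &b in &self.rest[..digits] {
            let next = len
                .checked_mul(10)
                .and_then(|n| n.checked_add(usize::from(b - b'0')));
            match next {
                Some(n) => len = n,
                None => return self.fail("operation length overflow"),
            }
        }

        // Advance past the operation just parsed.
        self.rest = &self.rest[digits + 1..];
        Some(Ok((len, kind)))
    }
}
```

With this sketch, `RawCigar::new(b"8M13N").iter()` yields `Ok((8, b'M'))` and then `Ok((13, b'N'))`. Parsing is deferred until iteration, which is why constructing the wrapper is cheap.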

> The libs I use now:

As an aside, I recommend using the noodles meta-crate when adding it as a dependency, i.e.,

noodles = { version = "0.64.0", features = ["bam", "bgzf", "core", "csi", "fasta", "fastq", "sam"] }

cauliyang commented 6 months ago

Appreciate your great help! Yep, I agree with you. Usually, we do not construct a Cigar directly, but the feature is helpful in a testing environment. I will follow your suggestion for adding the dependency. Have a good day!

theJasonFan commented 4 months ago

@zaeleus thanks for all your work on this crate. What are the chances we can do the same for noodles::sam::alignment::record_buf::Cigar and add a constructor with the signature

pub fn new(src: Vec<u8>) -> Self { ... }

And also expose a function

pub fn as_bytes(&self) -> &[u8] { ... }

To both ...::record_buf::Cigar and ...::record::Cigar?

It would be useful generally, and specifically for someone like me who wants to extract the CIGAR string and update the MC tag in BAM/SAM records.
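
As a rough sketch of the use case described above, assuming an owned type that simply stores the raw CIGAR bytes: `OwnedCigar` and `mate_cigar_field` are hypothetical names for illustration only, and the sketch mirrors the proposed signatures rather than anything noodles currently provides. It relies on the SAM convention that the MC tag holds the mate's CIGAR string as a `Z` (string) value.

```rust
/// Hypothetical owned CIGAR buffer (not the noodles `record_buf::Cigar`).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct OwnedCigar(Vec<u8>);

impl OwnedCigar {
    /// Takes ownership of the raw CIGAR bytes, matching the proposed signature.
    pub fn new(src: Vec<u8>) -> Self {
        Self(src)
    }

    /// Exposes the raw bytes, matching the proposed accessor.
    pub fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

/// Formats a SAM text `MC` field from the mate's raw CIGAR bytes.
/// Returns `None` if the bytes are not valid UTF-8.
pub fn mate_cigar_field(mate: &OwnedCigar) -> Option<String> {
    let text = std::str::from_utf8(mate.as_bytes()).ok()?;
    Some(format!("MC:Z:{text}"))
}
```

For a mate whose CIGAR is `11M120N`, `mate_cigar_field` returns `Some("MC:Z:11M120N")`. The point of `as_bytes` in this workflow is that the mate's CIGAR can be copied into the tag without re-serializing individual operations.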

theJasonFan commented 4 months ago

If you'd take a PR I can submit one 😄