zaeleus / noodles

Bioinformatics I/O libraries in Rust
MIT License

Create test SAM data #278

Closed mbhall88 closed 1 month ago

mbhall88 commented 1 month ago

I was wondering if you could suggest a way to create SAM records in tests. I know I can create test SAM files and read those, but I find that cumbersome, and I then have to add those files to version control.

Is there a way I can just create records from a string/byte string?

e.g.,

#[test]
fn test_some_function() {
    let src = &b"qry\t0\tchr\t1\t60\t5M\t*\t0\t0\tGCAGT\t!!!!!\tNM:i:1\tMD:Z:2T2"[..];
    let record = Record::from(src);   // this is essentially the functionality I'd like an example of
}

zaeleus commented 1 month ago

Note that &[u8] implements BufRead, so you can define a test helper that uses Reader::read_record (or Reader::read_record_buf) to build single records, e.g.,

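use std::io;

use noodles_sam as sam; // or `use noodles::sam;` with the `sam` feature enabled

// Reads a single record from raw SAM bytes.
fn build_record(src: &[u8]) -> io::Result<sam::Record> {
    let mut reader = sam::io::Reader::new(src);
    let mut record = sam::Record::default();
    reader.read_record(&mut record)?;
    Ok(record)
}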

#[test]
fn test_some_function() -> io::Result<()> {
    let record = build_record(b"qry\t0\tchr\t1\t60\t5M\t*\t0\t0\tGCAGT\t!!!!!\tNM:i:1\tMD:Z:2T2")?;
    // ...
    Ok(())
}
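
For anyone unfamiliar with the point about &[u8] implementing BufRead, here is a minimal sketch that uses only the standard library. The helper first_line is hypothetical and not part of noodles; it just shows that a byte slice can be handed to anything expecting a BufRead, which is what lets the SAM reader consume an in-memory record.

```rust
use std::io::{self, BufRead};

// Hypothetical helper (not part of noodles): reads the first line from any
// `BufRead` source and returns it without the trailing line terminator.
fn first_line<R: BufRead>(mut reader: R) -> io::Result<String> {
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line
        .trim_end_matches(|c: char| c == '\r' || c == '\n')
        .to_string())
}

#[test]
fn test_slice_is_a_buf_read() -> io::Result<()> {
    // Slicing a byte string literal with `[..]` yields a `&[u8]`, which the
    // standard library implements `BufRead` for.
    let src = &b"qry\t0\tchr\t1\t60\t5M\t*\t0\t0\tGCAGT\t!!!!!\n"[..];
    let line = first_line(src)?;

    // The 11 mandatory SAM fields are tab-separated.
    assert_eq!(line.split('\t').count(), 11);
    Ok(())
}
```

Because the slice itself is the reader, no temporary file or Cursor wrapper is needed in tests.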

We can consider adding a convenience conversion for sam::Record, given it's a format record specific to SAM; however, sam::alignment::RecordBuf is agnostic to serialized forms.

mbhall88 commented 1 month ago

Wonderful! Boy, would I love to do some pair programming with you. This library is so elegant it is beyond my comprehension.

A convenience conversion would be sensational, but not urgent given your function above allows me to write lots of test records now.