zaeleus / noodles

Bioinformatics I/O libraries in Rust
MIT License

Getting raw record lines with lazy records #251

Closed rkimoakbioinformatics closed 4 months ago

rkimoakbioinformatics commented 4 months ago

Hi, I'm developing a program that reads each record, adds some INFO fields to it, and writes the record to a file. Since the output is the original record line plus the new fields, I need the original record line. Is there a way to access the raw record lines through lazy::Record?

zaeleus commented 4 months ago

Since you're manipulating the record, you won't have access to the raw record line. You have to use record buffers (vcf::variant::RecordBuf) and reserialize, e.g.,

main.rs

```rust
// cargo add noodles@0.68.0 --features vcf

use std::io;

use noodles::vcf::{
    self,
    variant::{io::Write, record::info::field::key, record_buf::info::field::Value},
};

static DATA: &[u8] = b"##fileformat=VCFv4.4
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
sq0\t1\t.\tA\t.\t.\tPASS\t.
";

fn main() -> io::Result<()> {
    let mut reader = vcf::io::Reader::new(DATA);
    let header = reader.read_header()?;

    let stdout = io::stdout().lock();
    let mut writer = vcf::io::Writer::new(stdout);

    writer.write_header(&header)?;

    for result in reader.record_bufs(&header) {
        let mut record = result?;

        record
            .info_mut()
            .insert(key::SAMPLES_WITH_DATA_COUNT.into(), Some(Value::from(1)));

        writer.write_variant_record(&header, &record)?;
    }

    // => ##fileformat=VCFv4.4
    // #CHROM POS ID REF ALT QUAL FILTER INFO
    // sq0 1 . A . . PASS NS=1

    Ok(())
}
```

If you want to manipulate raw strings, you can get the underlying reader after reading the header and read lines from there, e.g.,

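```rust
use std::io::BufRead; // brings `lines()` into scope

let mut reader = vcf::io::Reader::new(DATA);
let header = reader.read_header()?;

// After the header is consumed, the underlying reader is positioned at the first record line.
for result in reader.get_mut().lines() {
    let line = result?;
    // ...
}
```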
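For the raw-string route, appending an INFO entry to each line could look roughly like the sketch below. It uses only the standard library; `append_info` is a hypothetical helper, not part of noodles. It assumes the line is a tab-separated VCF record whose 8th column is INFO, and it does not check that the key is declared in the header or already present on the record.

```rust
/// Appends `entry` (e.g. "NS=1") to the INFO column of a raw VCF record line.
///
/// Hypothetical helper for illustration only; not a noodles API.
/// Returns `None` if the line has fewer than 8 tab-separated columns.
fn append_info(line: &str, entry: &str) -> Option<String> {
    // Column order: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
    const INFO_INDEX: usize = 7;

    let mut fields: Vec<String> = line.split('\t').map(String::from).collect();
    let info = fields.get_mut(INFO_INDEX)?;

    if info.as_str() == "." || info.is_empty() {
        // A missing INFO value is replaced outright.
        *info = entry.to_string();
    } else {
        // Existing entries are kept; the new one is joined with ';'.
        info.push(';');
        info.push_str(entry);
    }

    // Any columns after INFO (FORMAT, samples) are passed through unchanged.
    Some(fields.join("\t"))
}

fn main() {
    let line = "sq0\t1\t.\tA\t.\t.\tPASS\t.";
    let updated = append_info(line, "NS=1").expect("record line has an INFO column");
    println!("{updated}");
}
```

Inside the `lines()` loop, each `line` would be passed through such a helper and written out as is. Note that new INFO keys still need matching `##INFO` entries in the header you write.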

rkimoakbioinformatics commented 4 months ago

Thanks. I'll try the suggestions.