zaeleus / noodles

Bioinformatics I/O libraries in Rust
MIT License

GTF parser fails when textual attributes contain semicolons #299

Closed. kaizhang closed this issue 2 months ago

kaizhang commented 2 months ago

This fails:

chr1\tG\ttranscript\t26\t92\t.\t+\t.\tgene_id \"RGD\"; transcript_id \"XM_5\"; note \"note1;note2\";

This works:

chr1\tG\ttranscript\t26\t92\t.\t+\t.\tgene_id \"RGD\"; transcript_id \"XM_5\"; note \"note1\";

Is this the expected behavior?

zaeleus commented 2 months ago

Thanks for reporting and the example!

This behavior is not intended. I reworked the attributes parser to allow the entry delimiter (a semicolon) inside quoted text values.
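For readers who want to see the idea in isolation, here is a minimal sketch of quote-aware splitting. It is not the noodles implementation: `split_attributes` and `push_entry` are hypothetical helpers written for illustration, and the sketch assumes values are wrapped in plain double quotes with no escaped quotes inside them.

```rust
/// Hypothetical helper (not part of noodles): splits a GTF attributes field
/// into (key, value) pairs, treating `;` inside double quotes as plain text.
fn split_attributes(field: &str) -> Vec<(String, String)> {
    let mut entries = Vec::new();
    let mut in_quotes = false;
    let mut start = 0;

    for (i, c) in field.char_indices() {
        match c {
            // Toggle on every double quote; escaped quotes are not handled.
            '"' => in_quotes = !in_quotes,
            // Only a semicolon outside quotes ends an entry.
            ';' if !in_quotes => {
                push_entry(&field[start..i], &mut entries);
                start = i + 1;
            }
            _ => {}
        }
    }

    // Handle a final entry that has no trailing `;`.
    push_entry(&field[start..], &mut entries);

    entries
}

/// Hypothetical helper: parses one `key "value"` entry and appends it.
fn push_entry(raw: &str, entries: &mut Vec<(String, String)>) {
    let raw = raw.trim();
    if raw.is_empty() {
        return;
    }

    // The key and value are separated by the first whitespace character.
    let (key, value) = match raw.split_once(char::is_whitespace) {
        Some((k, v)) => (k, v.trim()),
        None => (raw, ""),
    };

    // Strip the surrounding double quotes from text values.
    entries.push((key.to_string(), value.trim_matches('"').to_string()));
}
```

With the attributes from the failing record, `split_attributes` yields three entries, and the last one keeps `note1;note2` as a single value instead of being cut at the inner semicolon.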

This fix is now available in noodles 0.80.0 / noodles-gtf 0.31.0.