Open yfarjoun opened 6 years ago
It's not just the header that is vulnerable to having illegal characters injected. No checks are done on the record fields either. Since htsjdk is not VCFv4.3 aware, it does no encoding of illegal character in info or format field (which is fine) but it also does not prevent those characters being written so it just outputs non-compliant records.
Sorry, I'm mixing up my specs documents here. The issue applies to SAM record tags as well. TextTagCodec.java:82 just does sb.append(tagType).append(':').append(value.toString());
when emitting string fields.
No validation is done in VCF either (see #1063 for a example of this being a problem with GATK output) but presumably that's a different issue.
One can set a header attribute (like DS) to be "text with INJECTION\tPI:123" and then the output (Sam or Bam) file that is written will have a field PI:123.