Illumina / Pisces

Somatic and germline variant caller for amplicon data. Recommended caller for tumor-only workflows.
GNU General Public License v3.0
PG tag in Gemini malformatted #41

Open gwennberry opened 5 years ago

gwennberry commented 5 years ago

Per user, the PG tag "isn't delimited right so when parsing the ID field appears as if it contains everything". Could be related to concatenation of commands when creating the tag? Will check into it.

jinhyunju commented 5 years ago

Current logic in https://github.com/Illumina/Pisces/blame/master/src/lib/Gemini/IO/BamWriterFactory.cs

headers[lastPgHeaderIndex] += ("\n@PG\tID:Gemini PN:Gemini VN:" + geminiVersion + " CL:" + string.Join("", Environment.GetCommandLineArgs()));

The delimiters before PN, VN, and CL need to change from spaces to tabs, per the SAM/BAM specification for the header section:

"Each header line begins with character '@' followed by a two-letter record type code. In the header, each line is TAB-delimited and except the @CO lines, each data field follows a format 'TAG:VALUE' where TAG is a two-letter string that defines the content and the format of VALUE. Each header line should match: /^@[A-Za-z][A-Za-z](\t[A-Za-z][A-Za-z0-9]:[ -~]+)+$/ or /^@CO\t.*/. Tags containing lowercase letters are reserved for end users."
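
To illustrate why the ID field "appears as if it contains everything", here is a minimal Python sketch (not the Pisces C# code) that parses a space-delimited @PG line and a TAB-delimited one the way a header parser would. The version string, the argument list, and the helper names `parse_header_fields` and `build_pg_line` are placeholders made up for this example. Joining the command-line arguments with a single space is also an assumption; the current code joins them with an empty string.

```python
import re

# Header-line pattern as quoted from the SAM specification above.
HEADER_LINE = re.compile(r"^@[A-Za-z][A-Za-z](\t[A-Za-z][A-Za-z0-9]:[ -~]+)+$")


def parse_header_fields(line):
    """Split a non-@CO header line on TABs into a {TAG: VALUE} dict."""
    fields = line.split("\t")[1:]  # drop the leading record type, e.g. "@PG"
    return dict(field.split(":", 1) for field in fields)


def build_pg_line(version, args):
    """Build a TAB-delimited @PG line (hypothetical helper, not Pisces code)."""
    # Assumption: arguments are joined with a space so CL stays readable.
    return "\t".join(
        ["@PG", "ID:Gemini", "PN:Gemini", "VN:" + version, "CL:" + " ".join(args)]
    )


# Placeholder values, for illustration only.
version = "1.0.0"
args = ["Gemini.dll", "--bam", "in.bam"]

# Shape of the current output: spaces between fields, arguments joined with "".
broken = "@PG\tID:Gemini PN:Gemini VN:" + version + " CL:" + "".join(args)
fixed = build_pg_line(version, args)

print(parse_header_fields(broken))  # a single ID entry holding the whole line
print(parse_header_fields(fixed))   # separate ID, PN, VN and CL entries
```

Note that the space-delimited line still matches the quoted pattern, because a space falls inside the `[ -~]` value range. A regex check alone would therefore not flag it; the problem only shows up once the fields are split on TABs.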