AdmiralenOla / Scoary

Pan-genome wide association studies
GNU General Public License v3.0

Quoting fields in results file #33

Closed mgalardini closed 8 years ago

mgalardini commented 8 years ago

Hi Ola,

I was giving the latest version a go (the ASCII logo is very neat!) and noticed that the non-numeric fields in the output table are not quoted, which can cause problems when parsing the results, especially in the gene product field.

Example of a line which causes my parser to break:

group_3038,,outer membrane pore protein N, non-specific,3,12,3,332,50.0,96.511627907,27.6666666667,0.0011870360825,1.0,0.421032887975,3,3,1,0.125,0.5

(notice the "outer membrane pore protein N, non-specific" bit)
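
To illustrate why this line trips up a parser, here is a minimal sketch using Python's standard `csv` module. The variable names are made up for illustration, and the quoted variant is only an assumption about what a fixed output line could look like, not actual Scoary output.

```python
import csv
import io

# The line from the report above, exactly as it appears in the results file.
unquoted_line = (
    "group_3038,,outer membrane pore protein N, non-specific,3,12,3,332,"
    "50.0,96.511627907,27.6666666667,0.0011870360825,1.0,0.421032887975,"
    "3,3,1,0.125,0.5"
)

# Hypothetical version of the same line with the annotation field quoted.
quoted_line = unquoted_line.replace(
    "outer membrane pore protein N, non-specific",
    '"outer membrane pore protein N, non-specific"',
)

# Without quotes, the comma inside the annotation is read as a delimiter,
# so the row gains one extra field and every later column shifts right.
unquoted_fields = next(csv.reader(io.StringIO(unquoted_line)))

# With quotes, the annotation stays in a single field.
quoted_fields = next(csv.reader(io.StringIO(quoted_line)))

print(len(unquoted_fields), len(quoted_fields))
print(quoted_fields[2])
```

The unquoted row has one more field than the quoted one, which is the mismatch a strict parser rejects.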

I'm hotfixing this issue by putting an empty string in the "Annotation" field of Roary's output, but I figured you might want to look into this potential issue.

Thanks a lot, Marco

AdmiralenOla commented 8 years ago

How did that slip by me?! I'll fix that right away. The problem is this line:

outfile.write(delimiter.join(c for c in outrow) + "\n")

Which should have been:

outfile.write(delimiter.join('"' + c + '"' for c in outrow) + "\n")
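
One caveat with wrapping each field in quotes by hand: an annotation that itself contains a double quote would still produce a malformed row. A possible alternative, sketched here under assumptions, is to let the standard `csv` module do the quoting. The `write_rows` helper and the sample rows are hypothetical and are not taken from the Scoary code base; only `outrow` and `delimiter` mirror names from the snippet above.

```python
import csv
import io


def write_rows(outfile, rows, delimiter=","):
    """Write rows with every field quoted (hypothetical helper).

    csv.QUOTE_ALL quotes each field and doubles any embedded double
    quote, so delimiters and quotes inside a field survive a round trip.
    """
    writer = csv.writer(
        outfile,
        delimiter=delimiter,
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    for outrow in rows:
        writer.writerow(outrow)


# Illustrative rows only; the second annotation is invented to show
# what happens when a field contains a double quote.
rows = [
    ["group_3038", "", "outer membrane pore protein N, non-specific", "3", "0.5"],
    ["group_0001", "", 'putative "pore" protein', "1", "1.0"],
]

buffer = io.StringIO()
write_rows(buffer, rows)
written = buffer.getvalue()

# Reading the text back recovers the original fields unchanged.
round_trip = list(csv.reader(io.StringIO(written)))
print(written)
```

When writing to a real file rather than a `StringIO` buffer, the `csv` documentation recommends opening it with `newline=""` so that line endings are not translated twice.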

AdmiralenOla commented 8 years ago

Thanks for noticing this!