elki-project / elki

ELKI Data Mining Toolkit
https://elki-project.github.io/
GNU Affero General Public License v3.0
780 stars 321 forks source link

Bugfix: label rows are now also detected when preceded by comment rows #20

Closed patrickkostjens closed 8 years ago

patrickkostjens commented 8 years ago

If an input file starts with one or more comment lines, followed by a line containing labels, those labels are not parsed since originally labels would only be detected on line 1. Since curvec is only set when a vector row is parsed, it will always be null when the label row is parsed. This is therefore a more reliable way to detect label rows.

An alternative way to fix this would be to count non-comment line numbers in the TokenizedReader. However, this is a less elegant solution in my opinion while it also requires a closer coupling between NumberVectorLabelParser and TokenizedReader.

kno10 commented 8 years ago

Merged as bf9b83e5ecd11d370a1546138fdd8fe203ad18ed.

Thank you!