Open numpy-gitbot opened 11 years ago
trac user khaeru wrote on 2012-07-11
Sorry, bad title. Also, what's the difference between the Trac issues list and https://github.com/numpy/numpy/issues ?
Title changed from "Remove" to "npyio.py: genfromtxt() handles comments incorrectly with names=True" by trac user khaeru on 2012-07-11
atmention:rgommers wrote on 2012-07-12
We opened Github issues only a few weeks ago, we're in the process of transitioning all Trac tickets to it. When that's done we'll close Trac, or make it read-only. For now you can use either one.
atmention:rgommers wrote on 2012-07-12
Suggested fix looks correct.
trac user khaeru wrote on 2012-07-12
Oh, I see — well, I also posted a branch with this fix and a pull request: https://github.com/numpy/numpy/pull/351
Original ticket http://projects.scipy.org/numpy/ticket/2184 on 2012-07-11 by trac user khaeru, assigned to unknown.
The documentation for genfromtxt() reads:
When the variables are named (either by a flexible dtype or with names, there must not be any header in the file (else a ValueError exception is raised).
and also:
If names is True, the field names are read from the first valid line after the first skip_header lines.
The cause of this seems to be in numpy/lib/npyio.py at lines 1347-9 (https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py#L1347):
The last line should read first_line = first_line.split(comments)[0].
With the current code, the input line:
will be transformed to:
resulting in columns named 'Example', 'comment' and 'line' (this is what the warning in the documentation is about).
But also the input line:
will be transformed to:
resulting in columns named 'the', 'column', 'names' …etc. In this instance actual column names present in the file are inappropriately discarded.
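To make the two cases concrete, here is a minimal sketch in plain Python. It is not the npyio.py code itself: the helper names current_strip and proposed_strip, the '#' marker and both sample lines are assumptions made up for illustration, and the "current" expression is only paraphrased from the description in this ticket (keeping the [1:] portion of the split).

```python
COMMENTS = "#"  # assumed comment marker


def current_strip(line, comments=COMMENTS):
    """Paraphrase of the current behaviour: keep the text AFTER the first marker."""
    if comments in line:
        return "".join(line.split(comments)[1:])
    return line


def proposed_strip(line, comments=COMMENTS):
    """Suggested fix: keep only the text BEFORE the first marker."""
    return line.split(comments)[0]


# Hypothetical input lines, not taken from a real data file.
comment_only = "# Example comment line"
names_then_comment = "a b c # the column names"

for line in (comment_only, names_then_comment):
    # Whitespace splitting stands in for the real field splitter.
    print(current_strip(line).split(), proposed_strip(line).split())
```

With the [1:] variant the comment text becomes the field names in both cases; with the [0] variant the comment-only line yields no names and the second line yields the names that precede the comment.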
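By taking the [0] portion of the split instead of [1:], only the text before the comment marker is kept. For a line that contains nothing but a comment, that text is empty; it is passed to split_lines() on L1350, producing no usable output and causing the while not first_values loop to try the next line.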
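The retry behaviour can be sketched as follows. This is a simplified stand-in, not the actual genfromtxt() implementation: the function first_names, the '#' marker, the whitespace splitting and the sample lines are all assumptions for illustration.

```python
def first_names(lines, comments="#"):
    """Return field names from the first line that has text before the marker."""
    first_values = []
    line_iter = iter(lines)
    while not first_values:
        # next() raises StopIteration if no line ever provides names.
        first_line = next(line_iter)
        # Proposed fix: drop the comment, keep what precedes it.
        first_line = first_line.split(comments)[0]
        # An empty result keeps the loop going to the following line.
        first_values = first_line.split()
    return first_values


sample = [
    "# Example comment line",    # hypothetical comment-only line, skipped
    "a b c # the column names",  # hypothetical names line with trailing comment
    "1 2 3",
]
names = first_names(sample)
print(names)
```

Under these assumptions the comment-only line is skipped and the names come from the second line, which is the behaviour the documentation describes for names=True.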