Open astrofrog opened 7 years ago
Interesting question. The first column is the index, so it doesn't make sense to read it in as a "normal column". On the other hand, that raises the question of whether pandas users should write the index column to the csv at all if they want to use the csv in other programs...
Doesn't the leading comma in the header row indicate that the first "column" is really the index? This means there should be a reliable way to detect this case. If instead you wrote the same file with df.to_csv('test.csv', index=False), then you should just see:
In [8]: %more test.csv
a
1
2
3
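A minimal sketch of how the leading-comma case could be detected, using only the standard library. The helper name `first_header_is_blank` and the sample strings are hypothetical, chosen for illustration; this is not how Astropy or pandas implements anything.

```python
import csv
import io


def first_header_is_blank(csv_text):
    """Return True if the first field of the header row is empty."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Require at least two fields so that a blank first line is not flagged.
    return len(header) > 1 and header[0].strip() == ""


# Hypothetical samples: one written with the index, one with index=False.
with_index = ",a\n0,1\n1,2\n2,3\n"
without_index = "a\n1\n2\n3\n"

print(first_header_is_blank(with_index))     # True
print(first_header_is_blank(without_index))  # False
```

A reader could use a check like this to decide whether to treat the first column as an index or to give it a placeholder name.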
At the same time, should pandas fix this on their side too?
@pllim, I don't think it's a bug. I think it's the way that pandas indicates that the first column in a csv is an index, not a real data column. It's possible to give the index column a name, but I believe it's None by default. See the docs here:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html
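As a rough illustration of naming the index column, here are two ways that should produce a non-empty first header field. The DataFrame and the names `idx` and `row` are made up for the example.

```python
import pandas as pd

# Hypothetical one-column frame with the default, unnamed index.
df = pd.DataFrame({"a": [1, 2, 3]})

# Option 1: pass a header for the index column when writing.
print(df.to_csv(index_label="idx").splitlines()[0])    # idx,a

# Option 2: name the index itself; to_csv then writes that name.
print(df.rename_axis("row").to_csv().splitlines()[0])  # row,a
```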
I've already seen examples in the wild where people are trying to read in these files with Astropy and failing. I know it's frustrating, but I do think we should support this 'format'.
I could volunteer to look into this since I think I'd probably learn something new/useful. But if someone else already has a handle on a fix, that's okay too.
Pandas by default will write CSV files where the header for the first column is missing:
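A minimal sketch of that default, assuming a one-column DataFrame (the frame and variable names are illustrative, not taken from the original report). It also shows how pandas itself reads such a file back, with and without `index_col=0`.

```python
import io

import pandas as pd

# Hypothetical frame with the default, unnamed integer index.
df = pd.DataFrame({"a": [1, 2, 3]})

# With no path given, to_csv returns the CSV text instead of writing a file.
text = df.to_csv()
print(text.splitlines())  # ['', ...] is not produced; header is ',a'

# Reading it back without hints gives the blank header a placeholder name.
plain = pd.read_csv(io.StringIO(text))
print(list(plain.columns))  # ['Unnamed: 0', 'a']

# Telling pandas that column 0 is the index restores the original layout.
restored = pd.read_csv(io.StringIO(text), index_col=0)
print(list(restored.columns))  # ['a']
```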
Whether or not this is sensible is debatable, but this means there are a lot of CSV files in the wild missing the first header column name. Astropy doesn't read these in correctly though:
I think we might want to special case this, or deal better with cases like this given how common these kinds of files are going to be.