datasets / s-and-p-500-companies

List of companies in the S&P 500 together with associated financials
https://datahub.io/core/s-and-p-500-companies

Correct CSV structure #3

Closed peterdesmet closed 10 years ago

peterdesmet commented 10 years ago

Hi @rgrp,

I noticed the CSVs didn't render correctly on GitHub because of a missing column in row 281. The cause is a missing sector for LyondellBasell Industries N.V. I googled the company and added it to Materials in both constituents and constituents-financials. Both files now render correctly.
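For anyone who wants to spot this kind of row without relying on GitHub's renderer, here is a minimal sketch using Python's standard `csv` module. The function name `find_ragged_rows`, the sample rows, and the column names are made up for illustration; they are not taken from the actual files.

```python
import csv
import io


def find_ragged_rows(text):
    """Return (line_number, field_count) for each row whose number of
    fields differs from the header's. Blank lines are skipped."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    expected = len(header)
    return [
        (reader.line_num, len(row))
        for row in reader
        if row and len(row) != expected
    ]


# Hypothetical sample: the third line is missing its last field.
sample = (
    "Symbol,Name,Sector\n"
    "AAA,Alpha Corp,Industrials\n"
    "BBB,Beta N.V.\n"
    "CCC,Gamma Inc,Materials\n"
)

ragged = find_ragged_rows(sample)
print(ragged)
```

To check a real file, read it into a string first (or adapt the function to take an open file object) and look at the reported line numbers.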

Cheers

rufuspollock commented 10 years ago

@peterdesmet thanks very much for this - and will definitely merge.

As an aside: I'm still not clear whether missing additional "columns" in a given row make a CSV "invalid". For example, these CSV files (in their current uncorrected form) render perfectly on data.okfn.org (http://data.okfn.org/data/s-and-p-500-companies), and I read the lack of additional commas as implicitly meaning the column is blank.

PS: if you are interested in this stuff please check out the Registry (list of datasets we want to prep) and the overall plan - http://data.okfn.org/roadmap/core-datasets

peterdesmet commented 10 years ago

You're welcome! I think the validity of a CSV depends on the software reading it. GitHub considers missing commas invalid; other software might not.

Missing commas meaning blank columns could work, but only for columns at the end. If a field in the "middle" is blank, you would need consecutive commas anyway: data,data,,,data.
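To illustrate the difference, here is a rough sketch of a lenient reader that treats missing trailing fields as blank. It assumes Python's standard `csv` module; `pad_rows` and the sample data are hypothetical and only meant to show the behaviour, not how GitHub or data.okfn.org actually parse the files.

```python
import csv
import io


def pad_rows(text):
    """Parse CSV text and pad short rows with empty strings on the right,
    so every row has as many fields as the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row + [""] * (len(header) - len(row)) for row in reader]
    return header, rows


# A blank at the END can be implied by simply omitting the trailing comma.
_, trailing = pad_rows("a,b,c\n1,2\n")

# A blank in the MIDDLE needs consecutive commas to keep "3" in column c.
_, middle_ok = pad_rows("a,b,c\n1,,3\n")

# Without the extra comma, "3" shifts into column b and c is padded instead.
_, middle_shifted = pad_rows("a,b,c\n1,3\n")

print(trailing)
print(middle_ok)
print(middle_shifted)
```

Padding on the right is the only safe guess a reader can make: with a short row there is no way to tell which middle field was meant to be empty.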