quintel / refinery

Calculates node demands and edge shares for sparse energy graphs.

"^M" in csv files from excel #27

Closed wouterterlouw closed 10 years ago

wouterterlouw commented 10 years ago

I would like to update some *_share.csv files using the output from my analyses. This is needed to adapt some shares and increase the number of significant digits.

However, if I compare the file on the server:

➜  shares git:(master) ✗ pwd
/Users/Admin/projects/etsource/data/datasets/nl/shares
➜  shares git:(master) ✗ more industry_final_demand_coal_gas_parent_share.csv
key,share
industry_final_demand_for_metal_coal_gas,1.00
industry_final_demand_for_other_coal_gas,0.00

with the files that are output from the analysis:

➜  export  more industry_final_demand_coal_gas_parent_share.csv
key,share^Mindustry_final_demand_for_metal_coal_gas,1.00000000^Mindustry_final_demand_for_other_coal_gas,0.00000000

My files contain ^M characters instead of line breaks. Can I replace the files with my version, do I have to adapt them manually, or do I have to do something else?

(I am not sure if this is the right repository for this question.)

antw commented 10 years ago

This ^M has been such a pain recently, you wouldn't believe it. :wink:

The short answer is: don't worry about it. It's Excel playing by its own rules, but it won't break anything in Atlas.

The longer answer is that Unix and OS X use a single invisible character, \n (line feed), to indicate a line ending. Windows uses two: \r\n (a carriage return followed by a line feed). Excel, deciding to do its own thing, outputs CSVs with only a \r. This shows up in tools like "git diff" and "more" as ^M.
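If you ever want to convert such a file by hand, here is a minimal Python sketch of the idea. It is only an illustration, not something Atlas or the VBA script does, and `normalize_newlines` is a hypothetical helper name. It assumes the file contents are read as raw bytes.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert Windows (CRLF) and bare CR line endings to Unix LF."""
    # Replace CRLF first; otherwise each CRLF would turn into two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Sample bytes modelled on the Excel export quoted in this issue (CR only).
excel_export = (
    b"key,share\r"
    b"industry_final_demand_for_metal_coal_gas,1.00000000\r"
    b"industry_final_demand_for_other_coal_gas,0.00000000"
)

print(normalize_newlines(excel_export).decode("ascii"))
```

For a real file you would read it with `open(path, "rb")` and write the result back with `open(path, "wb")`, so that Python does not translate the line endings itself.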

I've made a note to take another look at this soon, because not being able to use "git diff" and "more" on these files is annoying.

antw commented 10 years ago

@jorisberkhout has updated the VBA script so that it outputs files with Windows line endings. Our tools will work perfectly with that, and you won't ever see the ^M characters again.

FYI @Richard-Deuchler @wmeyers @StijnDellaert: Whenever you commit new or updated CSVs, Git will give you this message indicating that it will convert the line endings to the Unix style:

warning: CRLF will be replaced by LF in data/datasets/nl/shares/some_path.csv
The file will have its original line endings in your working directory.

Thanks Joris! :beer: