HertieDataScience / SyllabusAndLectures

Hertie School of Governance Introduction to Collaborative Social Science Data Analysis
MIT License

Error when reading a xls file ("more columns than column names") #58

Closed mr-r0b0t closed 8 years ago

mr-r0b0t commented 8 years ago

Hi everyone,

I'm having a problem reading an xls file in RStudio. Using the "gdata" package and the command "read.xls", I get the following error message:

"Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names"

Specifically, I'm trying to import an xls file (http://www.pcr.uu.se/digitalAssets/159/159834_1external_support_compact_dataset_1.00_20110325-1.xls) so that I can merge it with two other datasets into a single data frame.

I tried a few things and looked through online forums for possible solutions, but I couldn't fix it on my end. I'm assuming it's not a huge issue, but if someone has had a similar problem or knows how to fix this, I would be very happy!

Here's the link to the R Script file including the code:

https://github.com/KatrinHeger/CollaborativeResearchProject/blob/master/Importing_UCDP_ES.R

Thanks a lot!

Best,

Benedikt

LarsMehwald commented 8 years ago

Dear Benedikt, Daniel and I had a similar problem. It came from the .xls file not being cleanly formatted: it contained a "header" block with information about the file, not all variables were named, and there was some copyright info at the end of the file. We used the option "skip" to get rid of the header block including the variable names, "col.names" to name the variables manually, and "nrows" to limit the number of rows loaded (thereby dropping the additional info at the end of the document). That worked for us. Hope it helps, Lars

mr-r0b0t commented 8 years ago

Dear Lars,

Thank you very much for your quick reply and help! Did you use a specific package for those options, or are they already part of base R?

Thanks a lot!

Best,

Benedikt

LarsMehwald commented 8 years ago

Dear Benedikt, These options are part of the read.table command from the utils package (usually attached to R). Here is an example:

read.table(file = "", sep = "", na.strings = "", nrows = , skip = , header = FALSE, col.names = c(...))

Hope this helps, Lars
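To make the template above concrete, here is a minimal filled-in sketch. It assumes a small made-up comma-separated file with two lines of file information, a header row, three data rows, and a closing copyright line; the rows and the column names are hypothetical and are not taken from the UCDP dataset. The values for skip and nrows have to be counted from the actual file.

```r
# Hypothetical messy export (NOT the real UCDP data): two info lines,
# a header row, three data rows, and a copyright line at the end.
raw_lines <- c(
  "External support dataset (illustrative copy)",
  "Version 1.00",
  "id,country,year",
  "1,Angola,1990",
  "2,Chad,1991",
  "3,Sudan,1992",
  "Copyright notice: do not redistribute"
)

clean <- read.table(
  text = raw_lines,                         # read from the vector above instead of a file
  sep = ",",
  skip = 3,                                 # drop the two info lines and the original header row
  nrows = 3,                                # stop before the copyright line
  header = FALSE,                           # the header row was skipped, so none is left to read
  col.names = c("id", "country", "year"),   # name the variables manually
  stringsAsFactors = FALSE
)

print(clean)
```

As far as I know, gdata's read.xls forwards additional arguments to read.table, so the same skip, nrows, header and col.names options should also work in the original read.xls call; check ?read.xls to confirm for your installed version.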

mr-r0b0t commented 8 years ago

Thanks a lot and have a nice Saturday evening! :-)

Best,

Benedikt



christophergandrud commented 8 years ago

The rio package should also work well:

library(rio)

main <- import('159834_1external_support_compact_dataset_1.00_20110325-1.xls')