noamross / zero-dependency-problems

Real-world code problems in R, without dependencies

"deconstructing" error messages #7

Open dpastoor opened 9 years ago

dpastoor commented 9 years ago

One thing I've run into frequently is that newer users don't understand some of the basics of the messages R provides.

For example, given the following code (mirroring issue #1):

c(1, 2, 3, 4 5, 6, 7, 8)

and the resulting error:

Error: unexpected numeric constant in "c(1, 2, 3, 4 5"

People commonly get caught up trying to decipher what 'unexpected numeric constant' means, rather than first understanding that for most typographical errors, R prints the code up to the point where parsing failed. So before going crazy googling, just look at the code itself.

Below is a pretty solid Stack Overflow response on the matter:
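
To see that the message really does quote the code up to the offending token, one option is to parse the snippet without evaluating it and capture the message. This is only a sketch; the variable names `bad_code` and `parse_msg` are made up for illustration.

```r
# Parse (without evaluating) the snippet from the example above.
bad_code <- "c(1, 2, 3, 4 5, 6, 7, 8)"

parse_msg <- tryCatch(
  {
    parse(text = bad_code)
    "parsed without error"
  },
  # On a syntax error, keep the message text instead of stopping.
  error = function(e) conditionMessage(e)
)

# The message names the problem and echoes the code up to where parsing failed.
cat(parse_msg, "\n")
```

Reading the echoed code from right to left usually points straight at the typo, here the missing comma between 4 and 5.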

These errors mean that the R code you are trying to run or source is not syntactically correct. That is, you have a typo.

To fix the problem, read the error message carefully. The code provided in the error message shows where R thinks that the problem is. Find that line in your original code, and look for the typo.


Prophylactic measures to prevent you getting the error again

The best way to avoid syntactic errors is to write stylish code. That way, when you mistype things, the problem will be easier to spot. There are many R style guides linked from the SO R tag info page. You can also use the formatR package to automatically format your code into something more readable.

Consider using an IDE or text editor that highlights matching parentheses and braces, and shows strings and numbers in different colours.


Common syntactic mistakes that generate these errors

Mismatched parentheses, braces or brackets

If you have nested parentheses, braces or brackets it is very easy to close them one too many or too few times.

{}}
## Error: unexpected '}' in "{}}"
{{}} # OK

Missing * when doing multiplication

This is a common mistake by mathematicians.

5x
## Error: unexpected symbol in "5x"
5*x # OK

Not wrapping if, for, or return values in parentheses

This is a common mistake by MATLAB users. In R, if, for, return, etc., are functions, so you need to wrap their contents in parentheses.

if x > 0 {}
## Error: unexpected symbol in "if x"
if(x > 0) {} # OK

Not using multiple lines for code

Trying to write multiple expressions on a single line, without separating them by semicolons causes R to fail, as well as making your code harder to read.

x + 2 y * 3
## Error: unexpected symbol in "x + 2 y"
x + 2; y * 3 # OK

else starting on a new line

In an if-else statement, the keyword else must appear on the same line as the end of the if block.

if(TRUE) 1
else 2
## Error: unexpected 'else' in "else"    
if(TRUE) 1 else 2 # OK
if(TRUE) 
{
  1
} else            # also OK
{
  2
}
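
The same-line rule applies at the top level, where R treats `if(TRUE) 1` as a complete statement as soon as the line ends. Inside braces (for example in a function body) the parser keeps reading, so an `else` on its own line is accepted. A small check of this, where `parses_ok` is a hypothetical helper written just for this sketch:

```r
# Hypothetical helper: TRUE if `code` parses without a syntax error.
parses_ok <- function(code) {
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}

# `else` on a new line at the top level.
top_level <- "if (TRUE) 1\nelse 2"

# The same statement wrapped in braces.
in_braces <- "{\n  if (TRUE) 1\n  else 2\n}"

parses_ok(top_level)  # FALSE
parses_ok(in_braces)  # TRUE
```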

= instead of ==

= is used for assignment and giving values to function arguments. == tests two values for equality.

if(x = 0) {}
## Error: unexpected '=' in "if(x ="    
if(x == 0) {} # OK

Missing commas between arguments

When calling a function, each argument must be separated by a comma.

c(1 2)
## Error: unexpected numeric constant in "c(1 2"
c(1, 2) # OK

Not quoting file paths

File paths are just strings. They need to be wrapped in double or single quotes.

path.expand(~)
## Error: unexpected ')' in "path.expand(~)"
path.expand("~") # OK

Quotes inside strings

This is a common problem when trying to pass quoted values to the shell via system, or creating quoted xPath or sql queries.

Double quotes inside a double quoted string need to be escaped. Likewise, single quotes inside a single quoted string need to be escaped. Alternatively, you can use single quotes inside a double quoted string without escaping, and vice versa.

"x"y"
## Error: unexpected symbol in ""x"y"   
"x\"y" # OK
'x"y'  # OK  

Using curly quotes

So-called "smart" quotes are not so smart for R programming.

path.expand(“~”)
## Error: unexpected input in "path.expand(“"    
path.expand("~") # OK

Using non-standard variable names without backquotes

?make.names describes what constitutes a valid variable name. If you create a non-valid variable name (using assign, perhaps), then you need to access it with backquotes,

assign("x y", 0)
x y
## Error: unexpected symbol in "x y"
`x y` # OK

This also applies to column names in data frames created with check.names = FALSE.

dfr <- data.frame("x y" = 1:5, check.names = FALSE)
dfr$x y
## Error: unexpected symbol in "dfr$x y"
dfr[,"x y"] # OK
dfr$`x y`   # also OK

It also applies when passing operators and other special values to functions. For example, looking up help on %in%.

?%in%
## Error: unexpected SPECIAL in "?%in%"
?`%in%` # OK

Sourcing non-R code

The source function runs R code from a file. It will break if you try to use it to read in your data. Probably you want read.table.

source(textConnection("x y"))
## Error in source(textConnection("x y")) : 
##   textConnection("x y"):1:3: unexpected symbol
## 1: x y
##       ^

Corrupted RStudio desktop file

RStudio users have reported erroneous source errors due to a corrupted .rstudio-desktop file. These reports only occurred around March 2014, so it is possibly an issue with a specific version of the IDE. RStudio can be reset using the instructions on the support page.


Using expression without paste in mathematical plot annotations

When trying to create mathematical labels or titles in plots, the expression created must be a syntactically valid mathematical expression as described on the ?plotmath page. Otherwise the contents should be contained inside a call to paste.

plot(rnorm(10), ylab = expression(alpha ^ *))
## Error: unexpected '*' in "plot(rnorm(10), ylab = expression(alpha ^ *"
plot(rnorm(10), ylab = expression(paste(alpha ^ phantom(0), "*"))) # OK

from here

noamross commented 9 years ago

This is great, as "common R error messages" is an outstanding task for the SWC R repo and will go into this lesson. It should probably be combined in the same material as "Googling error messages."

This is only one error message, with many cases. I would take some out for brevity once the point gets across that it's almost always a syntax error.

What errors to add? Some common ones from here:

I would add:

> paste('hello', 'there',)
Error in paste("hello", "there", ) : argument is missing, with no default

And of course, using `traceback()`. Other debugging tools might be too much for a novice workshop.

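
A minimal sketch of what a `traceback()` demo could look like for novices. The functions `f_inner` and `f_outer` are hypothetical, invented only for this example. Since `traceback()` reports on the last uncaught error, it is shown as an interactive step in comments, with a non-interactive equivalent below it.

```r
# Hypothetical helpers, named only for this sketch.
f_inner <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  sqrt(x)
}
f_outer <- function(x) f_inner(x) + 1

# In an interactive session:
# f_outer("a")   # Error in f_inner(x) : x must be numeric
# traceback()    # prints the chain of calls that led to the error

# Non-interactively, the failing call can be read off the condition object.
err <- tryCatch(f_outer("a"), error = function(e) e)
conditionMessage(err)        # "x must be numeric"
deparse(conditionCall(err))  # "f_inner(x)"
```

The point for a workshop would be that the error came from `f_inner`, even though the user only typed a call to `f_outer`.
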
dpastoor commented 9 years ago

I don't know if it would be within scope, but an error that comes up a ton, especially when using Rmd/knitr, comes from a file not actually existing at the path R is given.

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'file.csv': No such file or directory

A lot of people get stopped at "cannot open the connection" and don't read the whole message through, or don't understand why the file 'doesn't exist' when in reality it is a path issue. I've found it is good to explicitly cover this error when introducing relative paths and/or knitting.
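
One way to make the path issue visible is to check for the file first and report the working directory in the error. This is a sketch, not a recommendation for production code; `read_csv_checked` is a hypothetical helper name.

```r
# Hypothetical helper: fail early with a message that says where R looked.
read_csv_checked <- function(path) {
  if (!file.exists(path)) {
    stop(
      "Cannot find '", path, "'. ",
      "Relative paths are resolved from the working directory: '", getwd(), "'.",
      call. = FALSE
    )
  }
  read.csv(path)
}

# Interactive use (commented out because 'file.csv' may not exist here):
# dat <- read_csv_checked("file.csv")
```

When knitting, the working directory is typically the folder containing the Rmd file rather than the project root, which is often why a path that works in the console fails in the document.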

dpastoor commented 9 years ago

woops just saw you had the no file or directory in your bulleted list

noamross commented 9 years ago

Another common one:

Error in [SOMETHING]: object of type 'closure' is not subsettable

Usually means you have a function where you should have some data. Probably a syntax error where the user put the name of the function rather than the name of the variable, or left out parentheses (e.g., 'function()').
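
A small reproduction, assuming a fresh session where no object called `df` has been created yet. In that case `df` refers to the function `stats::df` (the F distribution density), so subsetting it triggers the error.

```r
# With no data frame named `df` defined, `df` is a function from the stats package.
msg <- tryCatch(df$x, error = function(e) conditionMessage(e))
msg

# Once `df` refers to actual data, the same expression works.
df <- data.frame(x = 1:3)
df$x
```

The same thing happens with other names that double as functions, such as `data` or `mean`.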

Found in a StackOverflow search. To look at this empirically, one could use the SO API to look at the frequency of various error messages that people run into.

noamross commented 9 years ago

I did some empirical investigation as to what R errors are most commonly reported on Stack Overflow: https://github.com/noamross/zero-dependency-problems/blob/master/misc/stack-overflow-common-r-errors.md