r-lib / styler

Non-invasive pretty printing of R code
https://styler.r-lib.org
Other
727 stars 71 forks source link

Trailing quotations are dropped after accents #383

Closed aedobbyn closed 6 years ago

aedobbyn commented 6 years ago

First off, this package is fantastic, so thank you!

Noticed this bug when running style_pkg() today but reproduced using style_file() on the following text. This is the smallest example I could come up with that still reproduces the error. (Sorry no reprex; wasn't sure if it's possible to use style_file() with reprex::reprex().)

text1 <- "Hauràs de dirigir-te al punt de trobada del grup al que et vulguis unir."
text2 <- "i want to buy an iphone"
text3 <- "Je déteste ne plus avoir de dentifrice."

text <- "A panel of Goldman Sachs employees spent a recent Tuesday night at the Columbia University 
faculty club trying to convince a packed room of potential recruits that Wall Street, not Silicon Valley, 
was the place to be for computer scientists.\n\n The Goldman employees knew they had an uphill battle. 
They were fighting against perceptions of Wall Street as boring and regulation-bound and Silicon Valley as 
the promised land of flip-flops, beanbag chairs and million-dollar stock options.\n\n Their argument to the 
room of technologically inclined students was that Wall Street was where they could find far more challenging,
diverse and, yes, lucrative jobs working on some of the worlds most difficult technical problems.\n\n 
Whereas in other opportunities you might be considering, it is working one type of data or one type of 
application, we deal in hundreds of products in hundreds of markets, with thousands or tens of thousands of 
clients, every day, millions of times of day worldwide, Afsheen Afshar, a managing director at Goldman Sachs,
told the students."

The trailing quotation marks at the end of the line with text1 and text3 (the two texts that contain accents) are removed leaving:

text1 <- "Hauràs de dirigir-te al punt de trobada del grup al que et vulguis unir.
text2 <- "i want to buy an iphone"
text3 <- "Je déteste ne plus avoir de dentifrice.

text <- "A panel of Goldman Sachs employees spent a recent Tuesday night at the Columbia University 
faculty club trying to convince a packed room of potential recruits that Wall Street, not Silicon Valley, 
was the place to be for computer scientists.\n\n The Goldman employees knew they had an uphill battle. 
They were fighting against perceptions of Wall Street as boring and regulation-bound and Silicon Valley as 
the promised land of flip-flops, beanbag chairs and million-dollar stock options.\n\n Their argument to the 
room of technologically inclined students was that Wall Street was where they could find far more challenging,
diverse and, yes, lucrative jobs working on some of the worlds most difficult technical problems.\n\n 
Whereas in other opportunities you might be considering, it is working one type of data or one type of 
application, we deal in hundreds of products in hundreds of markets, with thousands or tens of thousands of 
clients, every day, millions of times of day worldwide, Afsheen Afshar, a managing director at Goldman Sachs,
told the students."

The same removal of the trailing quotations happens if those accents are replaced with others, like e.g., umlauts.

I'm running the 1.0.1 development version of the package, r-lib/styler@5733a83.

lorenzwalthert commented 6 years ago

Thanks @aedobbyn for raising this. So you could also use style_text() if you wanted to create a reprex. Does the same behaviour occur if you use style_text()? I was not able to reproduce this behaviour with the exact same version of styler with both style_text() and style_file(). There are two problems that may be related to your issue:

You mentioned that this is the smallest example where the error still persists so I may have missed that your answer to both questions above is no. If so, can you just confirm that? What R version and operating system are you using?

aedobbyn commented 6 years ago

Yes, I should have mentioned this but thought I was doing something wrong (and maybe I am). I originally tried to create a reprex with style_text() but got a different error below.

styler::style_text({
  text1 <- "Hauràs de dirigir-te al punt de trobada del grup al que et vulguis unir."
  text2 <- "i want to buy an iphone"
  text3 <- "Je déteste ne plus avoir de dentifrice."

  text <- "A panel of Goldman Sachs employees spent a recent Tuesday night at the Columbia University 
  faculty club trying to convince a packed room of potential recruits that Wall Street, not Silicon Valley, 
  was the place to be for computer scientists.\n\n The Goldman employees knew they had an uphill battle. 
  They were fighting against perceptions of Wall Street as boring and regulation-bound and Silicon Valley as 
  the promised land of flip-flops, beanbag chairs and million-dollar stock options.\n\n Their argument to the 
  room of technologically inclined students was that Wall Street was where they could find far more challenging,
  diverse and, yes, lucrative jobs working on some of the worlds most difficult technical problems.\n\n 
  Whereas in other opportunities you might be considering, it is working one type of data or one type of 
  application, we deal in hundreds of products in hundreds of markets, with thousands or tens of thousands of 
  clients, every day, millions of times of day worldwide, Afsheen Afshar, a managing director at Goldman Sachs,
  told the students."
})
#> Error in parse(text = text, keep.source = TRUE): <text>:1:3: unexpected symbol
#> 1: A panel
#>       ^

devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.3.3 (2017-03-06)
#>  system   x86_64, darwin13.4.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/Chicago             
#>  date     2018-03-25
#> Packages -----------------------------------------------------------------
#>  package   * version    date       source                          
#>  backports   1.1.2      2017-12-13 cran (@1.1.2)                   
#>  base      * 3.3.3      2017-03-07 local                           
#>  datasets  * 3.3.3      2017-03-07 local                           
#>  devtools    1.13.4     2017-11-09 cran (@1.13.4)                  
#>  digest      0.6.13     2017-12-14 cran (@0.6.13)                  
#>  evaluate    0.10.1     2017-06-24 CRAN (R 3.3.3)                  
#>  graphics  * 3.3.3      2017-03-07 local                           
#>  grDevices * 3.3.3      2017-03-07 local                           
#>  htmltools   0.3.6      2017-04-28 CRAN (R 3.3.3)                  
#>  knitr       1.18       2017-12-27 cran (@1.18)                    
#>  magrittr    1.5        2014-11-22 CRAN (R 3.3.3)                  
#>  memoise     1.1.0      2018-03-24 Github (hadley/memoise@06d16ec) 
#>  methods   * 3.3.3      2017-03-07 local                           
#>  pillar      1.1.0.9000 2018-02-13 Github (hadley/pillar@595d1ac)  
#>  purrr       0.2.4      2017-10-18 CRAN (R 3.3.2)                  
#>  Rcpp        0.12.14    2017-11-23 cran (@0.12.14)                 
#>  rlang       0.2.0.9001 2018-03-26 Github (r-lib/rlang@49d7a34)    
#>  rmarkdown   1.8        2017-11-17 cran (@1.8)                     
#>  rprojroot   1.3-2      2018-01-03 cran (@1.3-2)                   
#>  stats     * 3.3.3      2017-03-07 local                           
#>  stringi     1.1.6      2017-11-17 CRAN (R 3.3.2)                  
#>  stringr     1.2.0      2017-02-18 CRAN (R 3.3.3)                  
#>  styler      1.0.1      2018-03-25 Github (r-lib/styler@5733a83)   
#>  tibble      1.4.2      2018-01-22 CRAN (R 3.3.3)                  
#>  tools       3.3.3      2017-03-07 local                           
#>  utils     * 3.3.3      2017-03-07 local                           
#>  withr       2.1.2      2018-03-24 Github (jimhester/withr@79d7b0d)
#>  yaml        2.1.16     2017-12-12 cran (@2.1.16)

To your question about whether the problem persists when you remove all accented characters: no it does not; the behavior is as expected.

Re: the string length, yes, I should have been more explicit. The length of the the long string is necessary in order to reproduce the example. When it's shortened by even one line the problem goes away.

I'm on MacOS 10.13. R version info below.

R.version
#>                _                           
#> platform       x86_64-apple-darwin13.4.0   
#> arch           x86_64                      
#> os             darwin13.4.0                
#> system         x86_64, darwin13.4.0        
#> status                                     
#> major          3                           
#> minor          3.3                         
#> year           2017                        
#> month          03                          
#> day            06                          
#> svn rev        72310                       
#> language       R                           
#> version.string R version 3.3.3 (2017-03-06)
#> nickname       Another Canoe
lorenzwalthert commented 6 years ago

Ok, you got the error with style_text() because what you inserted above is not code like a <- "hi", but just "hi". So you need double quotes. style_text("'my name is Lorenz'").

aedobbyn commented 6 years ago

Ahh okay. That did not repro the issue, unfortunately.

styler::style_text({'
  text1 <- "Hàuras de dirigir-te al punt de trobada del grup al que et vulguis unir."
  text2 <- "i want to buy an iphone"
  text3 <- "Je déteste ne plus avoir de dentifrice."

  text <- "A panel of Goldman Sachs employees spent a recent Tuesday night at the Columbia University 
  faculty club trying to convince a packed room of potential recruits that Wall Street, not Silicon Valley, 
  was the place to be for computer scientists.\n\n The Goldman employees knew they had an uphill battle. 
  They were fighting against perceptions of Wall Street as boring and regulation-bound and Silicon Valley as 
  the promised land of flip-flops, beanbag chairs and million-dollar stock options.\n\n Their argument to the 
  room of technologically inclined students was that Wall Street was where they could find far more challenging,
  diverse and, yes, lucrative jobs working on some of the worlds most difficult technical problems.\n\n 
  Whereas in other opportunities you might be considering, it is working one type of data or one type of 
  application, we deal in hundreds of products in hundreds of markets, with thousands or tens of thousands of 
  clients, every day, millions of times of day worldwide, Afsheen Afshar, a managing director at Goldman Sachs,
  told the students."'
})
#> 
#> text1 <- "Hàuras de dirigir-te al punt de trobada del grup al que et vulguis unir."
#> text2 <- "i want to buy an iphone"
#> text3 <- "Je déteste ne plus avoir de dentifrice."
#> 
#> text <- "A panel of Goldman Sachs employees spent a recent Tuesday night at the Columbia University 
#>   faculty club trying to convince a packed room of potential recruits that Wall Street, not Silicon Valley, 
#>   was the place to be for computer scientists.
#> 
#>  The Goldman employees knew they had an uphill battle. 
#>   They were fighting against perceptions of Wall Street as boring and regulation-bound and Silicon Valley as 
#>   the promised land of flip-flops, beanbag chairs and million-dollar stock options.
#> 
#>  Their argument to the 
#>   room of technologically inclined students was that Wall Street was where they could find far more challenging,
#>   diverse and, yes, lucrative jobs working on some of the worlds most difficult technical problems.
#> 
#>  
#>   Whereas in other opportunities you might be considering, it is working one type of data or one type of 
#>   application, we deal in hundreds of products in hundreds of markets, with thousands or tens of thousands of 
#>   clients, every day, millions of times of day worldwide, Afsheen Afshar, a managing director at Goldman Sachs,
#>   told the students."
lorenzwalthert commented 6 years ago

Ok. But the styling still fails with style_file() right?

aedobbyn commented 6 years ago

Yes, it does

lorenzwalthert commented 6 years ago

If you are using RStudio: can you save the file with utf8 encoding? That is, go to File -> Save with Encoding -> UTF8? And tell me whether it still fails? Likely, this is already your default encoding since you are on Mac. I am also on mac but I have a later R version and possibly a different version of macOS (High sierra).

aedobbyn commented 6 years ago

Yeah, UTF8 was the default. Saved it again with that just to make sure and tried styler::style_file() again. Same behavior as before of dropping the trailing quotations on the accented lines.

lorenzwalthert commented 6 years ago

We provided a related fix in r-lib/styler#384. Can you check whether the problem occurs with that version? Install relevant branch

remotes::install_github("r-lib/styler#384")

before restarting R(Studio) and then re-style.

aedobbyn commented 6 years ago

Yep, that worked! The quotations were not removed at the end of the lines with accents.

Tested with the master branch version just to make sure and the problem occurred as usual (trailing quotations were removed).

lorenzwalthert commented 6 years ago

Awesome 😊. Will close this when merging #384. Still some minor problems to fix is the there for R 3.1.

aedobbyn commented 6 years ago

Great! Thanks for suggesting the solution. Happy to test again once you've merged to make sure.