marcschwartz / WriteXLS

CRAN Package WriteXLS: Cross-platform Perl based R function to create Excel 2003 (XLS) and Excel 2007 (XLSX) files from one or more data frames. Each data frame will be written to a separate named worksheet in the Excel spreadsheet. The worksheet name will be the name of the data frame it contains or can be specified by the user.
GNU General Public License v2.0
20 stars 9 forks source link

error when writing a file in Windows #23

Closed kleinbub closed 2 months ago

kleinbub commented 2 months ago

Hi, I'm using Windows 10 R version 4.4.1 (2024-06-14 ucrt) WriteXLS 6.6.0

WriteXLS::testPerl() runs fine and reports that all modules were found. yet when attempting to write any file i get this error:

> WriteXLS::WriteXLS(iris, "test.xlsx", verbose=T)
Creating Temporary Directory for CSV Files:  C:\Users\Kleinbub\AppData\Local\Temp\RtmpuO4911/WriteXLS 

Creating CSV File: 1.csv
Creating SheetNames.txt

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LC_ALL = (unset),
    LANG = "en"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
. No such file or directory\Kleinbub\AppData\Local\Temp\RtmpuO4911/WriteXLS/1.csv
Creating Excel File: C:\Users\Kleinbub\Documents\test.xlsx

Reading: C:\Users\Kleinbub\AppData\Local\Temp\RtmpuO4911/WriteXLS/1.csv
The Perl script 'WriteXLSX.pl' failed to run successfully.
Cleaning Up Temporary Files and Directory

This issue seems to be the same as the one reported here: https://stackoverflow.com/questions/75749758/r-writexls-doesnt-work-on-newer-r-version

marcschwartz commented 2 months ago

Hi,

Thanks for your report and the pointer to the other report on SO, which I was not aware of as it was never reported to me and I am not active on SO. I should note that I have not received any other reports of these issues running the recent versions of WriteXLS on the recent versions of R.

I do not have access to a native Windows PC, but am running Windows 11 in a Parallels VM on a MacBook Pro with macOS Sonoma, as the latter is where I am typically working.

I did a clean install of R 4.4.1 under Windows 11 using the default installation settings, along with the current Strawberry Perl installation using the MSI file from https://strawberryperl.com, and the current WriteXLS.

Here is the output from the R session, using the regular R Windows GUI (note that I clipped the row output below to shorten it between rows 10 and 140):

R version 4.4.1 (2024-06-14 ucrt) -- "Race for Your Life"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(WriteXLS)
> testPerl()
A system perl installation found in C:\STRAWB~1\perl\bin\perl.exe

The perl modules included with WriteXLS are located in C:/Users/marcschwartz/AppData/Local/R/win-library/4.4/WriteXLS/Perl

All required Perl modules were found.

> WriteXLS(iris, "test.xlsx", verbose = TRUE)
Creating Temporary Directory for CSV Files:  C:\Users\MARCSC~1\AppData\Local\Temp\Rtmp0wkiBh/WriteXLS 

Creating CSV File: 1.csv
Creating SheetNames.txt

Creating Excel File: \\Mac\Home\Documents\test.xlsx

Reading: C:\Users\MARCSC~1\AppData\Local\Temp\Rtmp0wkiBh/WriteXLS/1.csv
Creating New WorkSheet: iris

Parsing CSV File Row: 0
Parsing CSV File Row: 1
Parsing CSV File Row: 1
Parsing CSV File Row: 2
Parsing CSV File Row: 3
Parsing CSV File Row: 4
Parsing CSV File Row: 5
Parsing CSV File Row: 6
Parsing CSV File Row: 7
Parsing CSV File Row: 8
Parsing CSV File Row: 9
Parsing CSV File Row: 10
...
Parsing CSV File Row: 140
Parsing CSV File Row: 141
Parsing CSV File Row: 142
Parsing CSV File Row: 143
Parsing CSV File Row: 144
Parsing CSV File Row: 145
Parsing CSV File Row: 146
Parsing CSV File Row: 147
Parsing CSV File Row: 148
Parsing CSV File Row: 149
Parsing CSV File Row: 150
Cleaning Up Temporary Files and Directory

So, WriteXLS appears to work fine in this setting, which is also consistent with the CRAN checks for the WriteXLS package, across multiple operating systems and R versions, where the examples in ?WriteXLS are run:

https://cran.r-project.org/web/checks/check_results_WriteXLS.html

as well as having run the package through the R win-builder web site (https://win-builder.r-project.org) prior to CRAN submission, for testing WriteXLS for both R release and R devel on Windows specifically.

The one limitation of my testing here on my Mac is the possibility of something being different, either because I am running Windows 11 versus 10, I am using a VM, and/or given the information in the SO post, I am not using any RStudio/Posit/BioC tools, that may be adding some confounding here.

Are you running R within RStudio? If so, can you try running R from the regular R Windows GUI to see if the issue happens there? I would like to confirm that RStudio either is a common factor for you and the SO post, or to eliminate it as a source of problems.

One of the other issues is that your output, but not the SO post output, contains messages regarding the inability to set the locale, presumably within Perl. That might be a red herring, but may indicate some kind of an issue with your Perl installation.

Thus, what Perl installation did you use and which version is it?

Lastly, you did not mention if WriteXLS had been working for you under a prior version of R, or if this is your first attempt at use. The SO post suggested that something between R versions 4.0.2 and 4.2.2 had changed. That may be the R version, or the manner in which the SO user performed the upgrade, including upgrading 3rd party tools like Perl, that could have resulted in a local conflict for them.

Thanks!

kleinbub commented 2 months ago

Dear Carl, thank you for getting back on this issue. I tested on a second windows machine, using a clean .RProfile and running R from command line, and got the same issue. I am using perl 5, version 38, subversion 2 (v5.38.2) built for x86_64-msys-thread-multi which comes preinstalled with RTools 4.4 I singled out the problem at line 261 of WriteXLS.R " --CSVPath ", shQuote(Tmp.Dir), If I remove the shQuote() call and feed to perl the raw Tmp.Dir, it works.

Yet I don't have other platforms to test if this change is breaking.

kleinbub commented 2 months ago

That seemed to work on my office machine, but not on the current one. I'll do more tests and get back to you.

(PS. this is the first time me trying to use WriteXLS)

marcschwartz commented 2 months ago

Hi,

Thanks for the additional information.

It would appear that the source of the issue is the Perl distribution, or perhaps one or more of the additional Perl libraries, that are included in the MSYS2 Perl related packages that are part of RTools 4.4. There is some kind of incompatibility present that is causing the problems in finding the path to the temporary directory, causing the function to fail.

I installed RTools 4.4 x86_64 in my Windows 11 VM, which provides Intel emulation, even though I am running on Apple Silicon (ARM 64).

If I run the test code using the Perl binary that is installed with RTools, I get the following:

> testPerl()
A system perl installation found in C:\rtools44\usr\bin\perl.exe

The perl modules included with WriteXLS are located in C:/Users/marcschwartz/AppData/Local/R/win-library/4.4/WriteXLS/Perl

All required Perl modules were found.

> WriteXLS(iris, "test.xlsx")
ERROR: cannot open C:\Users\MARCSC~1\AppData\Local\Temp\RtmpmKwGGX/WriteXLS/1.csv
. No such file or directory
The Perl script 'WriteXLSX.pl' failed to run successfully.

If I then explicitly point WriteXLS() to the Strawberry Perl installation, I get:

> WriteXLS(iris, "test.xlsx", perl = "C:\\Strawberry\\perl\\bin\\perl")
> 

or using '/' instead of '\\':

> WriteXLS(iris, "test.xlsx", perl = "C:/Strawberry/perl/bin/perl")
> 

and it creates Test.xlsx just fine.

Removing the shQuote() from the code for Tmp.Dir where 'cmd' is created before the system() call, did not resolve the problem for me, as it may have done for you. That is not a surprise, as shQuote() has been stable across OSs and is otherwise benign. Even explicitly using type = "cmd" or type = "cmd2" in the shQuote() call did not resolve the problem.

The reason for using shQuote() is that the file paths and/or names can contain embedded spaces in them, which would then corrupt the arguments passed to the Perl binary on the CLI using system(). Thus, shQuote() surrounds the various arguments with quotes as apropos for the OS and obviates the potential impact of embedded spaces.

If you are not building either R itself, or R package binaries, from source code, such that you really need RTools, I would honestly remove RTools and use a standard Perl distribution instead. Those are really the only reasons to need RTools.

If you are willing to install the Strawberry Perl distribution and use that, I would hope that it would resolve the issue for you as well, and if so, that would be helpful to confirm that the problem is likely in the Perl distribution from MSYS2.

If confirmed, then there is some kind of compatibility issue with the MSYS2 Perl distribution, presumably not in RTools itself, that would take additional time to track down, and then file a bug report with them to resolve it on their end. That is not something that I can do near term, given other commitments.

Thanks,

Marc

marcschwartz commented 2 months ago

Hi,

I have some follow up information thanks to a reply from Ivan Krylov on the r-package-devel list today, in response to a post from me earlier today on this topic.

Overnight, it occurred to me that one of the devs that has a more intimate knowledge of RTools might have some better intuition on the Perl incompatibility issue, and therefore could provide some more expedient guidance here, rather than me going down the rabbit hole of trying to trace the problem in a vacuum.

That presumption was thankfully correct, and it appears that the key is that MSYS2 is based upon Cygwin, thus it attempts to mimic Linux/Unix behavior under Windows. The relevant issue here is that, while R on Windows uses CR-LF sequences for line endings by default in the temporary text files that are output via WriteXLS, the Perl in MSYS2 emulates Linux/Unix behavior, thus uses LF only line endings.

As a result, when using the Perl in MYSYS2/RTools, there is a conflict in the file line endings that corrupts the incoming values from the temporary text files that are created in WriteXLS, and that causes the Perl script to fail.

So the root cause difference appears to be in the PerlIO (Perl Input/Output) layers between the Cygwin version of Perl in MSYS2/RTools and native Windows versions of Perl.

Strawberry Perl, which is natively created for Windows, along with other Perl distributions that have native Windows versions like Active State, use the standard CR-LF line ending sequences, and that is why those Perl distributions work.

Ivan suggested that I add some code to the Perl scripts in WriteXLS to handle the special line ending encoding case when MYSYS2 is in use, which would likely, and apparently, rarely, only be the case when RTools is installed. I say rarely, as yours is the first report of this issue over the years since R 4.x.y was introduced, and where RTools apparently began to include the MSYS2 version of Perl circa 2021/2022. A basic version of Perl (Vanilla Perl) had been part of RTools years ago, until circa R 2.x.y, when it was removed as it was no longer needed.

I also now wonder if that prior SO post that you initially pointed to is related to this same issue, as that person was presumably also using RTools at the time. If so, it may point to some change in RTools between R version 4.0.2 and R version 4.2.0, since the older 4.0.2 version worked for them, but failed under 4.2.2.

I will work on testing an updated version of WriteXLS with the special case handling for MSYS2/RTools to see if the proposed fix works and, if so, do some testing to be sure that it does not cause other problems.

I do not have a time frame for that yet, so in the mean time, the use of Strawberry Perl will be a workaround, if you are willing to install it.

I appreciate your raising this issue here.

Thanks,

Marc

marcschwartz commented 2 months ago

Hi,

I unexpectedly had a window of time this morning to work on this and test a fix for the issue.

The modification to the Perl scripts to adjust the LF versus CRLF encoding in the special case where the RTools (MYSYS2) Perl distribution is being used under Windows appears to work and resolve the issue. I can now run all of the examples in ?WriteXLS when using the RTools Perl distribution without errors.

I also then removed RTools to be sure that the updated scripts would still run with Strawberry Perl on Windows and that also seems to be OK. Lastly, I also tested the updated package under R on macOS, and that appears to be unaffected by the Windows/RTools specific change.

Thus, I have submitted what will become WriteXLS version 6.7.0 to CRAN, and that should show up in the next couple days, as the package is built across the OSs and finds its way to the CRAN mirrors.

I am attaching the Windows ZIP file here, so that in the mean time, if you wish to install and test the updated package as a local ZIP file, you can do so. If you do, please confirm whether or not the fix works for you as well.

Again, thanks for raising this issue.

Regards,

Marc

WriteXLS_6.7.0.zip

kleinbub commented 2 months ago

Dear Marc, thank you so much for the detailed analysis and quick fix! I just tested WriteXLS::WriteXLS(iris, "test.xls", verbose=T) using CRAN version 6.7 and RTools perl, and it works perfectly. As I need to keep RTools installed for package development, this is the most convenient solution but I'll test it out with strawberry perl too, as soon as I find the time. Thank you once more, and kudos.