RevolutionAnalytics / RRO

Revolution R Open
http://mran.revolutionanalytics.com/download/
GNU General Public License v2.0

lpSolve crashes on RRO due to MKL #218

Closed j-martens closed 9 years ago

j-martens commented 9 years ago

The function “lp” in the package “lpSolve” crashes on RRE (and possibly on RRO, to be confirmed) but not on CRAN R, and the cause is the MKL.

We need to find out what inside the package is causing the MKL to crash.

Here are the related emails:

From: Brian Pinkney

David,

Here is the information from RCurl and lpSolve.

lpSolve – Segmentation fault on Linux

If I run this simple example it will segfault on Linux systems. I tested on CentOS 6.5.

install.packages("lpSolve")
library(lpSolve)
f.obj <- c(1, 9, 3)
f.con <- matrix(c(1, 2, 3, 3, 2, 2), nrow=2, byrow=TRUE)
f.dir <- c("<=", "<=")
f.rhs <- c(9, 15)
lp("max", f.obj, f.con, f.dir, f.rhs)

 *** caught segfault ***
address 0x3, cause 'memory not mapped'

Traceback:
 1: .C("lpslink", direction = as.integer(direction), x.count = as.integer(x.count),
       objective = as.double(objective), const.count = as.integer(const.count),
       constraints = as.double(constraints), int.count = as.integer(int.count),
       int.vec = as.integer(int.vec), bin.count = as.integer(bin.count),
       binary.vec = as.integer(binary.vec), num.bin.solns = as.integer(num.bin.solns),
       objval = as.double(objval), solution = as.double(solution),
       presolve = as.integer(presolve), compute.sens = as.integer(compute.sens),
       sens.coef.from = as.double(sens.coef.from), sens.coef.to = as.double(sens.coef.to),
       duals = as.double(duals), duals.from = as.double(duals.from),
       duals.to = as.double(duals.to), scale = as.integer(scale),
       use.dense = as.integer(use.dense), dense.col = as.integer(dense.col),
       dense.val = as.double(dense.val), dense.const.nrow = as.integer(dense.const.nrow),
       dense.ctr = as.double(dense.ctr), use.rw = as.integer(use.rw),
       tmp = as.character(tmp), status = as.integer(status), PACKAGE = "lpSolve")

When I replace the Intel MKL libraries with those included in open source R (libRblas.so and libRlapack.so), the script runs as expected.

RCurl – Inconsistent Results between RRE and RStudio and RGui

RStudio Results

x <- getURL("http://microsoft.com", verbose=TRUE)

  • Rebuilt URL to: http://microsoft.com/
  • Hostname was NOT found in DNS cache
  • Trying 134.170.188.221...
  • Connected to microsoft.com (134.170.188.221) port 80 (#0)

> GET / HTTP/1.1
> Host: microsoft.com
> Accept: */*

< HTTP/1.1 301 Moved Permanently
< Content-Type: text/html; charset=UTF-8
< Location: http://www.microsoft.com/
< Server: Microsoft-IIS/8.5
< P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
< X-Powered-By: ASP.NET
< X-UA-Compatible: IE=EmulateIE7
< Date: Fri, 14 Aug 2015 15:50:06 GMT
< Connection: close
< Content-Length: 148
<

RevoIDE – 7.4

x <- getURL("http://microsoft.com", verbose=TRUE)

Note: no verbose output is printed here, so the information has to be read from x.

x
[1] "Document Moved\n

Object Moved

This document may be found <a HREF=\"http://www.microsoft.com/\">here"

RGui – 7.4

x <- getURL("http://www.microsoft.com", verbose=TRUE)

No results - needed to get them from x

x (Note – this appeared on a single line – not wrapping. I wrapped for legibility)

[1] "Microsoft Corporation<meta http-equiv=\"X-UA-Compatible\" content=\"IE=EmulateIE7\"><meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"><meta name=\"SearchTitle\" content=\"Microsoft.com\" scheme=\"\"><meta name=\"Description\" content=\"Get product information, support, and news from Microsoft.\" scheme=\"\"><meta name=\"Title\" content=\"Microsoft.com Home Page\" scheme=\"\"><meta name=\"Keywords\" content=\"Microsoft, product, support, help, training, Office, Windows, software, download, trial, preview, demo, business, security, update, free, computer, PC, server, search, download, install, news\" scheme=\"\"><meta name=\"SearchDescription\" content=\"Microsoft.com Homepage\" scheme=\"\">

Your current User-Agent string appears to be from an automated process, if this is incorrect, please click this link: <a href=\"http://www.microsoft.com/en/us/default.aspx?redir=true\">United States English Microsoft Homepage

\r\n"

From: David Smith

Could someone please send me the reproducible cases mentioned below?

This has been reported to the lpSolve maintainer and author. We have given them access to reproducible cases, but they have not made any effort toward resolving it, nor have they committed to doing so.

On Fri, Aug 14, 2015 at 12:27 AM -0700, "Richard Kittler" rkittler@microsoft.com wrote:

+David Smith

Rather than create a blacklist, I’d prefer that we not need one, by giving the authors of these anomalous packages sufficient incentive to fix the problem. David, could you take this on? As I recall there are a few other packages in this category as well. I’ll forward that info separately.

--Rich

From: Sam Kemp

To play Devil’s advocate: Should we stop MKL integration if we do not get 100% compatibility? We often talk about being 100% open source R not 99.99% open source R.

At the very least we should have a ‘black-list’ of packages, and some functionality in our software that checks against that list when a user executes the library function and warns about compatibility. In my case, I was developing on a Windows machine and hence was blind to the incompatibility with Linux. Had I known this before I wrote a script around lpSolve, I would have chosen another package.

From: Brian Pinkney

Sam,

This has been reported to the lpSolve maintainer and author. We have given them access to reproducible cases, but they have not made any effort toward resolving it, nor have they committed to doing so.

A workaround we have offered is to replace the MKL libraries with the open source ones. An alternative package is Rglpk.

Brian

Sam,

The fix has been applied on the server and once the RStudio session is restarted, the script works. However, when the same script is uploaded to DeployR and executed in the Test Console, it fails with the following error:

  • Console Error R session execution failed, rse=eval failed
  • API Error R session execution failed, rse=eval failed

Do we have to do anything specific to get DeployR to detect the changes?

Thanks,

From: Sam Kemp [mailto:samkemp@microsoft.com]

You can use the following Linux commands (if you have sudo privileges) to find the location:

find / -name libRblas.so
find / -name libRlapack.so
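The two searches can also be combined into one pass. This is only a sketch: `find_r_blas_libs` is a hypothetical helper name, not something shipped with RRE, and the search root is a parameter so the scan can be narrowed to a known install prefix instead of walking the whole filesystem.

```shell
#!/bin/sh
# Hypothetical helper: list every libRblas.so and libRlapack.so under a root.
# With no argument it searches from /, like the commands in the email.
find_r_blas_libs() {
    search_root=${1:-/}
    # Permission errors are discarded; run with sudo to search protected directories.
    find "$search_root" \( -name libRblas.so -o -name libRlapack.so \) 2>/dev/null
}
```

For example, `find_r_blas_libs /usr/lib64` would limit the search to that tree.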

From: Sam Kemp [mailto:samkemp@microsoft.com]

Hi all,

Support have come back and agreed to this workaround. The repercussion of this workaround is that open source R functions using linear algebra will not benefit from multithreading. Rx functions will continue to be multithreaded.

Internally, we have raised a JIRA to make the lpSolve package compatible with the Linux MKL (n.b. this problem does not occur on Windows). Therefore, this workaround will not be required in future releases of RRE.

Many thanks,

Sam

From: Sam Kemp

Hi All,

The lpSolve package is incompatible with the Intel Math Kernel Library (MKL) – there is a workaround (outlined below), but please do not implement this until I have confirmed with support that we are ‘ok’ to proceed with this.

The workaround is as follows:

  1. Find libRblas.so and libRlapack.so, rename them with “.so.MKL” extensions. If you installed Revolution using the default settings then these files are located in /usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib
  2. Take the attached .so files and place them in /usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib
  3. Restart your R session
  4. Try the code

This resolved the problem on my Linux VM.
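Steps 1 and 2 above can be sketched as a small shell function. This is an illustration under assumptions, not a supported script: `swap_blas_libs` is a hypothetical name, and the second argument stands for wherever the attached replacement .so files were saved, which the email does not specify.

```shell
#!/bin/sh
# Hypothetical helper: back up the MKL-linked libraries with a ".MKL" suffix
# and copy the replacement builds into the R lib directory.
swap_blas_libs() {
    lib_dir=$1          # e.g. the R lib directory named in step 1
    replacement_dir=$2  # assumed location of the attached .so files
    # Check everything first so a missing file cannot leave a half-done swap.
    for lib in libRblas.so libRlapack.so; do
        [ -f "$lib_dir/$lib" ] || { echo "missing $lib_dir/$lib" >&2; return 1; }
        [ -f "$replacement_dir/$lib" ] || { echo "missing $replacement_dir/$lib" >&2; return 1; }
    done
    for lib in libRblas.so libRlapack.so; do
        mv "$lib_dir/$lib" "$lib_dir/$lib.MKL"      # libRblas.so -> libRblas.so.MKL
        cp "$replacement_dir/$lib" "$lib_dir/$lib"  # drop in the replacement
    done
}
```

With the default install this might be called as `swap_blas_libs /usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib /path/to/attached/libs` (the second path is a placeholder), followed by restarting the R session. Keeping the `.so.MKL` copies means the change can be undone by moving them back.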

Once I have got the all clear from support, I will let you know.

Many thanks,

Sam

Sent: 17 July 2015 12:16

Sam,

Looks like the R session crashes there too. Get a Segmentation Fault at the same place in code.

Segmentation fault.
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/RevoScaleR/rxLibs/x64/libExaCore.so.2(Z21CriticalSignalHandleri+0x1e)[0x7fbab431423e]
/lib64/libc.so.6[0x345a6326a0]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libRblas.so(idamax+0x4)[0x7fbac53c0544]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(initialize_solution+0x21a)[0x7fbaaf5db3ea]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(recompute_solution+0x18)[0x7fbaaf5db6f8]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(invert+0x396)[0x7fbaaf5effb6]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(spx_run+0x1c0)[0x7fbaaf61b6a0]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(solve_LP+0xcd)[0x7fbaaf5f6bcd]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(solve_BB+0x5b)[0x7fbaaf5f71cb]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(run_BB+0xac)[0x7fbaaf5f769c]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(spx_solve+0x4d6)[0x7fbaaf617786]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(lin_solve+0xf8)[0x7fbaaf618528]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/library/lpSolve/libs/lpSolve.so(lp_transbig+0x328)[0x7fbaaf61ef38]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(+0x9857f)[0x7fbac7af157f]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(Rf_eval+0x871)[0x7fbac7b2ee91]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(+0xd9727)[0x7fbac7b32727]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(Rf_eval+0x65d)[0x7fbac7b2ec7d]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(+0xd995a)[0x7fbac7b3295a]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(Rf_eval+0x65d)[0x7fbac7b2ec7d]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(Rf_applyClosure+0x45f)[0x7fbac7b20def]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(Rf_eval+0x2f5)[0x7fbac7b2e915]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(+0xd9727)[0x7fbac7b32727]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(Rf_eval+0x65d)[0x7fbac7b2ec7d]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(Rf_ReplIteration+0x212)[0x7fbac7b576a2]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(+0xfea59)[0x7fbac7b57a59]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/lib/libR.so(run_Rmainloop+0x44)[0x7fbac7b57f64]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/bin/exec/R(main+0x1b)[0x40084b]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x345a61ed5d]
/usr/lib64/Revo-7.3/R-3.1.1/lib64/R/bin/exec/R[0x400739]

From: Sam Kemp [mailto:samkemp@microsoft.com]

Can you putty into the RevR server and type at the command prompt

Revo64

this will launch the CLI for RRE.

Then copy-and-paste your code into the CLI. Do you get an error?

We need to ascertain if the error is on the R engine or RStudio/DeployR test facility. I have seen this error before and it has been on the RStudio server side.
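One way to make that check repeatable is to run the script through a command-line interpreter and look at how the process ended. The sketch below is hedged throughout: `run_outside_ide` is a hypothetical helper, and it assumes the launcher accepts a script path as its first argument (stock `Rscript` does; this is not confirmed here for `Revo64`). It relies on bash and dash reporting a process killed by signal N as exit status 128+N.

```shell
#!/bin/sh
# Hypothetical helper: run a script with a command-line interpreter, outside
# RStudio Server and DeployR, and summarise how the process ended.
run_outside_ide() {
    interpreter=$1   # e.g. Rscript; the launcher name is an assumption
    script=$2
    status=0
    "$interpreter" "$script" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "exited normally"
    elif [ "$status" -gt 128 ]; then
        # bash and dash encode "killed by signal N" as exit status 128 + N
        echo "killed by signal $((status - 128))"
    else
        echo "failed with exit status $status"
    fi
    return "$status"
}
```

For instance, `run_outside_ide Rscript repro.R` (where `repro.R` is a placeholder for the failing script) printing "killed by signal 11" would point at the R engine or a loaded library rather than at the RStudio or DeployR layer.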

Hi Sam,

I am working with Tom to get the Engineer Optimisation POC for which you wrote a script and passed on to Tom – when we run the script in R Studio Server Pro we consistently get an error that prevents the script running (the R session gets abended).

I have also tried running it in the Test Console in DeployR and the same thing happens at the same stage.

Do you have time to discuss today? If so, I can call you – screenshots of the error etc. below.

Cheers,

Hi All,

Hoping for a little help. For those of you who are unaware, Sam from RevolutionR helped convert a SAS optimisation problem we had so that it runs in R. However, the R script is unable to run and I get the error below. I have also taken the example of how to use lpSolve from the web/package and get the same issue; this too is below:

Example code:

amuels[3,4] <- 7; costs[1,3] <- costs[2,4] <- 7.7
costs[5,1] <- costs[7,3] <- 8; costs[1,4] <- 8.4; costs[6,2] <- 9
costs[8,4] <- 10; costs[4,2:4] <- c(.7, 1.4, 2.1)

# Set up constraint signs and right-hand sides.

costs
row.signs <- rep ("<", 8)
row.rhs <- c(200, 300, 350, 200, 100, 50, 100, 150)
col.signs <- rep (">", 5)
col.rhs <- c(250, 100, 400, 500, 200)

row.signs

# Run

lp.transport (costs, "min", row.signs, row.rhs, col.signs, col.rhs)

## Not run: Success: the objective function is 7790
## End(Not run)

lp.transport (costs, "min", row.signs, row.rhs, col.signs, col.rhs)$solution

## Not run:

     [,1] [,2] [,3] [,4] [,5]
[1,]    0  100    0  100    0
[2,]    0    0  300    0    0
[3,]    0    0    0  350    0
[4,]  200    0    0    0    0
[5,]   50    0    0    0   50
[6,]    0    0    0    0   50
[7,]    0    0  100    0    0
[8,]    0    0    0   50  100

## End(Not run)

Error:

16 Jul 2015 15:25:49 [rsession-bensont2] ERROR session hadabend; LOGGED FROM: core::Error::rInit(const r::session::RInitInfo&) /root/rstudio-pro/src/cpp/session/SessionMain.cpp:1710
Checking rgeos availability: TRUE


j-martens commented 9 years ago

From Brian Pinkney: Actually, the package maintainer got a new version out last month. We notified those impacted so they could pick up the latest version. We tested in house and it appears to have fixed the issues.