fslaborg / RProvider

Access R packages from F#
http://fslab.org/RProvider/

IPC port error in RProvider ctor #139

Closed forki closed 9 years ago

forki commented 9 years ago

I'm trying to run https://sergeytihon.wordpress.com/2013/11/18/f-neural-networks-with-rprovider-deedle/ (in FsLab Journal setting) and get

"failed to read from IPC port: pipe was terminated"

/cc @sergey-tihon @tpetricek

tpetricek commented 9 years ago

What versions of FSharp.Core and Visual Studio are on the machines?

tpetricek commented 9 years ago

Also, if you have more time to play with this (to build R provider from source), then you can enable logging here: https://github.com/BlueMountainCapital/FSharpRProvider/blob/master/src/RProvider/Logging.fs#L11

... if you could then share the log file, that would be amazing!

forki commented 9 years ago

VS2013 Update 4 and basically all FSharp.Core versions ever. Including self built.

tpetricek commented 9 years ago

(In the next version, it'll be possible to enable logging just by setting environment variable...)

tpetricek commented 9 years ago

Do you have any sort of firewall that might be blocking communication between processes?

forki commented 9 years ago

just normal windows defender - no special settings

forki commented 9 years ago

RStudio is working fine.

tpetricek commented 9 years ago

I put a new alpha on NuGet, so if you could try:

forki commented 9 years ago

The new alpha version seems to work for me (at least http://bluemountaincapital.github.io/FSharpRProvider/Statistics-QuickStart.html is working).

To be sure, we need compatible versions of Deedle, Deedle.RPlugin and FsLab.

forki commented 9 years ago

ok the new alpha version works ok, but if I try to run https://sergeytihon.wordpress.com/2013/11/18/f-neural-networks-with-rprovider-deedle/ then I get

    System.FormatException: Input string was not in a correct format.
       at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
       at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
       at System.String.System.IConvertible.ToInt32(IFormatProvider provider)
       at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
       at System.Convert.ChangeType(Object value, Type conversionType)
       at Deedle.Internal.Convert.changeType[T](Object value) in c:\Tomas\Public\Deedle\src\Deedle\Common\Common.fs:line 1195
       at Deedle.VectorHelpers.changeType@196-1.Invoke(T value) in c:\Tomas\Public\Deedle\src\Deedle\Vectors\VectorHelpers.fs:line 196
       at <StartupCode$Deedle>.$ArrayVector.Deedle-IVector-1-Select@286-1.Invoke(OptionalValue`1 input)
       at Deedle.Vectors.ArrayVector.ArrayVector`1.Deedle-IVector`1-SelectMissing[TNewValue](FSharpFunc`2 f) in c:\Tomas\Public\Deedle\src\Deedle\Vectors\ArrayVector.fs:line 279
       at Deedle.ObjectSeries`1.As[R]() in c:\Tomas\Public\Deedle\src\Deedle\Series.fs:line 1086
       at <StartupCode$FSI_0004>.$FSI_0004.main@() in D:\code\MachineLearning\MLCourse\NeuralNetworkInR.fsx:line 17
    Stopped due to error

So I assume there is still an incompatibility with Deedle.

sergey-tihon commented 9 years ago

@forki Does it work on your machine with the older versions?

#I @"..\packages\Deedle.0.9.12"
#I @"..\packages\RProvider.1.0.5\"
#load "RProvider.fsx"
#load "Deedle.fsx"

forki commented 9 years ago

No, the old RProvider is broken for me. The latest RProvider works, but not together with Deedle.

tpetricek commented 9 years ago

I definitely need to publish an updated version of Deedle, but the error might be unrelated.

On which line of @sergey-tihon's script do you get this? And can you share the first few lines of the output of iris.Print(true)? It prints the contents of the data frame with info about the column types. It looks like some conversion is failing.

sergey-tihon commented 9 years ago

I do not remember exactly, but I feel like somebody already asked me about a similar issue. I think Deedle started to parse factors in a different way: previously Deedle returned factors as a numeric column, and now it returns them as strings. Or something like that. @tpetricek, can this be?

If I am right, it should be this line:

let targets = R.as_factor(iris.Columns.["Species"].As<int>())

forki commented 9 years ago

open Deedle
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.datasets
open RProvider.neuralnet
open RProvider.caret

// Load data from R to Deedle frame
let iris : Frame<int, string> = R.iris.GetValue()

// Observe iris data set
let features =
    iris
    |> Frame.filterCols (fun c _ -> c <> "Species")
    |> Frame.mapColValues (fun c -> c.As<double>())

This part works, and iris is as follows:

val iris : Frame<int,string> =

       Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
1   -> 5,1          3,5         1,4          0,2         setosa    
2   -> 4,9          3           1,4          0,2         setosa    
3   -> 4,7          3,2         1,3          0,2         setosa    
4   -> 4,6          3,1         1,5          0,2         setosa    
5   -> 5            3,6         1,4          0,2         setosa    
6   -> 5,4          3,9         1,7          0,4         setosa    
7   -> 4,6          3,4         1,4          0,3         setosa    
8   -> 5            3,4         1,5          0,2         setosa    
9   -> 4,4          2,9         1,4          0,2         setosa    
10  -> 4,9          3,1         1,5          0,1         setosa    
11  -> 5,4          3,7         1,5          0,2         setosa    
12  -> 4,8          3,4         1,6          0,2         setosa    
13  -> 4,8          3           1,4          0,1         setosa    
14  -> 4,3          3           1,1          0,1         setosa    
15  -> 5,8          4           1,2          0,2         setosa    
:      ...          ...         ...          ...         ...       
136 -> 7,7          3           6,1          2,3         virginica 
137 -> 6,3          3,4         5,6          2,4         virginica 
138 -> 6,4          3,1         5,5          1,8         virginica 
139 -> 6            3           4,8          1,8         virginica 
140 -> 6,9          3,1         5,4          2,1         virginica 
141 -> 6,7          3,1         5,6          2,4         virginica 
142 -> 6,9          3,1         5,1          2,3         virginica 
143 -> 5,8          2,7         5,1          1,9         virginica 
144 -> 6,8          3,2         5,9          2,3         virginica 
145 -> 6,7          3,3         5,7          2,5         virginica 
146 -> 6,7          3           5,2          2,3         virginica 
147 -> 6,3          2,5         5            1,9         virginica 
148 -> 6,5          3           5,2          2           virginica 
149 -> 6,2          3,4         5,4          2,3         virginica 
150 -> 5,9          3           5,1          1,8         virginica 

but it then fails on:

let targets =
    R.as_factor(iris.Columns.["Species"].As<int>())

sergey-tihon commented 9 years ago

iris.Print(true):

       Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
       (float)      (float)     (float)      (float)     (string) 

So yes, this line should be changed to

let targets = R.as_factor(iris.Columns.["Species"].As<string>())

forki commented 9 years ago

@sergey-tihon did the iris format (column Species) change?

forki commented 9 years ago

OK, so the data format changed. Unrelated to RProvider. All is well.

sergey-tihon commented 9 years ago

@forki Not sure about that, but I think not. The Species column in the iris dataset is a factor. As far as I know, R stores factors as numeric values (like an enum) but displays the string values when you print the dataset. Previously RProvider allowed converting a factor to an int column, but now only string is allowed.

sergey-tihon commented 9 years ago

Strange, but I see nothing about factors in Passing Data Between F# and R.

If you continue execution of the script, you will see another error (caused by the conversion to string) on the R.neuralnet line:

Error in neurons[[i]] %*% weights[[i]] : 
  requires numeric/complex matrix/vector arguments
In addition: Warning message:
'err.fct' was automatically set to sum of squared error (sse), because the response is not binary 
RDotNet.EvaluationException: Error in neurons[[i]] %*% weights[[i]] : 
  requires numeric/complex matrix/vector arguments

   at RDotNet.REngine.Parse(String statement, StringBuilder incompleteStatement)
   at RDotNet.REngine.<Defer>d__0.MoveNext()
   at System.Linq.Enumerable.LastOrDefault[TSource](IEnumerable`1 source)
   at RDotNet.REngine.Evaluate(String statement)
   at RProvider.RInterop.callFunc(String packageName, String funcName, IEnumerable`1 argsByName, Object[] varArgs) in c:\Tomas\Public\FSharp.RProvider\src\RProvider\RInterop.fs:line 458
   at RProvider.RInterop.call(String packageName, String funcName, String serializedRVal, Object[] namedArgs, Object[] varArgs) in c:\Tomas\Public\FSharp.RProvider\src\RProvider\RInterop.fs:line 489
   at <StartupCode$FSI_0019>.$FSI_0019.main@() in D:\Research&Development\RProjectApps\RProject\blog.fsx:line 43

You cannot use a string as the target for your prediction...

forki commented 9 years ago

Yes, I used a dict to convert from the string to the int class, but even then it doesn't work. Some error in R.NET. (Sent by email on Jan 13, 2015 9:07 PM in reply to Sergey Tihon's comment above.)

sergey-tihon commented 9 years ago

I do not know how to convert a string column to a factor when passing data back to R. But for now you can use the following workaround:

...
let targets =
    R.as_factor(iris.Columns.["Species"])

R.featurePlot(x = features, y = targets, plot = "pairs")

// Replace the string column with a float column
iris.ReplaceColumn("Species", targets.AsNumeric())

// Split data to training and testing sets (70% vs 30%)
let range = [1..iris.RowCount]
...

and it should work...

tpetricek commented 9 years ago

I think we never quite figured out how to handle factors properly in R provider. But did we (accidentally) change how this is handled?

sergey-tihon commented 9 years ago

@tpetricek Definitely, yes. Deedle.0.9.12 & RProvider.1.0.5 allowed you to do

open RProvider.datasets

let iris : Frame<int, string> = R.iris.GetValue()
let targets = iris.Columns.["Species"].As<int>()

but the current latest version fails with the error

System.FormatException: Input string was not in a correct format.
       at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
       at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
       at System.String.System.IConvertible.ToInt32(IFormatProvider provider)
       at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
       at System.Convert.ChangeType(Object value, Type conversionType)
       at Deedle.Internal.Convert.changeType[T](Object value) in c:\Tomas\Public\Deedle\src\Deedle\Common\Common.fs:line 1195
       at Deedle.VectorHelpers.changeType@196-1.Invoke(T value) in c:\Tomas\Public\Deedle\src\Deedle\Vectors\VectorHelpers.fs:line 196
       at <StartupCode$Deedle>.$ArrayVector.Deedle-IVector-1-Select@286-1.Invoke(OptionalValue`1 input)
       at Deedle.Vectors.ArrayVector.ArrayVector`1.Deedle-IVector`1-SelectMissing[TNewValue](FSharpFunc`2 f) in c:\Tomas\Public\Deedle\src\Deedle\Vectors\ArrayVector.fs:line 279
       at Deedle.ObjectSeries`1.As[R]() in c:\Tomas\Public\Deedle\src\Deedle\Series.fs:line 1086
       at <StartupCode$FSI_0004>.$FSI_0004.main@() in D:\code\MachineLearning\MLCourse\NeuralNetworkInR.fsx:line 17
    Stopped due to error

.As<int>() now tries to parse a numeric value from the string instead of getting the value from the factor (the number of the factor level).
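
To illustrate what "number of the factor level" means, here is a minimal sketch in plain F# with no Deedle or RProvider calls. The helper name toLevelCodes is hypothetical, and the sorted-level ordering is an assumption meant to mirror R's default factor levels; it is not how either library does the conversion internally.

```fsharp
// Hypothetical helper (not part of Deedle or RProvider): encode string labels
// as 1-based integer level codes, with the levels taken in sorted order.
let toLevelCodes (labels: string list) =
    // The distinct labels, sorted, act as the factor levels.
    let levels = labels |> List.distinct |> List.sort
    // Each label is replaced by the 1-based position of its level.
    let codes =
        labels |> List.map (fun l -> List.findIndex ((=) l) levels + 1)
    levels, codes

let levels, codes =
    toLevelCodes [ "setosa"; "virginica"; "versicolor"; "setosa" ]

printfn "%A" levels // ["setosa"; "versicolor"; "virginica"]
printfn "%A" codes  // [1; 3; 2; 1]
```

Something like this could be applied to the values of the Species column to get an int column back, although whether the rest of the script then runs is a separate question.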