dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Microsoft.ML.TimeSeries: Unable to load shared library 'MklImports' or one of its dependencies. #5788

Open Bytezart opened 3 years ago

Bytezart commented 3 years ago

System information

OS: MacOS Big Sur
OS Version: 11.3.1
.Net Version: 5.0.202
.Net Framework: .Net Core 3.1
Packages/Versions:
    Microsoft.ML 1.5.5
    Microsoft.ML.TimeSeries 1.5.5
    Microsoft.ML.Mkl.Redist 1.5.5

Issue

Calling SsaForecastingTransformer.Fit(IDataView input) throws the following exception: Microsoft.ML.TimeSeries: Unable to load shared library 'MklImports' or one of its dependencies.

Source Code

I can reproduce this error using the example code found in the method documentation at microsoft.ml.timeseriescatalog.forecastbyssa, or in even simpler configurations.

Self Help Troubleshooting

  1. Installed the extra ML.NET dependencies as instructed at Install extra ML.NET dependencies - ML.NET | Microsoft Docs

    The following command fails on this version of macOS. On closer inspection of the Homebrew script, the bottle SHA signatures were failing to validate. Correcting these seems to have allowed the script to install.

    wget https://raw.githubusercontent.com/Homebrew/homebrew-core/fb8323f2b170bd4ae97e1bac9bf3e2983af3fdb0/Formula/libomp.rb && brew install ./libomp.rb && brew link libomp --force

  2. Enabled DYLD_PRINT_LIBRARIES. With this environment variable set, the dotnet --version command lists all the linked libraries, as in the example below. However, the libomp library does not appear in this list.

    Example output of dotnet --version, where libomp isn't found:

    dyld: loaded: <23377312-A7DE-3C6C-BCCD-CC68DA1B1898> /usr/local/share/dotnet/dotnet
    dyld: loaded: /usr/lib/libc++.1.dylib
    dyld: loaded: /usr/lib/libSystem.B.dylib
    dyld: loaded: <22AFC7FC-2DB6-3EF0-9CC0-EFFB9B65D5E2> /usr/lib/libc++abi.dylib
    ...
    ...
    ...
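One caveat on step 2: native libraries referenced through DllImport are normally loaded only when the first such call runs, so dotnet --version may never reach the point where MKL or OpenMP would be loaded. A rough sketch of filtering the dyld log from a run of the actual application is below. The function name filter_native_libs and the MyApp.dll placeholder are assumptions for illustration, not part of ML.NET or the .NET CLI.

```shell
#!/bin/sh
# Hypothetical helper: read DYLD_PRINT_LIBRARIES-style output on stdin and
# keep only the lines that mention an OpenMP runtime or MklImports.
filter_native_libs() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E 'libi?omp|MklImports' || true
}

# Example usage on macOS (MyApp.dll is a placeholder for your own app):
#   DYLD_PRINT_LIBRARIES=1 dotnet MyApp.dll 2>&1 | filter_native_libs
```

If the filtered output is empty even after the failing call has run, that suggests the loader never found the library at all, rather than finding a wrong version.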

Related Issue

Unable to load shared library 'MklImports' or one of its dependencies. · Issue #3903 · dotnet/machinelearning · GitHub

nhirschey commented 3 years ago

I'm having the same issue following some of the F# examples but trying to use OlsTrainer. Is there a solution? @luisquintanilla have you come across a solution to this?

The code is like what is below and comes up with the error after the let capmEstimate line.

We tried the extra dependencies mentioned here for macOS: https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/install-extra-dependencies

#r "nuget:Microsoft.ML,1.5"
#r "nuget:Microsoft.ML.MKL.Components,1.5"

open System
open Microsoft.ML
open Microsoft.ML.Data

type RegData =
    // The ML.NET OLS trainer requires 32bit "single" floats
    { Date : DateTime
      Portfolio : single
      MktRf : single }
// Some quick example data
let longShortRegData = 
  [|{Date = DateTime.Now; Portfolio = single 1.0; MktRf=single 1.2}
    {Date = DateTime.Now; Portfolio = single 1.1; MktRf=single 4.2}
    {Date = DateTime.Now; Portfolio = single 0.9; MktRf=single 3.2}
    |]
let ctx = new MLContext()
let longShortMlData = ctx.Data.LoadFromEnumerable<RegData>(longShortRegData)
let trainer = ctx.Regression.Trainers.Ols()

let capmModel = 
    EstimatorChain()
        .Append(ctx.Transforms.CopyColumns("Label","Portfolio"))
        .Append(ctx.Transforms.Concatenate("Features",[|"MktRf"|])) 
        .Append(trainer)   

let capmEstimate = longShortMlData |> capmModel.Fit


But on Windows everything works, and I can do

> capmEstimate.LastTransformer.Model;;
val it : Trainers.OlsModelParameters =
  Microsoft.ML.Trainers.OlsModelParameters
    {Bias = 0.9385714531f;
     HasStatistics = true;
     PValues = [|0.1293603629; 0.7877045274|];
     RSquared = 0.1071432954;
     RSquaredAdjusted = 0.0;
     StandardErrors = [|0.1933855408; 0.06185896589|];
     TValues = [|4.853369431; 0.3464098719|];
     Weights = seq [0.02142855711f];}

Bytezart commented 3 years ago

@nhirschey, assuming your environment is configured similarly to mine, I suspect the root issue we're both experiencing is with the OpenMP library libiomp5, which the ML.Net libraries require in certain scenarios. Even after I corrected the Homebrew script and ran it on my test machines, I'm not seeing that library at /usr/local/lib/libiomp5.dylib. Even if it were there, it needs to be a specific version, which I assume may no longer be Version 7. Hopefully someone else can shed further light on this.
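A minimal sketch of the check described above, for anyone wanting to repeat it. The helper name find_openmp is hypothetical, and the directories in the usage line are assumptions (the /usr/local/lib path comes from this comment; /opt/homebrew/lib is the usual Homebrew location on Apple Silicon).

```shell
#!/bin/sh
# Hypothetical helper: print the path of every OpenMP runtime found in the
# directories passed as arguments. Prints nothing if none is present.
find_openmp() {
    for dir in "$@"; do
        for name in libiomp5.dylib libomp.dylib; do
            if [ -e "$dir/$name" ]; then
                echo "$dir/$name"
            fi
        done
    done
}

# Example usage:
#   find_openmp /usr/local/lib /opt/homebrew/lib
```

This only confirms that a file exists. It says nothing about whether the version matches what MKL expects.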

nhirschey commented 3 years ago

@michaelgsharp is your pull request here https://github.com/dotnet/machinelearning/pull/5771/commits/fb26308b2998d0da4a74853bf85557c642521a67 fixing the problems we’re having in this issue?

I was trying to figure out if your Big Sur fix was going to make it so that the nuget package was sufficient to get MKL working, or if long term macOS users are still going to have to do brew install and linking to use ML.NET.

It is much much easier to adopt ML.NET when everything comes down via the nuget feed. Even if that means the default feed pulls down an unoptimized BLAS.

michaelgsharp commented 3 years ago

@nhirschey sorry for my delay. Are you running on the new Apple Silicon or on an Intel Mac?

Bytezart commented 3 years ago

Based on my prior conversation with @nhirschey, I believe the root cause is the same. In my case, I'm running macOS on Intel hardware.
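For anyone unsure which hardware they are on, a small sketch that maps the output of uname -m to a label. The function name cpu_kind is hypothetical. Note that a shell started under Rosetta 2 reports x86_64 even on Apple Silicon, so treat the result as a hint only.

```shell
#!/bin/sh
# Hypothetical helper: classify a machine architecture string.
# With no argument it inspects the current machine via uname -m.
cpu_kind() {
    case "${1:-$(uname -m)}" in
        arm64)  echo "Apple Silicon (arm64)" ;;
        x86_64) echo "Intel (x86_64)" ;;
        *)      echo "unknown" ;;
    esac
}

# Example usage:
#   cpu_kind
```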

michaelgsharp commented 3 years ago

Version 7 is still the correct version for libomp.

I was able to get it working on Big Sur by:

I completely understand your point, though, about everything being included in the nuget feed. Let me talk with my team about that and see if there is anything we can do. The dependency itself comes from Intel MKL, which requires that very specific version to run. We are going to be updating the Intel MKL version (I'll be doing that either this week or at the beginning of next), and I am hoping that will give us more options. Still to be determined, though.

nhirschey commented 3 years ago

I was seeing the issues on Intel Mac, and the issue seemed the same as what @Bytezart described. As a result, I had to revert to using Accord.net on Mac.

I appreciate the workaround info. FYI, I was trying to show students new to programming how to use ML.NET, and building from source is a lot to ask of them. #r nuget:... is far more accessible, even if it requires an unoptimized default.

I know Intel’s MKL makes things complicated. All I wanted was OlsTrainer, but as far as I could tell that forced the MKL dependency.

Math.NET for example ships with portable code and then allows using MKL if you want an optimized version: https://numerics.mathdotnet.com/MKL.html

Bytezart commented 3 years ago

As often as I looked through the source I obviously missed the brew script in that directory. Ironically, running the script didn't fix the problem on my machine which compelled me to dig a little further into this problem.

Function App Funk

As I've discovered, context matters. Running my library in a console application works perfectly. Running the same code in a Function App fails with this exception. Function Apps have a unique execution model, which is probably the source of my problem, so it's likely a runtime problem. Even if I can get it to run locally, I'm not sure these dependencies will be available in Azure, which is a shame because using this library in a Function App is a very sensible use case.

Bytezart commented 3 years ago

After a lot of debugging of the ML.Net and Azure Functions source code, I was able to corner this problem. In this instance, the true root cause of the exception probably cannot be laid at the feet of the ML.Net libraries. When running in the context of an Azure Function (on a Mac; Windows environments work just fine), the method marked with the DllImport attribute fails to locate the MKL libraries, probably because of a pathing issue with the dllName argument. Again, these libraries are present and accessible, and the same code runs in a console application on a Mac. The abbreviated process flow is as follows (on a Mac, if that wasn't already clear :-))

  1. Azure Function Method Execution of code calling the Microsoft.ML.Transforms.TimeSeries.SsaForecastingEstimator.Fit(…) method.

  2. Azure Function Invokes Microsoft.Azure.WebJobs.Host/Executors/TaskMethodInvoker.cs public Task InvokeAsync(TReflected instance, object[] arguments)

  3. ML.Net Invokes Microsoft.ML.TimeSeries/EigenUtils.cs

    [DllImport(MklPath, EntryPoint = "LAPACKE_dsteqr", CallingConvention = CallingConvention.Cdecl), SuppressUnmanagedCodeSecurity]
    public static extern int Dsteqr(Layout matrixLayout, Compz compz, int n, double[] d, double[] e, double[] z, int ldz);

  4. Aforementioned exception is thrown.

I'm going to attempt to run this ML.Net method within a Task in a console project and see if I get the same results there, and/or verify that the Azure Functions environment is set up correctly on this platform. Whatever the outcome, I'll attempt to package this problem up and take it over to the appropriate Azure Functions GitHub project.
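One way to probe the pathing hypothesis is to compare where the MklImports native file lands in each project's build output. The sketch below is only an illustration: the helper name find_mkl_imports and the ConsoleApp and FunctionApp directory names are placeholders, and it matches on the file name pattern rather than assuming an exact file name.

```shell
#!/bin/sh
# Hypothetical helper: list every file whose name contains "MklImports"
# under the given directory (defaults to the current directory).
find_mkl_imports() {
    find "${1:-.}" -type f -name '*MklImports*' 2>/dev/null | sort
}

# Example usage, comparing two build outputs (directory names are placeholders):
#   find_mkl_imports ConsoleApp/bin
#   find_mkl_imports FunctionApp/bin
```

If the file sits in a different relative location in the Function App output than in the console output, that would be consistent with the loader failing to resolve it there.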

andrelmp commented 1 year ago

I got it to work on OSX, by adding the following nuget package

<PackageReference Include="MKL.NET.osx-x64" Version="2022.0.0.105"/>

quooston commented 1 year ago

I'm running an M1 MB Pro on Ventura and nothing that has been mentioned here has solved the issue for me. @michaelgsharp I tried to build from sources as you did but it doesn't build successfully for me. Anyway, if someone knows something new that will solve the issue on Apple Silicon, please fill me in.

Bytezart commented 1 year ago

Hi @quooston, it's hard to follow what your problem is exactly. Were you using the libraries in another solution and getting the exception I originally posted? After much research and troubleshooting, I discovered that my problem was not with the ML.Net libraries but rather with the Function App project I was running them in. There is still a bug when running Azure Function Apps that reference the ML.Net libraries on a Mac. The issue is due to how the Function App sets environment variables and the path, and then executes certain ML.Net methods. I had provided Microsoft with detailed stack traces, but no one seemed interested in fixing the issue :-( so I ended up building my solution on a Windows rig.