InAnYan / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

Handle error "Failed to load PyTorch native library" #123

Closed koppor closed 3 months ago

koppor commented 3 months ago

While trying to reproduce https://github.com/InAnYan/jabref/issues/105, I closed JabRef during downloading. Then I restarted JabRef. Then, I got the error

Failed to load PyTorch native library

Click on "Try to rebuild again" causes the same effect.

koppor commented 3 months ago
ERROR: An error occurred while building the embedding model: ai.djl.engine.EngineException: Failed to load PyTorch native library
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:90)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41)
        at ai.djl.api@0.29.0/ai.djl.engine.Engine.getEngine(Engine.java:190)
        at ai.djl.api@0.29.0/ai.djl.Model.newInstance(Model.java:99)
        at ai.djl.api@0.29.0/ai.djl.repository.zoo.BaseModelLoader.createModel(BaseModelLoader.java:196)
        at ai.djl.api@0.29.0/ai.djl.repository.zoo.BaseModelLoader.loadModel(BaseModelLoader.java:159)
        at ai.djl.api@0.29.0/ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:174)
        at org.jabref@100.0.0/org.jabref.logic.ai.models.DeepJavaEmbeddingModel.<init>(DeepJavaEmbeddingModel.java:23)
        at org.jabref@100.0.0/org.jabref.logic.ai.models.EmbeddingModel.rebuild(EmbeddingModel.java:128)
        at org.jabref@100.0.0/org.jabref.gui.util.BackgroundTask$2.call(BackgroundTask.java:91)
        at org.jabref@100.0.0/org.jabref.gui.util.BackgroundTask$2.call(BackgroundTask.java:88)
        at org.jabref@100.0.0/org.jabref.gui.util.UiTaskExecutor$1.call(UiTaskExecutor.java:170)
        at javafx.graphics@22.0.2/javafx.concurrent.Task$TaskCallable.call(Task.java:1399)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
2024-08-05 16:42:54 [JavaFX Application Thread] org.jabref.gui.JabRefDialogService.notify()
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
INFO: An error occurred while building the embedding model
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.UnsatisfiedLinkError: C:\Users\WDAGUtilityAccount\.djl.ai\pytorch\2.3.1-cpu-win-x86_64\fbgemm.dll: Can't find dependent libraries
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2418)
        at java.base/java.lang.Runtime.load0(Runtime.java:852)
        at java.base/java.lang.System.load(System.java:2025)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.jni.LibUtils.loadNativeLibrary(LibUtils.java:379)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.jni.LibUtils.loadLibTorch(LibUtils.java:195)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:82)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53)
koppor commented 3 months ago
Caused by: java.lang.UnsatisfiedLinkError: C:\Users\WDAGUtilityAccount\.djl.ai\pytorch\2.3.1-cpu-win-x86_64\fbgemm.dll: Can't find dependent libraries
InAnYan commented 3 months ago

I also had this issue, and I don't know what to do with it...

I tried to restart JabRef and it worked okay

koppor commented 3 months ago

Does not work on my side:

Caused by: java.lang.UnsatisfiedLinkError: C:\Users\vagrant\.djl.ai\pytorch\2.3.1-cpu-win-x86_64\fbgemm.dll: Can't find dependent libraries
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2418)
        at java.base/java.lang.Runtime.load0(Runtime.java:852)
        at java.base/java.lang.System.load(System.java:2025)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.jni.LibUtils.loadNativeLibrary(LibUtils.java:379)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.jni.LibUtils.loadLibTorch(LibUtils.java:195)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:82)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53)
        ... 18 more

This refs https://github.com/InAnYan/jabref/issues/105.

Maybe this can be "solved" by a workaround if the download can be triggered explicitly: the user clicks download again. Or JabRef retries the download automatically? (refs https://github.com/InAnYan/jabref/issues/87)
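
To make the automatic-retry idea concrete, here is a minimal Java sketch. It is an illustration only: `RetrySketch`, `withRetries`, and the `cleanup` hook are hypothetical names, not JabRef or DJL API. It assumes the failing step surfaces as an unchecked exception and that something useful (for example, discarding a partial download) can be done between attempts.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of JabRef or DJL.
final class RetrySketch {
    private RetrySketch() {
    }

    /**
     * Runs the task up to maxAttempts times and returns the first successful result.
     * After each failed attempt except the last, the cleanup hook is run so that
     * the next attempt starts from a clean state.
     */
    static <T> T withRetries(Supplier<T> task, Runnable cleanup, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure in case every attempt fails
                if (attempt < maxAttempts) {
                    cleanup.run();
                }
            }
        }
        throw last; // all attempts failed: surface the most recent error
    }
}
```

A caller would pass the model-building step as `task` and whatever resets the download state as `cleanup`. A retry of this kind only helps when the failure is transient; it would not fix a missing system dependency.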

InAnYan commented 3 months ago

But there is a button "Rebuild"? (It was named "Try to rebuild" some commits ago)

InAnYan commented 3 months ago

Hmm, if only we had a way to reproduce this bug. I tried to test it, but it doesn't appear again.

koppor commented 3 months ago

On Linux Mint, the download is very fast. Thus, it is a Windows issue.

Reproduce:

  1. Install VirtualBox (Windows how-to at https://github.com/JabRef/jabref/tree/main/scripts/vms - you can update it for Linux if you run Linux)
  2. Install Vagrant
  3. cd scripts/vm/windows
  4. vagrant up
  5. Wait until the Windows VM is up (approx. 10 minutes)
  6. Log in using vagrant as the password
  7. Open cmd
  8. git clone https://github.com/JabRef/jabref.git
  9. cd jabref
  10. git checkout ai-pr-1
  11. gradlew run
  12. Go to Settings -> AI
  13. Enable AI
  14. Click on Save
  15. Quit JabRef
  16. Kill the running downloading jobs
  17. Press Ctrl+C in the Gradle output to make sure everything is terminated

Result:

2024-08-07 10:53:47 [pool-2-thread-3] ai.djl.pytorch.jni.LibUtils.downloadPyTorch()
INFO: Downloading https://publish.djl.ai/pytorch/2.3.1/cpu/win-x86_64/native/lib/asmjit.dll.gz ...
2024-08-07 10:53:47 [main] org.jabref.Launcher.main()
ERROR: Unexpected exception: java.lang.RuntimeException: Exception in Application stop method
        at javafx.graphics@22.0.2/com.sun.javafx.application.LauncherImpl.launchApplication1(LauncherImpl.java:898)
        at javafx.graphics@22.0.2/com.sun.javafx.application.LauncherImpl.lambda$launchApplication$2(LauncherImpl.java:196)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.util.NoSuchElementException: java.lang.IndexOutOfBoundsException
        at java.base/java.util.AbstractList$Itr.next(AbstractList.java:379)
        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
        at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1939)
        at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
        at org.jabref@100.0.0/org.jabref.gui.util.UiTaskExecutor.shutdown(UiTaskExecutor.java:131)
        at org.jabref@100.0.0/org.jabref.gui.JabRefGUI.shutdownThreadPools(JabRefGUI.java:375)
        at org.jabref@100.0.0/org.jabref.gui.JabRefGUI.stop(JabRefGUI.java:365)
        at javafx.graphics@22.0.2/com.sun.javafx.application.LauncherImpl.lambda$launchApplication1$10(LauncherImpl.java:858)
        at javafx.graphics@22.0.2/com.sun.javafx.application.PlatformImpl.lambda$runAndWait$12(PlatformImpl.java:483)
        at javafx.graphics@22.0.2/com.sun.javafx.application.PlatformImpl.lambda$runLater$10(PlatformImpl.java:456)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:400)
        at javafx.graphics@22.0.2/com.sun.javafx.application.PlatformImpl.lambda$runLater$11(PlatformImpl.java:455)
        at javafx.graphics@22.0.2/com.sun.glass.ui.InvokeLaterDispatcher$Future.run(InvokeLaterDispatcher.java:95)
        at javafx.graphics@22.0.2/com.sun.glass.ui.win.WinApplication._runLoop(Native Method)
        at javafx.graphics@22.0.2/com.sun.glass.ui.win.WinApplication.lambda$runLoop$3(WinApplication.java:184)
        ... 1 more
Caused by: java.lang.IndexOutOfBoundsException
        at javafx.base@22.0.2/javafx.collections.transformation.FilteredList.get(FilteredList.java:169)
        at com.tobiasdiez.easybind@2.2.1-SNAPSHOT/com.tobiasdiez.easybind.MappedList.get(MappedList.java:31)
        at java.base/java.util.AbstractList$Itr.next(AbstractList.java:373)
        ... 15 more
2024-08-07 10:53:47 [pool-2-thread-3] ai.djl.pytorch.jni.LibUtils.downloadPyTorch()
INFO: Downloading https://publish.djl.ai/pytorch/2.3.1/cpu/win-x86_64/native/lib/libiompstubs5md.dll.gz ...
<===========--> 90% EXECUTING [47s]
> :run
^CTerminate batch job (Y/N)? ^C

Then re-run JabRef (using gradlew run)

Result:

2024-08-07 11:01:33 [JavaFX Application Thread] sun.util.logging.internal.LoggingProviderImpl$JULWrapper.log()
WARN: Resource "" not found.
Loading:     100% |========================================|
2024-08-07 11:01:34 [JavaFX Application Thread] org.jabref.logic.ai.models.EmbeddingModel.lambda$startRebuildingTask$1()
ERROR: An error occurred while building the embedding model: ai.djl.engine.EngineException: Failed to load PyTorch native library
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:90)
        at ai.djl.pytorch_engine@0.29.0/ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41)
        at ai.djl.api@0.29.0/ai.djl.engine.Engine.getEngine(Engine.java:190)

This refs https://github.com/InAnYan/jabref/issues/132 - a broken embedded model should be fixed with a download!

InAnYan commented 3 months ago

OMG so that's a Windows issue?

InAnYan commented 3 months ago

Okay, a little bit of research showed that the user needs to install the VC++ redistributable:

https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170
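
One way an application could recognize this situation and point the user at the redistributable is to walk the exception's cause chain and look for the `UnsatisfiedLinkError` seen in the traces above. The following is only a sketch: `NativeLoadDiagnosis` and `isMissingNativeDependency` are hypothetical names, and matching on the message text "Can't find dependent libraries" is an assumption based on the log output in this thread, so it may be fragile across JVM versions and platforms.

```java
// Hypothetical helper for illustration; not part of JabRef or DJL.
final class NativeLoadDiagnosis {
    // Message fragment copied from the stack traces in this thread (assumption:
    // other JVMs or operating systems may word it differently).
    private static final String MISSING_DEPENDENCY_HINT = "Can't find dependent libraries";

    private NativeLoadDiagnosis() {
    }

    /**
     * Returns true if the error, or any of its causes, is an UnsatisfiedLinkError
     * whose message indicates that a dependent native library could not be found.
     */
    static boolean isMissingNativeDependency(Throwable error) {
        int depth = 0;
        // The depth limit guards against a (rare) cyclic cause chain.
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof UnsatisfiedLinkError
                    && t.getMessage() != null
                    && t.getMessage().contains(MISSING_DEPENDENCY_HINT)) {
                return true;
            }
        }
        return false;
    }
}
```

If the check returns true, the error dialog could link to the redistributable download instead of only offering a rebuild.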

InAnYan commented 3 months ago

Wait until the Windows VM is up (approx. 10 minutes)

:pensive:

InAnYan commented 3 months ago

From the dev call: 1) Check whether VC++ really needs to be installed. If so, update the docs (not the blog; the blog should only link to the docs). 2) Add documentation for the case where the embedding model download was interrupted midway and AI can't be used at all (fix: delete the .djl.ai directory).
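
For reference, the "delete the .djl.ai directory" fix can also be sketched in Java. This assumes the directory sits directly under the user's home directory, as in the paths shown in the stack traces (C:\Users\...\.djl.ai); `DjlCacheCleaner` and its methods are hypothetical names, not JabRef or DJL API. On Windows the deletion will likely fail while JabRef still has the native libraries loaded, so it should run only when the application is closed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of JabRef or DJL.
final class DjlCacheCleaner {
    private DjlCacheCleaner() {
    }

    /** Assumed cache location: ".djl.ai" in the user's home directory, as seen in the traces. */
    static Path defaultCacheDir() {
        return Path.of(System.getProperty("user.home"), ".djl.ai");
    }

    /** Deletes the directory and everything below it; does nothing if it does not exist. */
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `DjlCacheCleaner.deleteRecursively(DjlCacheCleaner.defaultCacheDir())` removes the partially downloaded files, so the next start has to download them again from scratch.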

koppor commented 3 months ago

Windows: choco install vcredist140

Link for self-guided installation: https://aka.ms/vs/16/release/vc_redist.x64.exe

InAnYan commented 3 months ago

Yes, they really need VC++. So I updated the documentation accordingly.

I also added a guide for the case where the user closes JabRef in the middle of downloading the embedding model.