VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

"make java" issue #4515

Closed suming closed 1 year ago

suming commented 1 year ago

Describe the bug

I cloned the repository with `git clone --recursive https://github.com/VowpalWabbit/vowpal_wabbit.git`. Running

make java

results in

make[1]: Entering directory '/Users/such/code/vw2/vowpal_wabbit/build'
make[1]: *** No rule to make target 'vw_jni'.  Stop.
make[1]: Leaving directory '/Users/such/code/vw2/vowpal_wabbit/build'
make: *** [Makefile:44: java_build] Error 2

I have java installed:

~/c/v/vowpal_wabbit ❯❯❯ java -version
openjdk version "1.8.0_275"
OpenJDK Runtime Environment (Zulu 8.50.0.1017-CA-macos-aarch64) (build 1.8.0_275-b01)
OpenJDK 64-Bit Server VM (Zulu 8.50.0.1017-CA-macos-aarch64) (build 25.275-b01, mixed mode)



### How to reproduce

Clone the repo and run `make java` on 9.6.0.

### Version

9.6.0

### OS

macOS

### Language

java

### Additional context

_No response_

jackgerrits commented 1 year ago

Those makefiles are just thin wrappers around some cmake commands. They aren't used much, so they might not be up to date (as is the case here).

Please see Java build instructions here: https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Java#build-from-source

suming commented 1 year ago

Thanks for the tip. I'm following those instructions exactly. Before I run

make -j$(nproc)

I assume I need to be in the root directory, right? Running from the 'build/' directory gives me

make: *** No targets specified and no makefile found.  Stop.

suming commented 1 year ago

Nvm, I re-cloned and ran from a fresh directory. It builds in that directory now, but when it gets to 100%, I see

[100%] Linking CXX executable vw-unit-test.out
ld: warning: ignoring file /opt/homebrew/lib/libboost_unit_test_framework-mt.dylib, building for macOS-x86_64 but attempting to link with file built for macOS-arm64
Undefined symbols for architecture x86_64:

I am on arm64, so I'm not sure where I can change the instructions to build for arm64 instead of x86_64.
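The linker warning above says the Boost library is an arm64 file while the build targets x86_64. One way to confirm which architecture a given .dylib was built for is to read the first bytes of its Mach-O header. The following is a minimal sketch, not part of the VW build: the class name `MachOArch` is hypothetical, it only distinguishes 64-bit thin binaries from universal ones, and the constants are those defined in Apple's Mach-O headers.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class MachOArch {
    // Values from Apple's <mach-o/loader.h>, <mach-o/fat.h> and <mach/machine.h>.
    static final int MH_MAGIC_64 = 0xFEEDFACF;     // thin 64-bit Mach-O
    static final int FAT_MAGIC = 0xCAFEBABE;       // universal binary (big-endian header)
    static final int CPU_TYPE_X86_64 = 0x01000007;
    static final int CPU_TYPE_ARM64 = 0x0100000C;

    /** Reads a 4-byte integer at the given offset with the requested byte order. */
    static int readInt(byte[] b, int off, boolean bigEndian) {
        ByteBuffer buf = ByteBuffer.wrap(b, off, 4);
        buf.order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        return buf.getInt();
    }

    /** Describes the architecture encoded in the first 8 bytes of a file. */
    static String describe(byte[] header) {
        if (header.length < 8) return "too short";
        if (readInt(header, 0, true) == FAT_MAGIC) return "universal (fat) binary";
        if (readInt(header, 0, false) != MH_MAGIC_64) return "not a 64-bit little-endian Mach-O file";
        int cpu = readInt(header, 4, false);       // cputype follows the magic
        if (cpu == CPU_TYPE_ARM64) return "arm64";
        if (cpu == CPU_TYPE_X86_64) return "x86_64";
        return "unknown cpu type 0x" + Integer.toHexString(cpu);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MachOArch <file>");
            return;
        }
        // Reading the whole file keeps the sketch short; only 8 bytes are needed.
        byte[] all = Files.readAllBytes(Paths.get(args[0]));
        byte[] header = Arrays.copyOf(all, Math.min(8, all.length));
        System.out.println(args[0] + ": " + describe(header));
    }
}
```

Running it against both the Boost library named in the warning and the freshly built objects would show which side of the build is on the unexpected architecture.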

suming commented 1 year ago

Trying the instructions here

suming commented 1 year ago

@jackgerrits when I run the final command from the "Build from Source" section on the link you shared, I see build failures still:

[  1%] Built target gtest
[  2%] Built target gmock
[  2%] Built target gmock_main
[  2%] Built target gtest_main
[  2%] Built target fmt
[  4%] Built target spdlog
[  4%] Built target vw_io
[  5%] Built target vw_allreduce
[  8%] Built target vw_config
[ 56%] Built target vw_core
[ 57%] Built target library_example
[ 57%] Built target test_search
[ 58%] Built target search_generate
[ 58%] Built target recommend
[ 59%] Built target gd_mf_weights
[ 59%] Built target vw_active_interactor_bin
[ 60%] Built target vw_c_wrapper
[ 61%] Built target vw_c_wrapper_test
[ 61%] Built target vw_cli_bin
[ 62%] Built target vw_config_test
[ 65%] Built target vw_core_test
[ 66%] Built target vw_explore_test
[ 68%] Built target vw_io_test
[ 69%] Built target vw_model_merger_bin
[ 72%] Built target vw_slim
[ 73%] Built target vw_slim_test
[ 74%] Built target vw_spanning_tree
[ 75%] Built target vw_spanning_tree_bin
[ 76%] Built target vw-dump-options
[ 77%] Built target vw_jni_generate_native_headers_do_not_use_jar
[ 77%] Linking CXX shared library libvw_jni.dylib
Copying shared libary dependencies to output directory
[INFO] Scanning for projects...
[INFO] Inspecting build with total of 1 modules...
[INFO] Installing Nexus Staging features:
[INFO]   ... total of 1 executions of maven-deploy-plugin replaced with nexus-staging-maven-plugin
[INFO] 
[INFO] -------------------< com.github.vowpalwabbit:vw-jni >-------------------
[INFO] Building Vowpal Wabbit JNI Layer 9.6.0
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- maven-enforcer-plugin:1.1:enforce (enforce-ban-duplicate-classes) @ vw-jni ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.1:enforce (enforce-ban-version-downgrades) @ vw-jni ---
[INFO] 
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ vw-jni ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] 
[INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ vw-jni ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 35 source files to /Users/such/code/vw3/vowpal_wabbit/java/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.7:copy-resources (copy-resources) @ vw-jni ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO] 
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ vw-jni ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO] 
[INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ vw-jni ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- maven-surefire-plugin:2.17:test (default-test) @ vw-jni ---
[INFO] Surefire report directory: /Users/such/code/vw3/vowpal_wabbit/java/target/surefire-reports

-------------------------------------------------------
 T E S T S
-------------------------------------------------------

Running vowpalWabbit.ClosingTest
Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 0.046 sec <<< FAILURE! - in vowpalWabbit.ClosingTest
testCloseAsyncIsNearImmediate(vowpalWabbit.ClosingTest)  Time elapsed: 0.018 sec  <<< ERROR!
java.lang.ExceptionInInitializerError: null
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:230)
    at java.util.zip.ZipFile.<init>(ZipFile.java:160)
    at java.util.jar.JarFile.<init>(JarFile.java:168)
    at java.util.jar.JarFile.<init>(JarFile.java:132)
    at common.Native.try_load_from_jar(Native.java:26)
    at common.Native.<clinit>(Native.java:60)
    at vowpalWabbit.learner.VWLearners.<clinit>(VWLearners.java:22)
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:185)
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:171)
    at vowpalWabbit.ClosingTest.testCloseAsyncIsNearImmediate(ClosingTest.java:40)

multipleAsynchronousCloserCallsReturnTrueOnlyOnce(vowpalWabbit.ClosingTest)  Time elapsed: 0 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class vowpalWabbit.learner.VWLearners
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:185)
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:171)
    at vowpalWabbit.ClosingTest.multipleAsynchronousCloserCallsReturnTrueOnlyOnce(ClosingTest.java:97)

closerThenClose(vowpalWabbit.ClosingTest)  Time elapsed: 0 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class vowpalWabbit.learner.VWLearners
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:185)
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:171)
    at vowpalWabbit.ClosingTest.closerThenClose(ClosingTest.java:69)

testRegisteringClosingCallbackWithLongDelay(vowpalWabbit.ClosingTest)  Time elapsed: 0.001 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class vowpalWabbit.learner.VWLearners
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:185)
    at vowpalWabbit.ClosingTest.model(ClosingTest.java:171)
    at vowpalWabbit.ClosingTest.testClosingCallback(ClosingTest.java:143)
    at vowpalWabbit.ClosingTest.testRegisteringClosingCallbackWithLongDelay(ClosingTest.java:133)

...


[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  2.732 s
[INFO] Finished at: 2023-02-28T16:01:23-08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on project vw-jni: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/such/code/vw3/vowpal_wabbit/java/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
make[2]: *** [java/CMakeFiles/vw_jni.dir/build.make:318: java/libvw_jni.dylib] Error 1
make[2]: *** Deleting file 'java/libvw_jni.dylib'
make[1]: *** [CMakeFiles/Makefile2:2283: java/CMakeFiles/vw_jni.dir/all] Error 2
make: *** [Makefile:121: all] Error 2

any idea?
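A note on reading the log above: only the first failing test reports `ExceptionInInitializerError` with a full stack trace, and every later test reports `NoClassDefFoundError: Could not initialize class vowpalWabbit.learner.VWLearners`. That is the standard JVM behavior when a static initializer fails, so the first trace is the one that carries the root cause. The sketch below reproduces the pattern with a hypothetical stand-in class; it does not use any VW code, and the thrown exception is made up for illustration.

```java
public class InitFailureDemo {
    /** Hypothetical stand-in for a class whose static initializer fails. */
    static class Learners {
        // Initializing this field runs loadNative() during class initialization.
        static final boolean LOADED = loadNative();

        private static boolean loadNative() {
            // Made-up failure; a real one would come from the native library loader.
            throw new IllegalStateException("hypothetical: native library not loaded");
        }

        static void touch() {}
    }

    /** Touches the class and returns the simple name of the error, or "ok". */
    static String access() {
        try {
            Learners.touch();
            return "ok";
        } catch (LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(access()); // first use: ExceptionInInitializerError
        System.out.println(access()); // later uses: NoClassDefFoundError
    }
}
```

The JVM marks a class as unusable after its initializer throws, so every later access fails fast without repeating the original cause.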

suming commented 1 year ago

Some more context. I have both Maven and the JDK installed:

Apache Maven 3.9.0 (9b58d2bad23a66be161c4664ef21ce219c2c8584)
Maven home: /opt/homebrew/Cellar/maven/3.9.0/libexec
Java version: 1.8.0_275, vendor: Azul Systems, Inc., runtime: /usr/local/lib/java/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "12.5.1", arch: "aarch64", family: "mac"

I ran exactly

git clone --recursive https://github.com/VowpalWabbit/vowpal_wabbit.git
cd vowpal_wabbit/
mkdir build
cd build
cmake -DBUILD_JAVA=ON ..
make

After running the cmake command, I see

cmake -DBUILD_JAVA=ON ..
-- VowpalWabbit Version: 9.7.0
-- No build type selected, defaulting to Release
-- The C compiler identification is AppleClang 13.1.6.13160021
-- The CXX compiler identification is AppleClang 13.1.6.13160021
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Using C++ standard: 11
CMake Warning (dev) at /opt/homebrew/Cellar/cmake/3.25.2/share/cmake/Modules/FetchContent.cmake:1284 (message):
  The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
  not set.  The policy's OLD behavior will be used.  When using a URL
  download, the timestamps of extracted files should preferably be that of
  the time of extraction, otherwise code that depends on the extracted
  contents might not be rebuilt if the URL changes.  The OLD behavior
  preserves the timestamps from the archive instead, but this is usually not
  what you want.  Update your project to the NEW behavior or specify the
  DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
  robustness issue.
Call Stack (most recent call first):
  cmake/VowpalWabbitUtils.cmake:23 (FetchContent_Declare)
  CMakeLists.txt:85 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found Python: /opt/homebrew/Frameworks/Python.framework/Versions/3.11/bin/python3.11 (found version "3.11.2") found components: Interpreter
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found Git: /opt/homebrew/bin/git (found version "2.39.0")
-- Git Version: c0ba1804b
-- Module support is disabled.
-- Version: 9.1.0
-- Build type: Release
-- CXX_STANDARD: 11
-- Performing Test has_std_11_flag
-- Performing Test has_std_11_flag - Success
-- Performing Test has_std_0x_flag
-- Performing Test has_std_0x_flag - Success
-- Required features: cxx_variadic_templates
-- Performing Test HAS_NULLPTR_WARNING
-- Performing Test HAS_NULLPTR_WARNING - Success
-- Build spdlog: 1.11.0
-- Build type: Release
-- Boost.Math: standalone mode ON
-- Found ZLIB: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.3.sdk/usr/lib/libz.tbd (found version "1.2.11")
-- help2man not found, please install it to generate manpages
-- Found JNI: /usr/local/lib/java/Contents/Home/include  found components: AWT JVM
-- Found Java: /usr/local/lib/java/Contents/Home/bin/java (found version "1.8.0.275")
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/such/code/vw4/vowpal_wabbit/build

Then, when running make, I see the same error as above:


[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  6.154 s
[INFO] Finished at: 2023-02-28T16:19:55-08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on project vw-jni: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/such/code/vw4/vowpal_wabbit/java/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

Possibly because of these errors, the compiled JNI library is not found anywhere in the directory.

jackgerrits commented 1 year ago

I know for the released jar the library loading logic is only implemented for Linux x86 (it searches in the jar). It's possible the loading logic is not implemented for Mac or Mac arm. I'm not sure as it's been a while since I've looked at this.

See related #2521, #1579

suming commented 1 year ago

Thanks for the response. Is there any version in the past, e.g. 8.x.x where Mac arm is supported?

Any other workarounds that may be possible?

Thanks


jackgerrits commented 1 year ago

Mac arm is not explicitly supported or unsupported. The Jar has only ever had Linux x86 in it though. For usage without the jar, this code (https://github.com/VowpalWabbit/vowpal_wabbit/blob/c0ba1804be1b94a12db1b7ef5d891af24ff37748/java/src/main/java/common/Native.java#L14) tries to load the lib, so you may need to move the built vw_jni shared object or update the path Java is using to search for the shared object.

suming commented 1 year ago

thanks for the response!

The issue is I can't build the vw_jni shared object on ARM. Does anyone have a built vw_jni version they can share?

jackgerrits commented 1 year ago

The build error you posted was related to unit tests which aren't needed to get vw_jni built, did vw_jni itself build successfully?

Use `cmake --build build -t vw_jni` to build just vw_jni.

suming commented 1 year ago

Thanks for the response, and sorry I wasn't clear before: vw_jni doesn't look like it is getting built. The only vw_jni that I see is under a linux_64 directory. Is that a bug, and is it actually OK to use on Mac?

classes/natives/linux_64/libvw_jni.dylib

Btw, even when I run cmake --build build -t vw_jni, I see a variety of test errors:


testConcurrency(vowpalWabbit.learner.VWLearnersTest)  Time elapsed: 0.001 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class vowpalWabbit.learner.VWLearners
    at vowpalWabbit.learner.VWLearnersTest.testConcurrency(VWLearnersTest.java:99)

testSaveModel(vowpalWabbit.learner.VWLearnersTest)  Time elapsed: 0.001 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class vowpalWabbit.learner.VWLearners
    at vowpalWabbit.learner.VWLearnersTest.testSaveModel(VWLearnersTest.java:78)

testOldModel(vowpalWabbit.learner.VWLearnersTest)  Time elapsed: 0.002 sec  <<< FAILURE!

jackgerrits commented 1 year ago

That path is hardcoded; it doesn't mean it is a Linux x64 binary. The test failures show that it fails to load the vw_jni lib, so you may need to move the built vw_jni shared object or update the path Java is using to search for the shared object.
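To see where Java would look for the shared object, one option is to list the directories in `java.library.path` and check each for the platform-specific file name. This is a sketch under an assumption the thread does not confirm, namely that the loader ends up calling `System.loadLibrary("vw_jni")`. The class name `FindNativeLib` is hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class FindNativeLib {
    /** Returns the file each search-path directory would need to contain. */
    static List<File> candidates(String libName, String searchPath) {
        // Maps "vw_jni" to the platform's file name (prefix and extension).
        String fileName = System.mapLibraryName(libName);
        List<File> out = new ArrayList<>();
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (!dir.isEmpty()) {
                out.add(new File(dir, fileName));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path", "");
        for (File f : candidates("vw_jni", path)) {
            System.out.println((f.isFile() ? "found   " : "missing ") + f);
        }
    }
}
```

Running it as `java -Djava.library.path=<dir containing the built library> FindNativeLib` shows whether a given directory would satisfy the lookup, without moving any files.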

suming commented 1 year ago

Is there any other way, besides running the tests, to verify that vw_jni is being built successfully? From the build/ directory, I am able to run vowpalwabbit/cli/vw and see:

using no cache
Reading datafile = stdin
num sources = 1
Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
Enabled learners: gd, scorer-identity, count_label

Does this mean that vw_jni works? I did what you suggested and moved classes/natives/linux_64/libvw_jni.dylib to /Library/Java/Extensions (which is what shows up under java -XshowSettings:properties), but when I run cmake --build build -t vw_jni I still see test errors, so it's unclear if vw_jni was built correctly.

jackgerrits commented 1 year ago

> Does this mean that vw_jni works?

It's a separate binary, so it's a positive sign, but it doesn't necessarily mean the Java side is all good.

Getting the shared object loaded can be a bit tricky and can be environment dependent so I am not sure how to resolve your loading issue.
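One check that does not involve the test suite, offered as a sketch rather than a project-endorsed procedure, is to ask the JVM to load the built file directly by absolute path with `System.load`. The class name `LoadCheck` is hypothetical. A successful load only shows that this JVM accepts the file and can resolve its dependencies; it does not exercise any VW functionality.

```java
import java.io.File;

public class LoadCheck {
    /** Tries to load a native library from an explicit path; returns null on success. */
    static String tryLoad(String path) {
        try {
            // System.load requires an absolute path and bypasses java.library.path.
            System.load(new File(path).getAbsolutePath());
            return null;
        } catch (UnsatisfiedLinkError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java LoadCheck <path to shared library>");
            return;
        }
        String error = tryLoad(args[0]);
        System.out.println(error == null ? "loaded OK" : "failed to load: " + error);
    }
}
```

The error message from a failed load usually names the reason, such as a missing file, which narrows down whether the problem is the build or the search path.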

jackgerrits commented 1 year ago

I'm going to go ahead and close this one. Please feel free to reopen it if you are still facing issues, or open a new issue.