kubo / ruby-oci8

Ruby-oci8 - Oracle interface for ruby

Plans for arm (Mac M1 to be exact)? #257

Closed Physium closed 4 months ago

Physium commented 12 months ago

Curious: are there plans for this gem to support Arm? Oracle DB already supports images running on Arm architectures. Is there still a need for us to toggle between Intel and Arm when installing Ruby?

kubo commented 12 months ago

Ruby-oci8 won't support macOS Arm until Oracle Instant Client for macOS Arm is released. You need to use Intel Ruby for now. As for Oracle Database on macOS, it runs in Linux Arm containers, and the Oracle client inside the container isn't available outside of it.

Linux Arm (aarch64) will be supported in the next release, probably at the end of this month. Ruby-oci8 works on Linux arm64.

Physium commented 12 months ago

Thanks for your response! Am I able to run ruby-oci8 while using an Arm-based Oracle DB container?

kubo commented 12 months ago

Yes, you are able to run an Oracle Database container as the server and ruby-oci8 for Intel macOS as the client on one macOS Arm machine.

kubo commented 5 months ago

Oracle Instant Client for macOS arm64 is out. https://www.oracle.com/database/technologies/instant-client/macos-arm64-downloads.html

I hope it works without any ruby-oci8 code changes.

matthewtusker commented 5 months ago

I've just installed ruby-oci8 on Mac ARM64! 🎉

EDIT: ~On Ruby 3.2.2, Ruby 3.1.2 fails calling OCIEnvCreate()~ The x86_64 client was still being picked up. Removed it completely and ruby-oci8 installed successfully!
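
An architecture mismatch like this can be checked before installing the gem. The following is a minimal sketch and not part of ruby-oci8: `macho_archs` is a hypothetical helper that reads the first bytes of a Mach-O file and reports the CPU architectures it contains, so the result can be compared with `RbConfig::CONFIG['host_cpu']`. The magic numbers and CPU type values are the ones from Apple's Mach-O headers; the path in the usage comment is only an example.

```ruby
require 'rbconfig'

# CPU type constants from <mach/machine.h>.
CPU_TYPES = { 0x0100000c => 'arm64', 0x01000007 => 'x86_64' }.freeze

# Hypothetical helper: returns the architectures found in the leading bytes
# of a Mach-O file (thin 64-bit or universal). Returns [] for anything else.
def macho_archs(header)
  return [] if header.nil? || header.bytesize < 8

  case header.unpack1('N')
  when 0xcffaedfe # MH_MAGIC_64 (0xfeedfacf) as stored on disk, little-endian
    [CPU_TYPES.fetch(header[4, 4].unpack1('V'), 'unknown')]
  when 0xcafebabe # FAT_MAGIC: universal binary, fields are big-endian
    count = header[4, 4].unpack1('N')
    (0...count).map do |i|
      entry = header[8 + i * 20, 4] # each fat_arch entry is 20 bytes long
      if entry && entry.bytesize == 4
        CPU_TYPES.fetch(entry.unpack1('N'), 'unknown')
      else
        'unknown' # header bytes were truncated
      end
    end
  else
    []
  end
end

# Usage (the path is an example only):
#   archs = macho_archs(File.binread('/path/to/instantclient/libclntsh.dylib', 4096))
#   warn "Ruby is #{RbConfig::CONFIG['host_cpu']}, client is #{archs.join(', ')}"
```

If the reported client architecture does not include the one Ruby was built for, a leftover client for the other architecture is probably still being found first.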

matthewtusker commented 5 months ago

Hmm, looks like there may be an issue:


RuntimeError: Hook error: Could not replace function read in /Users/xxx/instantclient_23_3/libclntsh.dylib.23.1
from /Users/xxx/.local/share/mise/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/ruby-oci8-2.2.11/lib/oci8/properties.rb:74:in `__set_prop'

kubo commented 5 months ago

Thanks for running on macOS!

I updated plthook. Could you try to use the latest revision in the master branch?

Replace ruby-oci8 entry in Gemfile with the following line and run bundle update ruby-oci8 if you use bundler.

gem 'ruby-oci8', :git => 'https://github.com/kubo/ruby-oci8.git', :branch => 'master'

pasha commented 5 months ago

Hi, I was able to install it from the master branch, but running it generates the following error:

dyld[19245]: missing symbol called

matthewtusker commented 5 months ago

Yeah, I get an error too:

RuntimeError: Hook error: unknown imports format 0
from /Users/xxx/.local/share/mise/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/ruby-oci8-a2ababbc655a/lib/oci8/properties.rb:74:in `__set_prop'

Szemethym commented 5 months ago

Yeah, I get an error too:

RuntimeError: Hook error: unknown imports format 0
from /Users/xxx/.local/share/mise/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/ruby-oci8-a2ababbc655a/lib/oci8/properties.rb:74:in `__set_prop'

Same here, but commenting out that line makes it work as expected. The only caveat I can see is that you won't have your TCP keepalive time set to 10 minutes as per this doc.

Szemethym commented 5 months ago

I may have spoken too soon...

.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/ruby-oci8-2.2.12/lib/oci8/metadata.rb:2038: [BUG] Segmentation fault at 0x0000000001990028
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin23]

matthewtusker commented 5 months ago

#255 and #236 should probably follow along here.

kubo commented 5 months ago

Thank you all.

I removed the OCI8.properties[:tcp_keepalive_time] feature on macOS arm64 in https://github.com/kubo/ruby-oci8/commit/8e67b9bafb5b8ec97db8e65de9f9ea5630796576. After that commit, setting the property raises NotImplementedError.

Oracle enhanced adapter ignores the exception. https://github.com/rsim/oracle-enhanced/blob/v7.0.3/lib/active_record/connection_adapters/oracle_enhanced/oci_connection.rb#L336-L339
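
Application code that sets the property directly can handle the exception in the same way. This is a minimal sketch: `set_tcp_keepalive` is a hypothetical helper, and the 600-second default simply mirrors the 10-minute keepalive mentioned earlier in this thread. The properties object is passed in as an argument so the helper can be exercised without an Oracle client installed.

```ruby
# Hypothetical helper: try to set the keepalive time and report whether it
# took effect. `properties` is expected to behave like OCI8.properties.
def set_tcp_keepalive(properties, seconds = 600)
  properties[:tcp_keepalive_time] = seconds
  true
rescue NotImplementedError
  # Raised where the feature is unavailable (for example macOS arm64 after
  # the commit linked above). Connections still work, just without the
  # keepalive setting. NotImplementedError is not a StandardError, so it
  # has to be named explicitly here.
  false
end

# Usage, once ruby-oci8 is loaded:
#   require 'oci8'
#   warn 'tcp_keepalive_time is not supported here' unless set_tcp_keepalive(OCI8.properties)
```

Returning a boolean instead of swallowing the error silently leaves it to the caller to decide whether a missing keepalive is worth logging.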

pasha commented 5 months ago

Hi, I tried the update and am still able to install it, but I'm getting the same error as before:

bundle exec rake -T
dyld[64294]: missing symbol called

kubo commented 5 months ago

dyld[64294]: missing symbol called

This message was printed by dyld (the dynamic loader). It is a different issue from the RuntimeError: Hook error reports above.

After googling, I found https://www.rubyonmac.dev/how-to-fix-missing-symbol-called-when-running-rails-commands.

kubo commented 5 months ago

@Szemethym

.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/ruby-oci8-2.2.12/lib/oci8/metadata.rb:2038: [BUG] Segmentation fault at 0x0000000001990028
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin23]

Could you make a minimal reproducible example? If the SEGV reproduces on Linux, I'll make an effort to resolve it. If not, it may be specific to Apple silicon and I may not be able to help you.

Szemethym commented 5 months ago

@kubo I have tried to reproduce the issue, but to no avail... Read: Issue fixed, possibly because of system restart and/or messing around with exports in my terminal source file. Thank you for your assistance, I currently have no issues with arm64 ruby, arm64 macOS Oracle instant client and ruby-oci8 gem on latest master.

Just an FYI: When I was seeing an issue, it was happening only at runtime for a Rails server and I wasn't able to reproduce it in a Rails console, with the same queries. (So, possibly something else was messing around with those memory locations, I'm not sure...)

pilaf commented 4 months ago

I'm also getting a "Hook error", although with a different error message:

(…)/ruby-oci8-2.2.12/lib/oci8/properties.rb:74:in `__set_prop': Hook error: Could not replace function read in /Users/pilaf/Downloads/instantclient_23_3/libclntsh.dylib.23.1 (RuntimeError)
    from (…)/ruby-oci8-2.2.12/lib/oci8/properties.rb:74:in `[]='
        from (…)/activerecord-oracle_enhanced-adapter-6.1.6/lib/active_record/connection_adapters/oracle_enhanced/oci_connection.rb:337:in `new_connection'

I get this when trying to use ruby-oci8 through activerecord 6.x with activerecord-oracle_enhanced-adapter:

irb(main):002:0> MyOracleBackedActiveRecordModel.connection
💥

ekr1 commented 4 months ago

I have found a pretty interesting crash when running ruby-oci8 in a Rails app on a M3 MBP.

To reproduce minimally, place the 4 files from files.tgz somewhere: .ruby-version, Gemfile, Gemfile.lock, test.rb. Possibly install ruby 2.5.9 with rbenv as usual to match the .ruby-version (I have a suspicion that it would also happen with other rubies, but have not tried it). Run bundle to install any of the gems you don't have yet.

Then edit your Oracle connection into test.rb and run it. It should output one record:

$ bundle exec ./test.rb
#<ActiveRecord::Result:0x00000001311da1f0 @columns=["1"], @rows=[[1]], @hash_rows=nil, @column_types={}>

This is as expected.

Now look at the script. At the top there are some require lines that are commented out.

#!/usr/bin/env ruby

require 'rubygems'

# require 'mail'
# require 'rails'
# require 'mini_portile2'
# require 'sass-rails'

require 'active_record'
require 'ruby-oci8'
...

If you uncomment any one of these, then instead of the previous behaviour, you should get a SIGSEGV:

#!/usr/bin/env ruby

require 'rubygems'

# Any of these will trigger SIGSEGV in ruby-oci8:
require 'mail'
# require 'rails'
# require 'mini_portile2'
# require 'sass-rails'

require 'active_record'
require 'ruby-oci8'

conn=ActiveRecord::Base.establish_connection(:adapter => "oracle_enhanced",
...
$ bundle exec ./test.rb
/Users/myself/.rbenv/versions/2.5.9/lib/ruby/gems/2.5.0/gems/ruby-oci8-2.2.12/lib/oci8/oci8.rb:133: [BUG] Segmentation fault at 0xf9400021f9400000
ruby 2.5.9p229 (2021-04-05 revision 67939) [-darwin23]

-- Crash Report log information --------------------------------------------
   See Crash Report log file under the one of following:                    
     * ~/Library/Logs/DiagnosticReports                                     
     * /Library/Logs/DiagnosticReports                                      
   for more details.                                                        
Don't forget to include the above Crash Report log file in bug reports.     

-- Control frame information -----------------------------------------------
c:0033 p:---- s:0212 e:000211 CFUNC  :attr_set_string
c:0032 p:0288 s:0206 e:000205 METHOD /Users/myself/.rbenv/versions/2.5.9/lib/ruby/gems/2.5.0/gems/ruby-oci8-2.2.12/lib/oci8/oci8.rb:133 [FINISH]
c:0031 p:---- s:0190 e:000189 CFUNC  :new
c:0030 p:0358 s:0182 e:000181 METHOD /Users/myself/.rbenv/versions/2.5.9/lib/ruby/gems/2.5.0/gems/activerecord-oracle_enhanced-adapter-1.8.2/lib/active_re
...

This always happens at oci8.rb:133, which is this line:

    @session_handle.send(:attr_set_string, OCI_ATTR_PASSWORD, password) if password

Funnily, the line immediately previous to that is the same but for the username, and that works.

This also happens when switching ruby-oci8 to the current master in the Gemfile.

Now the funniest bit: if you move all those require below the require 'ruby-oci8', and uncomment all of them, then the SIGSEGV does not happen!

#!/usr/bin/env ruby

require 'rubygems'

require 'active_record'
require 'ruby-oci8'

# No SIGSEGV!
require 'mail'
require 'rails'
require 'mini_portile2'
require 'sass-rails'

conn=ActiveRecord::Base.establish_connection(:adapter => "oracle_enhanced",
...

These four gems are a small subset of those in my app; any other gem in that app does not trigger the problem when required before ruby-oci8, it's only exactly those.

I don't have time to dig deeper at this moment, so will just dump my findings here. Tbh I am also a little bit at my wits' end as to how to debug this further. Presumably those 4 somehow cause some monkey patching or overloading to happen, which then leads to the SIGSEGV in ruby-oci8. Maybe they pull in a common gem via their own dependencies (although looking at the Gemfile.lock there does not seem to be a common set of dependencies; indeed, mini_portile2 has none at all).

The workaround in my real app is to make sure ruby-oci8 is loaded as early as possible; I have simply added its require into config/boot:

ENV['BUNDLE_GEMFILE'] ||= File.expand_path('../Gemfile', __dir__)

require 'bundler/setup' # Set up gems listed in the Gemfile.

# Pull this in before any other gem to avoid SIGSEGV during Oracle
# connection setup on arm64 architecture.
require 'ruby-oci8'

(Similarly in binaries like bin/cucumber - making sure the ruby-oci8 gem is required before bundler/setup.)

For context: this is a pretty old app and has been compiled and run on many environments (several different Linuxes; in CI/CD pipelines; in k8s containers; on an Intel Macbook Pro and on dedicated hardware servers running some weird special Linux). The issue only appeared on a fresh M3 install with the current (as of 07/2024) Oracle instantclient 23.3.0.23.09.

kubo commented 4 months ago

@ekr1 I guess that it is caused by a symbol conflict: two shared libraries have symbols with the same name.

macOS binaries built with default options work fine in that case.

In https://forums.developer.apple.com/forums/thread/715385:

Mach-O uses a two-level namespace. When a Mach-O image imports a symbol, it references the symbol name and the library where it expects to find that symbol.

Ruby C extensions use the two-level namespace.

$ xcrun dyld_info -fixups oci8lib_330.bundle
oci8lib_330.bundle [arm64]:
    -fixups:
        segment      section          address                 type   target
        __DATA_CONST __got            0x00020000              bind  libSystem.B.dylib/___chkstk_darwin
        __DATA_CONST __got            0x00020008              bind  libSystem.B.dylib/___stack_chk_guard
        __DATA_CONST __got            0x00020010              bind  libruby.3.3.dylib/_rb_cFalseClass
        __DATA_CONST __got            0x00020018              bind  libruby.3.3.dylib/_rb_cFloat
        __DATA_CONST __got            0x00020020              bind  libruby.3.3.dylib/_rb_cInteger
        __DATA_CONST __got            0x00020028              bind  libruby.3.3.dylib/_rb_cNilClass
        ...

However, this feature is disabled by the linker option -flat_namespace. (It was applied to Ruby C extensions by https://github.com/ruby/ruby/commit/c5eefb7f37db2865891298dd1a1e60dff09560ad but reverted soon after by https://github.com/ruby/ruby/commit/3fb1d49a1f1289142f3da8e876133bd7f459e4f6.) Oracle Instant Client seems to be built with that option.

$ xcrun dyld_info -fixups libclntsh.dylib
libclntsh.dylib [arm64]:
    -fixups:
        segment      section          address                 type   target
        __DATA_CONST __got            0x02B78000              bind  flat-namespace/_BZ2_blockSort
        __DATA_CONST __got            0x02B78008              bind  flat-namespace/_BZ2_bzCompress
        __DATA_CONST __got            0x02B78010              bind  flat-namespace/_BZ2_bzCompressEnd
        __DATA_CONST __got            0x02B78018              bind  flat-namespace/_BZ2_bzCompressInit
        __DATA_CONST __got            0x02B78020              bind  flat-namespace/_BZ2_bzDecompress
        __DATA_CONST __got            0x02B78028              bind  flat-namespace/_BZ2_bzDecompressEnd
        ...
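
To summarise output like this rather than read it line by line, a short script can collect the flat-namespace binds. This is a minimal sketch that assumes the column layout shown in the two listings above, where the target is the last whitespace-separated field and is written as `library/symbol`; `flat_namespace_symbols` is a hypothetical helper name.

```ruby
# Hypothetical helper: given the text printed by `xcrun dyld_info -fixups`,
# return the names of the symbols that are bound through the flat namespace.
def flat_namespace_symbols(fixups_output)
  fixups_output.each_line.map do |line|
    target = line.split.last # nil for blank lines
    next unless target && target.start_with?('flat-namespace/')

    target.sub('flat-namespace/', '')
  end.compact
end

# Usage (the path is an example only):
#   output = `xcrun dyld_info -fixups /path/to/instantclient/libclntsh.dylib`
#   puts flat_namespace_symbols(output).size
```

An empty result would suggest the library binds everything through the two-level namespace. A long list means those symbols can be satisfied by whichever loaded image defines the name first.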

When libclntsh.dylib tries to call a function in libnnz.dylib or libclntshcore.dylib.23.1, but a function with the same name in another shared library has already been loaded by require 'mail', the latter is incorrectly called in place of the former.
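
One way to look for such a conflict is to compare the exported symbol names of the libraries involved. This is a minimal sketch and not something ruby-oci8 provides: `exported_symbols` and `duplicate_symbols` are hypothetical helpers, and the first one assumes the `nm` tool from the Xcode command line tools with its `-g` (external symbols only), `-U` (no undefined symbols) and `-j` (names only) options. The paths in the usage comment are examples only.

```ruby
# Hypothetical helper: names of the symbols a library defines and exports,
# as reported by `nm -gUj` (assumed to be available on the PATH).
def exported_symbols(path)
  IO.popen(['nm', '-gUj', path], err: File::NULL, &:read).split("\n")
end

# Hypothetical helper: given { library => [symbol, ...] }, return
# { symbol => [library, ...] } for every symbol defined by more than one
# library.
def duplicate_symbols(exports)
  owners = Hash.new { |hash, symbol| hash[symbol] = [] }
  exports.each do |library, symbols|
    symbols.uniq.each { |symbol| owners[symbol] << library }
  end
  owners.select { |_symbol, libraries| libraries.size > 1 }
end

# Usage (paths are examples only):
#   libs = Dir['/path/to/instantclient/*.dylib'] + ['/path/to/some_gem/extension.bundle']
#   pp duplicate_symbols(libs.map { |lib| [lib, exported_symbols(lib)] }.to_h)
```

A shared name is only a hint. Whether that symbol is the one being resolved to the wrong library still has to be confirmed, for example against the crash report.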

kubo commented 4 months ago

ruby-oci8 2.2.13 was released. It runs on macOS arm.

matthewtusker commented 4 months ago

I'm still seeing infrequent crashes on my machine. I'm not sure if the crashes dump out to a file, but I can share them if I find them.

ekr1 commented 1 week ago

Hey @kubo, just fyi, the case I mentioned above (https://github.com/kubo/ruby-oci8/issues/257#issuecomment-2243273574) is worse now - no fault of ruby-oci8, but after the upgrade to macOS Sequoia, with no other changes to any of my rbenv, gems or ruby settings, even my smallest test script (which used to work before) crashes with that segmentation fault when calling attr_set_string to set the password.

(fyi, I checked the ruby commits you mentioned, they are for much later/newer ruby versions than the one I'm using.)

Do you, by chance, have an idea whether it is possible to debug this further, i.e. to find out which other library might shadow the symbol from the oracle client libraries? Or do you know whether it's possible to somehow "pin" that library and make it a higher priority, to enforce it to be used?