Open clintmiller opened 9 months ago
Hi, couple of questions:
tiny_tds
to the mentioned commit 23ed1e4
before you did the Ruby and Rails upgrade, can you reproduce these issues?Updated above w/ answers- also copied here for convenience, thank you @andyundso.
Against which SQL server do you run the test suite? What kind of of authentication do you run?
Running against 2017 for Linux in a Docker container via Docker Desktop for Mac. SQL Server Authentication. I don't think I'm having an authentication problem as my app (and tests) are able to connect to the database and execute queries.
The crash reports seem from a Mac system. With test automation, do you mean just the test suite of your project or do you run automated tests against Mac on a continuous integration system, like GitHub Actions?
By "test automation", I simply meant the execution of the test suite for this project. As part of this upgrade process, I will get the suite running in an automated CI system, either w/ updates to our existing, but dated Jenkins CI installation, or in GitHub Actions. I have not done this yet.
If you update your tiny_tds to the mentioned commit 23ed1e4 before you did the Ruby and Rails upgrade, can you reproduce these issues?
I will try this against our Rails 5.2 branch, assuming the newer TinyTDS is backwards compatible both w/ Rails 5.2 (should be, I think), and Ruby 2.5.x (??).
Do you run ARM or Intel on your Mac?
This is on an Intel Mac.
Disclaimer: I'm unfamiliar with the C code in tiny_tds
, just trying to help out :).
I also own an Intel Mac, the same OS version as yours, the same FreeTDS version. I maintain two Rails projects, both with RSpec, but one runs Ruby 3.0, the other 3.1. I checked if I also ran into these issues when using the version of tiny_tds
you mentioned in your issue and the SQL server 2017 in Docker. I have run about ten runs on both apps so far with no problems. Are your crashes more infrequent than every tenth run?
In general, from what I read in the code, the nogvl
stuff executes code on a separate Ruby thread but releases the Global VM Lock, as we no longer execute Ruby code, but instead talk with FreeTDS, who then talks to MSSQL. Once the function finishes, it re-acquires the GVL.
EXC_BAD_ACCESS
means we write memory somewhere where we should not. EXC_CRASH
reacts to a SIGABRT
, likely raised of freetds
because of the assertion error that it hits.
Is there some consistency when you hit these errors? For example, I assume you run your test suite in random order. Once it crashes and you re-run the test suite with the same seed, does it crash again with the same error?
As mentioned, it would be interesting to know if you also hit this error when updating tiny_tds
, but run the Ruby 2.5 / Rails 5.2 version of your app. tiny_tds
should work with Ruby 2.5, at least our CI is green :)
similar error when in rails console on Ruby 3.1.2, Rails 7.0, TinyTDS 2.1.5, freetds v1.3.20 on M1 mac running against various MS Sql Server servers e.g. Microsoft SQL Server 2016 (SP3-CU1-GDR) (KB5021128) - 13.0.7024.30 (X64)
usually after I've made a bunch of queries and have had the console open a while, but only on exit
Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.
Abort trap: 6
I was able to reliably trigger this with a single spec example and it's beginning to form some type of working theory for me. This is an excerpt from a Rails controller spec:
describe 'GET finish' do
let(:product_download) { create(:product_download) }
let(:params) do
{ membership: product_download.membership_id, log: product_download.id, bytes: 123456 }
end
it 'closes the product download log log' do
get :finish, params: params
updated_log = ProductDownload.find(product_download.id) # <= This is the boom!
...
end
end
Test output:
Igniter::DownloadController
GET finish
/Users/clintmiller/.asdf/installs/ruby/3.1.4/lib/ruby/gems/3.1.0/gems/activerecord-sqlserver-adapter-6.1.3.0/lib/active_record/connection_adapters/sqlserver/database_statements.rb:463: warning: TinyTds: dbsqlsend() returned FAIL.
Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.
Abort trap: 6
One other important detail is that my application is a multi-tenant app, where the current tenant is a per-request concept that can be switched based on criteria like the hostname requested. One of the things that tenant switching did is switch the database connection with the Apartment gem, which we had forked to support the activerecord-sqlserver-adapter
.
Something must be happening lifetime of these database connections when this spec is run. The database connection being used by the Rails server handling the get :finish
request is closed (the request is over after all?) but is then attempted to be used again by the following line? I'm not sure that's exactly it.
Did and had b/c we've actually removed Apartment and the tenant switching functionality that used it to change the database connections. The tenancy concept is still in place, and requests still switch tenants as before, but we aren't manipulating the database connections during the tenant switch any longer.
@clintmiller have you tried reproducing bundling against main? there may be an unreleased fix
Environment
Operating System
Intel Mac, x86_64
TinyTDS Version and Information
I'm actually pointing at a post-2.1.5 commit on
tiny_tds
:23ed1e4
, where #527 was merged.FreeTDS Version
SQL Server
Running against 2017 for Linux in a Docker container via Docker Desktop for Mac. SQL Server Authentication.
Description
In the course of trying to upgrade an application from Ruby 2.5/Rails 5.2.x to Ruby 3.1/Rails 6.1.x, I have noticed a significant increase in automated test failures (we're using
rspec-rails
). These failures seem take a couple of forms:1. An
EXC_BAD_ACCESS (SIGABRT)
.Terminal output:
OS Crash Report:
I'm pretty sure this was a simple unit test, given the running threads in the Ruby process. One thing that immediately catches my eye is that there appear to be two threads in the TinyTDS C code at the same time. I'm also curious about the
nogvl_*
functions, but only b/c of their name. I'm not sure I fully understand their purpose.2. An
EXC_CRASH (SIGABRT)
.Terminal output:
OS Crash Report:
Obviously, this case with more threads is a system/integration test that started a webserver.
Other Notes
It's interesting to note, that the application boots and runs in my development environment w/o any issues. I'm not sure how it will do with a production workload given the crashes I'm seeing in our test suite.
Next Steps
Making a minimal reproduction test-case would be useful, but I'm not quite sure where to start. On Ruby 2.5/Rails 5.2, our test suite rarely (if ever), crashed in this manner. Is there any other additional information I can provide or debug builds of TinyTDS I could use to reveal more helpful information?