ged / ruby-pg

A PostgreSQL client library for Ruby

Is the PG::Connection object thread safe? #385

Closed ShadowDaneFoster closed 3 years ago

ShadowDaneFoster commented 3 years ago

First off, I'm sorry if this isn't the correct mechanism for asking questions, but I couldn't find a forum or mailing list for this project.

I need to ETL some data into PostgreSQL, and I would like to use Ruby threads to speed up the process. Is PG::Connection safe for use by multiple threads, or will I need a separate connection object for each thread?

Thanks.

larskanis commented 3 years ago

No, a single PG::Connection object cannot be used by multiple threads. Sharing one wouldn't make much sense anyway, since both the API and the network protocol are stateful. The library is thread safe in the sense that you can use a separate PG::Connection in each thread concurrently.

You can simply use Thread.current.thread_variable_get and thread_variable_set to store a thread-local PG::Connection object. Alternatively, you can use ActiveRecord, which takes care of thread-local connections automatically. In that case the PG::Connection can be retrieved with YourModelObject.connection.raw_connection.
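A minimal sketch of the thread-local approach could look like the following. The helper name `thread_connection` and the `:pg_connection` key are made up for illustration and are not part of the pg gem. The block passed in is assumed to open a new connection.

```ruby
# Hypothetical helper (not part of pg): returns the calling thread's own
# connection, building it with the given block the first time it is asked.
def thread_connection
  conn = Thread.current.thread_variable_get(:pg_connection)
  if conn.nil?
    conn = yield # assumed to open a fresh connection for this thread
    Thread.current.thread_variable_set(:pg_connection, conn)
  end
  conn
end
```

In a worker thread this might be called as `conn = thread_connection { PG.connect(dbname: "etl") }`, where the database name is only a placeholder. Each thread then ends up with its own PG::Connection and should close it when it is done.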

In any case, the fastest way to load data into the database is PG::Connection#copy_data, since it streams the records into the database with only minimal overhead. But the Rails 6 bulk loading API may be fast enough in your case.
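As a rough sketch of the copy_data route, rows can be sent in PostgreSQL's COPY text format (tab-separated fields, one row per line, `\N` for NULL). The helper names `copy_text_field` and `copy_rows` are hypothetical, and the table and column names are assumed to be trusted identifiers that need no quoting.

```ruby
# Characters that must be backslash-escaped in the COPY text format.
COPY_ESCAPES = { "\\" => "\\\\", "\t" => "\\t", "\n" => "\\n", "\r" => "\\r" }.freeze

# Hypothetical helper: encode one value as a COPY text field (nil becomes \N).
def copy_text_field(value)
  return '\N' if value.nil?
  value.to_s.gsub(/[\\\t\n\r]/, COPY_ESCAPES)
end

# Hypothetical helper: stream rows (arrays of values) through conn.copy_data.
# conn is expected to behave like a PG::Connection; table and columns are
# interpolated as-is, so they must come from trusted code.
def copy_rows(conn, table, columns, rows)
  sql = "COPY #{table} (#{columns.join(', ')}) FROM STDIN"
  conn.copy_data(sql) do
    rows.each do |row|
      line = row.map { |value| copy_text_field(value) }.join("\t") + "\n"
      conn.put_copy_data(line)
    end
  end
end
```

With a real connection this might be called as `copy_rows(conn, "events", %w[id note], [[1, "first"], [2, nil]])`, where `events` and its columns are placeholders. The pg gem also ships its own row encoders, which may be preferable to hand-rolled escaping.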

ShadowDaneFoster commented 3 years ago

Thanks for the detailed response. You are right, it wouldn't make sense for the connection object itself to be thread safe. I realized that after the fact and wrote the code so that writing to the database is decoupled from how threads are used.