junegunn / hbase-jruby

A JRuby binding for HBase
MIT License
38 stars 14 forks source link

Possible HTable leaks #29

Closed junegunn closed 10 years ago

junegunn commented 10 years ago

Yes, hbase-jruby internally uses HTablePool with which we take a PooledHTable instance with getTable call, and return it back to the pool with its close method.

However, I felt this manual return of the tables to be error-prone. It is very likely that a user can forget or fail to properly return it, especially in the presence of exceptions.

For example, consider the following code using hbase-jruby:

1000.times do |i|
  hbase[:my_table].put i, 'd:hello' => 'world'
end

If HBase[] call always takes a table instance from the pool, this seemingly harmless code introduces serious leak of the table instances. In that case, in order to be safe, the code has to be written as follows:

my_table = hbase[:my_table]
begin
  1000.times do |i|
    my_table.put i, 'd:hello' => 'world'
  end
ensure
  my_table.close
end

I find doing this to be tedious and error-prone. So I decided to make HBase[] (or HBase#table) call to return a thread-local instance of the table taken from the pool and reuse it ever since so that a thousand calls to hbase[:my_table] will only take one from the pool. If we use a limited number of tables from within a limited number of threads (thread-pools), the number of table instances taken from the pool is limited accordlingly, so we are not likely to have problem.

But I now see you have a point. If we do not reuse threads and keep creating and destroying new ones, not calling close method can lead to a problem. It would be nice if it's possible to automatically return it when a thread holding it is destroyed.

HTable#close call is currently no-op.

  1. Make it close and destroy thread-local HTable instance taken from the pool
  2. Make number 1 automatic, so that a user doesn't have to explicitly call close method
junegunn commented 10 years ago

A simple benchmark shows that the overhead of clean-up is insignificant even with 500 live threads.