byzhang / leveldb

Automatically exported from code.google.com/p/leveldb
BSD 3-Clause "New" or "Revised" License

IO error: XXX.sst: Too many open files #175

Status: Open. GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I'm using LevelDB in my application with 5 databases. Each database is opened 
with the option max_open_files = 64.

ulimit -Sn shows the operating system has a soft limit of 1024 open files. 
Setting the limit to 2048 fixes the problem. However, because I'm distributing 
this application to end users, it should have defaults that work out of the 
box, without requiring anyone to configure their operating system.
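
One possible stopgap is for the process to raise its own soft descriptor limit 
at startup instead of asking users to run ulimit. A minimal sketch, assuming a 
POSIX system whose hard limit is high enough (the helper name raise_fd_limit 
is mine):

#include <sys/resource.h>

// Raise the soft RLIMIT_NOFILE limit toward the hard limit so the process
// gets more descriptors without any user configuration.
bool raise_fd_limit(rlim_t wanted)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return false;
    if (rl.rlim_cur >= wanted)
        return true;  // Soft limit is already sufficient.
    // An unprivileged process cannot raise the soft limit past the hard one.
    rl.rlim_cur = (wanted < rl.rlim_max) ? wanted : rl.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &rl) == 0;
}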

  leveldb::Status status = db_spends_->Get(leveldb::ReadOptions(), spent_slice, &raw_spend);
  if (!status.ok())
  {
    std::cerr << "fetch_spend: " << status.ToString() << std::endl;
    return false;
  }

I get many of these errors, and then I cannot read from the databases at all.

  "fetch_spend: IO error: XXXX.sst: Too many open files"

There are 5 databases in one subdirectory called "database":

$ ls
addr  block  block_hash  spend  tx
$ du -sh .
16G .
$ du -sh *
2.6G    addr
653M    block
7.2M    block_hash
2.6G    spend
9.4G    tx
$ for i in `ls`; do echo $i; ls $i | wc -l; done
addr
1279
block
333
block_hash
10
spend
1433
tx
5252

That's roughly 8,300 .sst files at about 2 MB each, which matches the 16 GB 
total. I would like to raise the 2 MB size limit LevelDB uses for each .sst 
file, but it doesn't seem to be adjustable; the only related change I found 
was this patch: https://github.com/basho/leveldb/pull/7
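
For what it's worth, later leveldb releases expose this limit as 
Options::max_file_size (at the time of this report it was a compile-time 
constant). A minimal sketch, assuming a leveldb version that has that field:

#include <leveldb/options.h>

leveldb::Options options;
// Let each .sst grow larger before a new one is started, which keeps the
// total file count (and therefore descriptor usage) down.
options.max_file_size = 8 * 1024 * 1024;  // 8 MB instead of the 2 MB default.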

I'm using Ubuntu 13.04 64bit.

Thanks

Original issue reported on code.google.com by gen...@gmail.com on 9 Jun 2013 at 2:40

GoogleCodeExporter commented 9 years ago
Here is the code I use for opening the databases. If I print 
open_options.max_open_files before the call to leveldb::DB::Open(), it shows 
64 (as expected).

---------------------------------

bool open_db(const std::string& prefix, const std::string& db_name,
    std::unique_ptr<leveldb::DB>& db, leveldb::Options open_options)
{
    using boost::filesystem::path;
    path db_path = path(prefix) / db_name;
    leveldb::DB* db_base_ptr = nullptr;
    leveldb::Status status =
        leveldb::DB::Open(open_options, db_path.native(), &db_base_ptr);
    if (!status.ok())
    {
        log_fatal(LOG_BLOCKCHAIN) << "Internal error opening '"
            << db_name << "' database: " << status.ToString();
        return false;
    }
    // The container ensures db_base_ptr is now managed.
    db.reset(db_base_ptr);
    return true;
}

...
    // Create comparator for blocks database.
    depth_comparator_.reset(new depth_comparator);
    // Open LevelDB databases
    const size_t cache_size = 1 << 20;
    // block_cache, filter_policy and comparator must be deleted after use!
    open_options_.block_cache = leveldb::NewLRUCache(cache_size / 2);
    open_options_.write_buffer_size = cache_size / 4;
    open_options_.filter_policy = leveldb::NewBloomFilterPolicy(10);
    open_options_.compression = leveldb::kNoCompression;
    open_options_.max_open_files = 64;
    open_options_.create_if_missing = true;
    // The blocks database options also need the depth comparator.
    leveldb::Options blocks_open_options = open_options_;
    blocks_open_options.comparator = depth_comparator_.get();
    if (!open_db(prefix, "block", db_blocks_, blocks_open_options))
        return false;
    if (!open_db(prefix, "block_hash", db_blocks_hash_, open_options_))
        return false;
    if (!open_db(prefix, "tx", db_txs_, open_options_))
        return false;
    if (!open_db(prefix, "spend", db_spends_, open_options_))
        return false;
    if (!open_db(prefix, "addr", db_address_, open_options_))
        return false;
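
For completeness, leveldb requires the block cache, filter policy, and 
comparator to outlive the databases that use them, so teardown has to mirror 
this setup. A sketch, assuming the member names above:

// Close the databases before deleting the objects they borrow.
db_address_.reset();
db_spends_.reset();
db_txs_.reset();
db_blocks_hash_.reset();
db_blocks_.reset();
// Only now is it safe to delete the shared helpers.
delete open_options_.block_cache;
delete open_options_.filter_policy;
depth_comparator_.reset();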

Original comment by gen...@gmail.com on 9 Jun 2013 at 2:48

GoogleCodeExporter commented 9 years ago
I've even set max_open_files to 20, and I still get the same error on my 
databases.

FATAL: fetch_spend: IO error: database/spend/672223.sst: Too many open files
FATAL: fetch_spend: IO error: database/spend/672223.sst: Too many open files
FATAL: 
fetch_proto_tx(17843f6d6e5ad3d253753f9c0cb75dbd75f37a15871b35369422436d9095c2a6)
: IO error: database/tx/1821552.sst: Too many open files

Original comment by gen...@gmail.com on 9 Jun 2013 at 6:31

GoogleCodeExporter commented 9 years ago
The next version of leveldb will clamp max_open_files to a minimum of 64, 
because lower limits cause erratic behavior. Issue 161 discusses this to some 
extent, but I suspect you've already read it.
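
A hypothetical sketch of what such sanitization might look like (the actual 
bounds and code in the release may differ):

// Clamp a user-supplied option into a sane range instead of honoring
// values that would make the table cache thrash.
static int clip_to_range(int value, int min_value, int max_value)
{
    if (value < min_value) return min_value;
    if (value > max_value) return max_value;
    return value;
}

// For example:
//   options.max_open_files = clip_to_range(options.max_open_files, 64, 50000);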

When you see the "too many open files" error, could you run sudo lsof -p <pid> 
to see how many files are open, and which of them belong to leveldb?
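
If lsof misbehaves, a Linux process can also count its own open descriptors by 
listing /proc/self/fd. A minimal sketch (Linux-specific; the helper name 
count_open_fds is mine):

#include <dirent.h>

// Count the file descriptors currently open in this process by listing
// /proc/self/fd (Linux only). Note the count still includes the descriptor
// that opendir itself holds while reading the directory.
int count_open_fds()
{
    DIR* dir = opendir("/proc/self/fd");
    if (dir == nullptr)
        return -1;
    int count = 0;
    while (readdir(dir) != nullptr)
        ++count;
    closedir(dir);
    return count - 2;  // Ignore the "." and ".." entries.
}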

Original comment by dgrogan@chromium.org on 12 Jun 2013 at 6:20

GoogleCodeExporter commented 9 years ago
Thanks for your prompt answer. I will do that the next time I get the errors. 
I tried last time, but I wasn't able to pipe the output to a file (running as 
a regular user) because the lsof process wouldn't exit.

Original comment by gen...@gmail.com on 12 Jun 2013 at 9:28

GoogleCodeExporter commented 9 years ago
Databases above a certain size (about 40 GB) seem to cause leveldb to open 
every single file in the database without closing anything in between.

Also, for some reason it seems to open every file twice.

Original comment by fullung@gmail.com on 15 Aug 2013 at 3:10

GoogleCodeExporter commented 9 years ago
I also suspect that a large number of concurrent reads (hundreds of threads) 
causes something to go wrong, which then makes LevelDB quickly consume up to a 
million file descriptors when multiple large databases are open.

Original comment by fullung@gmail.com on 15 Aug 2013 at 10:56

GoogleCodeExporter commented 9 years ago
fullung, a standalone reproduction would be sweet. Can you put together a 
little program that creates a database large enough to trigger the issue and 
prints a message when I should run lsof to confirm that too many files have 
been opened?

Original comment by dgrogan@chromium.org on 20 Aug 2013 at 3:44

GoogleCodeExporter commented 9 years ago
This issue may be connected to issue 219: 
https://code.google.com/p/leveldb/issues/detail?id=219&sort=-id

How to reproduce:
1. Fill up the hard disk where your database is stored.
2. Try to insert any record into leveldb.
3. leveldb will crash and leave an *.sst file in the directory (the file will 
be around 4 KB).
4. Open the database again.
5. Try to insert any record again.
6. leveldb will crash again.
7. You will get more *.sst files.
8. Every further reopen attempt crashes leveldb again and leaves yet more 
*.sst files behind, until at some point the database will no longer open 
because there are too many *.sst files, even after you clear your hard disk 
and have enough space again (see the repair sketch below).
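
If a database ends up in this state, leveldb ships a repair entry point, 
leveldb::RepairDB (declared in include/leveldb/db.h), which tries to salvage 
as much data as possible. A minimal sketch; whether it recovers this 
particular disk-full corruption is untested here, and the path is simply the 
one from earlier in this thread:

#include <leveldb/db.h>
#include <iostream>

int main()
{
    // Attempt to salvage a database left inconsistent by a crash.
    // As much data as possible is recovered, but some may be lost.
    leveldb::Options options;
    leveldb::Status status = leveldb::RepairDB("database/spend", options);
    if (!status.ok())
        std::cerr << "repair failed: " << status.ToString() << std::endl;
    return status.ok() ? 0 : 1;
}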

Original comment by feniksgo...@gmail.com on 3 Dec 2013 at 3:17

GoogleCodeExporter commented 9 years ago
Any update on this issue?

I'm running into this issue as well on Android (limit of 1024 file 
descriptors). Once it reaches the 1024 limit, no write operation works any 
longer until I restart the process (which gives some breathing room until it 
reaches the limit again).

My test code has just 2 threads: one thread writing (putting one key and 
deleting the previous key) and one thread iterating over the keys.

I was able to run into this error with ~900 keys, each about 1 MB in size.
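
A rough sketch of a repro along these lines (my reconstruction, not the 
original test code; the path, key names, and counts are made up):

#include <leveldb/db.h>
#include <atomic>
#include <string>
#include <thread>

int main()
{
    leveldb::Options options;
    options.create_if_missing = true;
    leveldb::DB* db = nullptr;
    if (!leveldb::DB::Open(options, "/tmp/fd_repro", &db).ok())
        return 1;
    std::atomic<bool> done{false};
    const std::string value(1 << 20, 'x');  // 1 MB per key.
    // Writer: put one key, then delete the previous one.
    std::thread writer([&] {
        for (int i = 0; i < 900; ++i) {
            db->Put(leveldb::WriteOptions(), "key" + std::to_string(i), value);
            if (i > 0)
                db->Delete(leveldb::WriteOptions(),
                           "key" + std::to_string(i - 1));
        }
        done = true;
    });
    // Reader: iterate over all keys until the writer finishes.
    std::thread reader([&] {
        while (!done) {
            leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());
            for (it->SeekToFirst(); it->Valid(); it->Next()) {}
            delete it;
        }
    });
    writer.join();
    reader.join();
    delete db;
    return 0;
}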

Original comment by r...@google.com on 27 Jun 2014 at 9:58

GoogleCodeExporter commented 9 years ago
FWIW, once it reaches this state, even a single thread that keeps writing and 
deleting keys (a single put and delete) will eventually run into this "Too 
many open files" error. Whether or not iteration happens at the same time 
does not seem to affect the ability to reproduce the issue.

Original comment by r...@google.com on 27 Jun 2014 at 10:59

GoogleCodeExporter commented 9 years ago
In our case, LevelDB was running in a JNI environment, which has a number of 
other files open; combined with the default max_open_files limit of 1000, 
that exceeds the 1024 descriptor limit on Android.

Setting options.max_open_files to a lower value fixes the issue, as TableCache 
seems to honor it properly in our tests.
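
A sketch of that fix; the specific value is illustrative, not taken from the 
original report:

#include <leveldb/options.h>

leveldb::Options options;
// Android processes default to 1024 descriptors, and a JNI app already has
// many of its own open (dex files, sockets, assets), so leave generous
// headroom instead of using leveldb's default max_open_files of 1000.
options.max_open_files = 128;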

Original comment by r...@google.com on 1 Jul 2014 at 6:30