scylladb / scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra
http://scylladb.com
GNU Affero General Public License v3.0

Load and Stream issue with Memory Allocation #16067

Open sjoshi10 opened 10 months ago

sjoshi10 commented 10 months ago

Running into issues while trying to load one of the tables. I've retried multiple times, dropping the table each time, and still get the same error. Not sure what is causing it. Other tables work fine, but this one table is giving us an issue.

Installation details

Scylla version :  5.2.9-0.20230920.5709d0043978
Cluster size: 6
Ubuntu 22.06

Hardware details (for performance issues)

24 CPU 
502GB RAM 

This is the error I'm getting:

Nov 13 16:25:26 scylladb-02 scylla[3870]:  [shard 14] storage_proxy - Exception when communicating with 10.31.13.102, to read from system.batchlog: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:26 scylladb-02 scylla[3870]:  [shard  0] batchlog_manager - Exception in batch replay: exceptions::read_failure_exception (Operation failed for system.batchlog - received 0 responses and 1 failures from 1 CL=ONE.)
Nov 13 16:25:26 scylladb-02 scylla[3870]:  [shard  0] migration_manager - Requesting schema pull from 10.31.2.110:0
Nov 13 16:25:26 scylladb-02 scylla[3870]:  [shard  0] migration_manager - Pulling schema from 10.31.2.110:0
Nov 13 16:25:26 scylladb-02 scylla[3870]:  [shard  0] schema_tables - Schema version changed to f36da9c7-de1c-3909-a119-08f591baa675
Nov 13 16:25:26 scylladb-02 scylla[3870]:  [shard  0] migration_manager - Schema merge with 10.31.2.110:0 completed
Nov 13 16:25:26 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:27 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:27 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:27 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:27 scylladb-02 scylla[3870]:  [shard  0] storage_proxy - Exception when communicating with 10.31.13.102, to read from system_distributed.service_levels: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:34 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:34 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] migration_manager - Requesting schema pull from 10.31.2.110:0
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] migration_manager - Pulling schema from 10.31.2.110:0
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] schema_tables - Schema version changed to f36da9c7-de1c-3909-a119-08f591baa675
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] migration_manager - Schema merge with 10.31.2.110:0 completed
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] mutation_reader - shard_reader::close(): read_ahead on shard 46 failed: std::bad_alloc (std::bad_alloc)
Nov 13 16:25:37 scylladb-02 scylla[3870]:  [shard  0] storage_proxy - Exception when communicating with 10.31.13.102, to read from system_distributed.service_levels: std::bad_alloc (std::bad_alloc)
avikivity commented 10 months ago

@asias do we have a concurrency problem with load_and_stream?

asias commented 10 months ago

We load and process only 16 sstables per shard at a time.

asias commented 10 months ago

502 GB / (24 cores × 2 threads) ≈ 10 GB per shard

asias commented 10 months ago

@sjoshi10 you can try moving fewer sstables at a time and then run load and stream. Let it finish, then repeat until all files are processed.

sjoshi10 commented 10 months ago

@asias is there an option to do that? I basically have 10TB of data from a snapshot in the upload directory. The snapshot was created on another cluster.

asias commented 10 months ago

You can do it with a simple script:

for each batch of files:
    copy the batch into the scylla upload directory
    run load and stream and let it finish
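A hedged sketch of such a script (paths, the batch size, and the helper names are assumptions for illustration; the load-and-stream pass itself is triggered with nodetool refresh --load-and-stream, and Scylla removes the files from the upload directory as it consumes them, which the sketch emulates):

```python
import shutil
import subprocess  # only needed for the real nodetool call
from pathlib import Path

def generation_of(name: str) -> str:
    # "md-195655-big-Data.db" -> "195655"; assumes the md-<gen>-big-* naming
    return name.split("-")[1]

def load_in_batches(snapshot_dir: Path, upload_dir: Path,
                    keyspace: str, table: str,
                    batch_size: int = 16, dry_run: bool = True) -> int:
    """Copy whole sstables (all components of a generation) into the
    upload directory, batch_size sstables at a time, triggering
    load-and-stream after each batch. Returns the number of batches."""
    gens = sorted({generation_of(p.name)
                   for p in snapshot_dir.glob("md-*-big-*")})
    batches = 0
    for i in range(0, len(gens), batch_size):
        for gen in gens[i:i + batch_size]:
            # copy every component file of this sstable, not just Data.db
            for comp in snapshot_dir.glob(f"md-{gen}-big-*"):
                shutil.copy2(comp, upload_dir / comp.name)
        if not dry_run:
            subprocess.run(["nodetool", "refresh", "--load-and-stream",
                            keyspace, table], check=True)
        # Scylla consumes the uploaded files on success; emulate that here
        for leftover in upload_dir.glob("md-*-big-*"):
            leftover.unlink()
        batches += 1
    return batches
```

Keeping each batch small bounds how many sstables are open in memory at once, which is the point of the workaround.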

sjoshi10 commented 10 months ago

Alright, I'll give this a try and will let you know if I run into the same issue.

sjoshi10 commented 10 months ago

When I try to load some of the files instead of all of them, I get errors like this:

md-195655-big-CompressionInfo.db: file not found)

This file exists in the directory I'm copying from. Is there a better way to copy the files? It seems like there are dependencies between files.

avikivity commented 10 months ago

But @asias we want to fix the bug too, not just work around it.

raphaelsc commented 10 months ago

I remember that we were worried about having to give up some optimizations, like the one I introduced that sorts sstables by their first token. To keep it, we can read the minimum needed from disk to find each sstable's first token (off the top of my head, we can skip to the end of the summary file and retrieve the first key), then sort the file names by first token, and then proceed to open them incrementally, taking the desired target concurrency into account.
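The design above can be sketched as follows. This is a hedged, pure-Python illustration of the idea only, not Scylla code (the real implementation would be C++/seastar); read_first_token and process are stand-ins for the cheap summary-file read and the actual sstable open/stream:

```python
import asyncio

async def load_sorted_with_bounded_concurrency(sstables, read_first_token,
                                               process, concurrency=16):
    """Read each sstable's first token cheaply, sort by it so the
    token-order optimization is preserved, then process the sstables
    with at most `concurrency` in flight at once."""
    # 1. Read only the minimum needed to learn each first token.
    tokens = [(await read_first_token(s), s) for s in sstables]
    # 2. Sort by first token (file names alone do not give token order).
    tokens.sort(key=lambda t: t[0])
    # 3. Open/process incrementally under a concurrency limit.
    sem = asyncio.Semaphore(concurrency)

    async def one(sst):
        async with sem:
            await process(sst)

    await asyncio.gather(*(one(s) for _, s in tokens))
```

The key property is that only the tiny first-token reads are done for all sstables up front; the memory-heavy opens stay bounded by the semaphore.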

avikivity commented 10 months ago

We're talking about bad_alloc here, not optimizations.

raphaelsc commented 10 months ago

> We're talking about bad_alloc here, not optimizations.

I was of course talking about bad_alloc too, and I think you will agree it is better to fix a problem without losing a good existing optimization. Limiting the number of sstables load and stream works with at any point in time has to be done exactly as I suggested, so as not to lose the optimization; I am leaving this as an instruction for whoever gets to work on it.

asias commented 10 months ago

> But @asias we want to fix the bug too, not just work around it.

Of course. The workaround helps us understand the problem too, in addition to helping the user move forward instead of waiting a long time for a solution.

mykaul commented 8 months ago

ping @asias for next steps here.

asias commented 8 months ago

> When I try to load some of the files instead of all of them, I get errors like this:
>
> md-195655-big-CompressionInfo.db: file not found)
>
> This file exists in the directory I'm copying from. Is there a better way to copy the files? It seems like there are dependencies between files.

Yes. There are multiple component files for a given sstable, and we need to copy all of them. E.g., you can run: cp md-195655-big* my_dst_dir to copy all of that sstable's components.
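A hedged sketch of copying one sstable as a complete unit (helper name and paths are illustrative; it relies on the TOC.txt component, which lists the sstable's other component files, so a missing component fails loudly instead of surfacing later as the "file not found" error above):

```python
import shutil
from pathlib import Path

def copy_sstable(src_dir: Path, dst_dir: Path, prefix: str) -> list:
    """Copy every component of one sstable (e.g. prefix 'md-195655-big'),
    failing up front if any component named in its TOC.txt is absent."""
    toc = src_dir / f"{prefix}-TOC.txt"
    required = [line.strip()
                for line in toc.read_text().splitlines() if line.strip()]
    missing = [c for c in required
               if not (src_dir / f"{prefix}-{c}").exists()]
    if missing:
        raise FileNotFoundError(f"{prefix}: missing components {missing}")
    for comp in required:
        shutil.copy2(src_dir / f"{prefix}-{comp}", dst_dir / f"{prefix}-{comp}")
    return required
```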

asias commented 8 months ago

> ping @asias for next steps here.

We need a reproducer, and there are too few details in the report. I suspect it is a generic issue when loading too many sstables into memory from the upload directory. The sstable loading happens before we do load and stream. Load and stream itself only processes 16 sstables at a time, so I do not think it causes too much memory pressure.