ClickHouse / clickhouse-rs

Official pure Rust typed client for ClickHouse DB
https://clickhouse.com
Apache License 2.0

Memory Issues with Inserter #25

Closed. rtkaratekid closed this issue 12 months ago.

rtkaratekid commented 3 years ago

I've been testing clickhouse as a potential db for work and was excited to see a rust clickhouse client. I wrote a small program that randomly generates data matching our db schema and then inserts tons of that data into the database, with the intent of both getting to know clickhouse better and seeing if it meets our insert and query needs.

Running my test, though, I'm coming up against memory errors that look like this issue on the clickhouse repo. I've been trying to troubleshoot, but I'm just not familiar enough with clickhouse yet to nail down what's causing it or what exactly the issue is.

Here's my little test program

// Imports assumed by this snippet (DragonflyRow and its rand_new() helper are shown in a later comment).
use std::{thread, time};

use clickhouse::error::Result; // assuming the crate's Result alias; an anyhow::Result would work the same way
use clickhouse::Client;

#[tokio::main]
async fn main() -> Result<()> {

    let row_size = std::mem::size_of::<DragonflyRow>();
    let bytes_in_billion_rows = 1_000_000_000 * row_size;

    // insert one billion rows in batches of 10,000
    // I've done this in various batch sizes from 10 to 10,000
    let total_rows_to_insert = 1_000_000_000;
    let batch_size = 10_000;

    // start a clickhouse client
    let client = Client::default().with_url("http://localhost:8123");

    // create an "inserter"
    let mut inserter = client
        .inserter("dragonfly")? // table name
        .with_max_entries(10_000);

    let mut inserted_so_far = 0;
    for i in 1..((total_rows_to_insert / batch_size) + 1) {
        for j in 1..batch_size + 1 {
            // the object inserted is a randomly generated/populated struct that matches the db schema
            inserter.write(&DragonflyRow::rand_new()).await?;
            inserted_so_far = i * j;
        }

        inserter.commit().await?;
        thread::sleep(time::Duration::from_secs(2)); // sleep two seconds to not potentially overwhelm clickhouse
    }

    // close the inserter
    inserter.end().await?;

    Ok(())
}
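
One aside on the snippet above: std::thread::sleep parks a tokio worker thread for the full two seconds. If the backoff is kept, the non-blocking form (assuming tokio 1.x) would look like this:

use std::time::Duration;
use tokio::time::sleep;

// Non-blocking two-second pause between batches; yields the worker thread back
// to the tokio runtime instead of parking it.
async fn pause_between_batches() {
    sleep(Duration::from_secs(2)).await;
}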

My table is very simple: no nested objects, and the engine is just a MergeTree ordered by a timestamp column.
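
A minimal sketch of that kind of table, created through the same client (column names and types here are illustrative guesses, not the real schema; the real table has a column per DragonflyRow field, and query(...).execute() is the crate's DDL path):

use clickhouse::{error::Result, Client};

// Illustrative only: a plain MergeTree ordered by a timestamp column,
// with no nested types, mirroring the description above.
async fn create_example_table(client: &Client) -> Result<()> {
    client
        .query(
            "CREATE TABLE IF NOT EXISTS dragonfly_example (
                timestamp DateTime,
                alert_id  Int64,
                dest_ip   String,
                dest_port UInt16
            )
            ENGINE = MergeTree
            ORDER BY timestamp",
        )
        .execute()
        .await
}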

When I run this with batch sizes of <1,000 rows, I get this error

Error: BadResponse("Code: 33. DB::Exception: Cannot read all data. Bytes read: 582754. Bytes expected: 1838993.: (at row 1)\n: While executing BinaryRowInputFormat. (CANNOT_READ_ALL_DATA) (version 21.11.3.6 (official build))")

When I run this with a batch size of 10,000, I get this

Error: BadResponse("Code: 49. DB::Exception: Too large size (9223372036854775808) passed to allocator. It indicates an error.: While executing BinaryRowInputFormat. (LOGICAL_ERROR) (version 21.11.3.6 (official build))")

Based on the information in the similar clickhouse issues I've found, I think there's something going on with how the BinaryRowInputFormat queries are being executed, but being newer to clickhouse I'm not very confident that I'm correct about that. Today I hope to follow up with a similar test that skips this clickhouse client library and instead just connects to the port and sends raw http requests, to see whether I get the same issues or not.
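
Roughly, that raw-HTTP test could look something like this (a sketch using the reqwest crate and the JSONEachRow format rather than RowBinary; the two rows are placeholders with only a couple of the real columns, and omitted columns are left to the server's defaults):

// A sketch of the raw-HTTP check: POST the rows to ClickHouse's HTTP interface
// with the INSERT in the `query` parameter and newline-delimited JSONEachRow in
// the request body.
use reqwest::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let http = Client::new();

    // Placeholder rows; a real test would generate them like DragonflyRow::rand_new().
    let body = concat!(
        "{\"timestamp\":\"2021-11-20 10:00:00\",\"alert_id\":1}\n",
        "{\"timestamp\":\"2021-11-20 10:00:01\",\"alert_id\":2}\n",
    );

    let resp = http
        .post("http://localhost:8123/")
        .query(&[("query", "INSERT INTO dragonfly FORMAT JSONEachRow")])
        .body(body)
        .send()
        .await?;

    println!("status: {}", resp.status());
    println!("{}", resp.text().await?);
    Ok(())
}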

I'm on Ubuntu 20.04 with 4 CPU cores, 10 GB RAM, and a 32 GB disk. Looking at htop output while the program is running, I don't see much that helps aside from a lot of clickhouse-server threads (maybe about 50).

For reference, I have no problem inserting a 4GB json file with clickhouse-client --query "<insert statement>" < file.json

I'm happy to help if there are more questions about this issue.

rtkaratekid commented 3 years ago

When trying with insert instead of inserter

    let mut inserted_so_far = 0;
    for i in 1..((total_rows_to_insert/batch_size) + 1) {
        let mut insert = client.insert("dragonfly")?;

        for j in 1..batch_size+1 {
            insert.write(&DragonflyRow::rand_new()).await?;
            inserted_so_far = i * j;
        }
        insert.end().await?;
        thread::sleep(time::Duration::from_secs(2)); // sleep two seconds to not overwhelm clickhouse
    }

The error:

Error: BadResponse("Code: 241. DB::Exception: Memory limit (for query) exceeded: would use 4.00 EiB (attempt to allocate chunk of 4611686018429479655 bytes), maximum: 9.31 GiB: While executing BinaryRowInputFormat. (MEMORY_LIMIT_EXCEEDED) (version 21.11.3.6 (official build))")

loyd commented 3 years ago

Thanks for reporting. Which version of CH are you using (upd: I see, 21.11.3.6, sorry)? Also, what does DragonflyRow look like, or at least, what is bytes_in_billion_rows?

I haven't run into this before, even using the inserter at up to 300k/s per thread.

rtkaratekid commented 3 years ago

I'm always happy to help/contribute to open source :)

Yeah so I actually just tried one simple insert, which I know has worked in the past

    let client = Client::default().with_url("http://localhost:8123");
    let mut insert = client.insert("dragonfly")?;
    insert.write(&DragonflyRow::rand_new()).await?;
    insert.end().await?;

And actually got an error

DB::Exception: Memory limit (for query) exceeded: would use 64.00 PiB (attempt to allocate chunk of 72057594038284184 bytes), maximum: 9.31 GiB: While executing BinaryRowInputFormat. (MEMORY_LIMIT_EXCEEDED) (version 21.11.3.6 (official build)

This actually makes me suspect that it's clickhouse itself somehow, because this single insert has definitely worked in the past. I'm not 100% sure what the issue might be, though. I just rebuilt my VM, reinstalled clickhouse, and created the table (this is not a cluster), so I'm not sure whether any setting should be changed from the vanilla defaults.

Edit to add: when I restart the clickhouse-server and then look at htop, I see an initial entry for one or two processes, but over a few seconds they seem to fork and spawn new processes and/or threads. Perhaps that's the issue, but since this is a fresh install I'm not sure why that would happen or whether it would cause this issue. I also just re-verified that I can insert data using clickhouse-client.
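
Since the RowBinary payload carries no type tags, one sanity check worth doing here is to list the table's columns and compare them, one by one, against the DragonflyRow fields shown in full below. A sketch (ColumnInfo and dump_schema are illustrative names, and it assumes the crate's query(...).fetch_all API):

use clickhouse::{error::Result, Client, Row};
use serde::Deserialize;

#[derive(Row, Deserialize)]
struct ColumnInfo {
    name: String,
    ty: String,
}

// List the dragonfly table's columns in declaration order so each one can be
// checked against the corresponding DragonflyRow field type.
async fn dump_schema(client: &Client) -> Result<()> {
    let cols = client
        .query(
            "SELECT name, type AS ty
             FROM system.columns
             WHERE database = currentDatabase() AND table = 'dragonfly'
             ORDER BY position",
        )
        .fetch_all::<ColumnInfo>()
        .await?;

    for col in &cols {
        println!("{}: {}", col.name, col.ty);
    }
    Ok(())
}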

DragonflyRow looks like this. It also has a rand_new method that just populates the fields with random values.

#[derive(Debug, Row, Serialize, Deserialize)]
struct DragonflyRow {
    timestamp: String,
    alert_id: i64,
    alert_description: String,
    alert_msg: String,
    alert_severity: i64,
    app_protocol: String,
    client_to_server_duration: u64, 
    client_to_server_packet_count: u64,
    client_to_server_bytes: u64,
    client_to_server_tcp_flags: u32, 
    client_to_server_entropy: f32,
    server_to_client_duration: u64, 
    server_to_client_packet_count: u64,
    server_to_client_bytes: u64,
    server_to_client_tcp_flags: u32, 
    server_to_client_entropy: f32,
    community_id: String,
    customer:  String,
    ip_version: u8, 
    dest_hostname: String,
    dest_ip: String,
    dest_port: u16,
    dest_mac: String,
    src_hostname: String,
    src_ip: String,
    src_port: u16,
    src_mac: String,
    kerberos_domain: String,
    kerberos_hostname: String,
    kerberos_username: String,
    web_hostname: String,
    dns_query: String,
    dns_query_type: i64,
    dns_reply_code: i64,
    end_reason: String,
    flow_risk: Vec<u32>, 
    frequency: i64, 
    frequency_id: String, 
    frequency_reason: String, 
    app_protocol_guess: u8, 
    http_content_type: String,
    http_response_code: i16, 
    http_url: String,
    http_user_agent: String,
    model_confidence: f32,
    model_dga_prediction: f32,
    model_discard: u8, 
    model_label_index: u8,
    model_dga: String,
    model_epa: String,
    username: String, 
    passwd: String, 
    payload_lengths: Vec<i64>,
    protocol: String,
    ssh_client_hassh: String,
    ssh_server_hassh: String,
    timeval_diffs: Vec<i64>,
    tls_alpn: String,
    tls_cipher: String,
    tls_cipher_id: i64,
    tls_cipher_level: String,
    tls_ja3c: String,
    tls_ja3s: String,
    tls_not_after: i64,
    tls_not_before: i64,
    tls_cert_hash: String,
    tls_version: String,
    tls_version_id: i64,
    tls_supported_versions: String, 
    pcr_payload: f32, 
    pcr_goodput: f32,
    trusted_domain: u8,
    unpopular_dest: u8,
    vlan: u16
}
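
For reference, under RowBinary each of these Rust types has to pair with a specific ClickHouse column type in the same position. An illustrative excerpt with the expected pairings as comments (the dragonfly table's actual column types aren't shown in this thread, so these are the mappings the crate assumes, not a claim about the real schema):

use clickhouse::Row;
use serde::{Deserialize, Serialize};

// Excerpt only: a mismatch in any one position shifts how the server parses
// every byte that follows.
#[derive(Debug, Row, Serialize, Deserialize)]
struct DragonflyRowExcerpt {
    timestamp: String,               // String (a DateTime column would instead expect a 4-byte value)
    alert_id: i64,                   // Int64
    client_to_server_tcp_flags: u32, // UInt32
    ip_version: u8,                  // UInt8
    dest_port: u16,                  // UInt16
    http_response_code: i16,         // Int16
    model_confidence: f32,           // Float32
    flow_risk: Vec<u32>,             // Array(UInt32)
    vlan: u16,                       // UInt16
}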

18601673727 commented 3 years ago

Today I got probably the same error:

2021-12-04 01:52:01 ERROR project: BadResponse("Code: 49. DB::Exception: Too large size (9223372036854775808) passed to allocator. It indicates an error.: While executing BinaryRowInputFormat. (LOGICAL_ERROR) (version 21.11.4.14 (official build))")

My struct:

pub struct Dn {
    pub id: [u8; 32],
    pub cid: [u8; 32],
    pub did: [u8; 32],
    pub dti: i8,
    pub cti: i8,
    pub t: String,
    pub cn: String,
    pub cs1: String,
    pub cs2: String,
    pub ptd: String,
    pub fr: String,
    pub bi: String,
    pub lgts: String,
    pub lgts_r: String,
    pub lgts_d: String,
    pub co: String,
    pub vdt: String,
    pub jdg: String,
    pub apdx: String,
    pub to: u32,
    pub po: u32,
    pub created_at: u32,
    pub updated_at: u32,
    pub deleted_at: u32,
}

18601673727 commented 3 years ago

Sorry, I figured out that my issue was due to a mismatched data type between the migration and the struct; changing u32 to u16 solved it.
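
For anyone else hitting these errors, a minimal sketch of that kind of mismatch (illustrative table and struct names, not the real schema): RowBinary carries no per-field type tags, so if a struct field is wider than its column, the extra bytes get parsed as the start of the next value, and a stray byte pattern read as a string length or array size is how the server can end up trying to allocate absurd amounts like 9223372036854775808 bytes (2^63).

use clickhouse::Row;
use serde::Serialize;

// Illustrative only. Suppose the table was created as:
//     CREATE TABLE t (id UInt64, port UInt16, name String)
//     ENGINE = MergeTree ORDER BY id
// RowBinary has no per-field type tags, so the struct must mirror the column
// types exactly.

#[derive(Row, Serialize)]
struct GoodRow {
    id: u64,      // UInt64 -> 8 bytes
    port: u16,    // UInt16 -> 2 bytes
    name: String, // String -> varint length prefix + bytes
}

#[derive(Row, Serialize)]
struct BadRow {
    id: u64,
    port: u32, // mismatch: 4 bytes are sent where the server expects 2 (UInt16);
               // the 2 extra bytes are then read as the start of `name`'s length
               // prefix, which can decode to a huge value and trigger errors like
               // CANNOT_READ_ALL_DATA or "Too large size passed to allocator".
    name: String,
}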

loyd commented 2 years ago

Maybe related: https://github.com/ClickHouse/ClickHouse/issues/37420

czzczzczz9898 commented 2 months ago

I'm facing a similar issue with 22.5.1.2079. When I try to insert, I get: Memory limit (for query) exceeded: would use 64.00 PiB (attempt to allocate chunk of 72057594039063687 bytes), maximum: 279.40 GiB. But there is no way this insert can use 64 PiB (I also sometimes receive a 4 EiB error). Did you solve this issue?