chdb-io / chdb

chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
https://clickhouse.com/docs/en/chdb
Apache License 2.0

Load table error #267

Open agoncear-mwb opened 1 week ago

agoncear-mwb commented 1 week ago


Describe the unexpected behaviour

When using chdb to insert data into a table (a classic MergeTree table) on the local filesystem, it sometimes fails with the following error:

Code: 722. DB::Exception: Waited job failed: Code: 695. DB::Exception: Load job 'load table default.TEST' failed: Code: 49. DB::Exception: Part all_835_861_2_859 intersects next part all_841_858_2_859. It is a bug or a result of manual intervention in the server or ZooKeeper data: Cannot attach table default.TEST from metadata file /tmp/chdb/store/5fa/5faa5bad-0a65-4892-8d70-e9c0e3e6c0e1/TEST.sql from query ATTACH TABLE default.TEST

How to reproduce

Expected behavior

Since the engine is used for inserts only, I would not expect it to fail in this way. Also, the writer process is single-threaded and nothing else interferes with it. The bug appears to be fairly random, with no correlation to the insert load.

The insert process writes a fixed batch of 10240 records at a time, with async_insert=1 set in each query.

Error message and/or stacktrace

Code: 722. DB::Exception: Waited job failed: Code: 695. DB::Exception: Load job 'load table default.TEST' failed: Code: 49. DB::Exception: Part all_835_861_2_859 intersects next part all_841_858_2_859. It is a bug or a result of manual intervention in the server or ZooKeeper data: Cannot attach table default.TEST from metadata file /tmp/chdb/store/5fa/5faa5bad-0a65-4892-8d70-e9c0e3e6c0e1/TEST.sql from query ATTACH TABLE default.TEST
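For anyone reading the part names in that message: MergeTree part directory names encode partition ID, minimum block, maximum block, merge level and, optionally, a mutation version, in that order. The sketch below is only an illustration for decoding the two names from the error and checking whether their block ranges overlap. The helpers `parsePartName` and `overlaps` are hypothetical names, not part of chdb or ClickHouse, and the parser assumes the partition ID contains no underscore (true for `all`).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// partInfo holds the fields encoded in a MergeTree part directory name.
type partInfo struct {
	Partition string
	MinBlock  int64
	MaxBlock  int64
	Level     int64
	Mutation  int64 // 0 when the name carries no mutation suffix
}

// parsePartName decodes names such as "all_835_861_2_859" or "all_1_4_1".
// Assumption: the partition ID itself contains no underscore.
func parsePartName(name string) (partInfo, error) {
	fields := strings.Split(name, "_")
	if len(fields) != 4 && len(fields) != 5 {
		return partInfo{}, fmt.Errorf("unexpected part name %q", name)
	}
	nums := make([]int64, 0, 4)
	for _, f := range fields[1:] {
		n, err := strconv.ParseInt(f, 10, 64)
		if err != nil {
			return partInfo{}, fmt.Errorf("part name %q: %w", name, err)
		}
		nums = append(nums, n)
	}
	p := partInfo{Partition: fields[0], MinBlock: nums[0], MaxBlock: nums[1], Level: nums[2]}
	if len(nums) == 4 {
		p.Mutation = nums[3]
	}
	return p, nil
}

// overlaps reports whether two parts of the same partition share any block number.
func overlaps(a, b partInfo) bool {
	return a.Partition == b.Partition &&
		a.MinBlock <= b.MaxBlock && b.MinBlock <= a.MaxBlock
}

func main() {
	a, err := parsePartName("all_835_861_2_859")
	if err != nil {
		panic(err)
	}
	b, err := parsePartName("all_841_858_2_859")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n%+v\n", a, b)
	fmt.Println("block ranges overlap:", overlaps(a, b))
}
```

Decoded this way, the first part covers blocks 835 to 861 and the second covers 841 to 858, both at level 2, so the two active parts claim overlapping block ranges in the same partition.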


auxten commented 5 days ago

Are you using the session mode to do the insert? It would be super helpful if you could show me some of the code you use for that.

agoncear-mwb commented 5 days ago

Yes, I'm using session mode to insert the data, since I need it for later access.

I'm actually using the Go wrapper of chdb, but since the library is dynamically linked you should be able to reproduce it.

func (s *chdb) FlushEvents(events []*events) error {
    // NOTE: the events slice ALWAYS has 10240 elements, no matter how many real events it contains
    b := sqlbuilder.ClickHouse.NewInsertBuilder().InsertInto(s.fullTableName).SQL(" (*) SETTINGS async_insert=1 ")
    currTime := time.Now().UTC().UnixNano()
    for _, evt := range events {
        if evt == nil {
            continue
        }

        b.Values(
            evt.Accessrights, evt.Accountid, evt.HwId, evt.RuleIdx, evt.EventId,
        )

    }

    q, args := b.Build()
    if len(args) == 0 {
        return nil
    }
    fq, err := sqlbuilder.ClickHouse.Interpolate(q, args)
    if err != nil {
        return err
    }

    _, err = s.dbConnection.Query(fq)
    if err != nil {
        if !(err == sql.ErrNoRows || err.Error() == "result is nil") {
            return err
        }
    }
    return nil
}

agoncear-mwb commented 11 hours ago

@auxten I noticed this log entry at the same time as the bug:

Code: 82. DB::Exception: Database _temporary_and_external_tables already exists. (DATABASE_ALREADY_EXISTS)