BemiHQ / BemiDB

Postgres read replica optimized for analytics
https://bemidb.com
GNU Affero General Public License v3.0

`panic: EOF` when attempting to sync for the first time #2

Closed (leighleighleigh closed this issue 2 weeks ago)

leighleighleigh commented 3 weeks ago

Description

When I run make sync to copy tables from the remote, it immediately panics.

Environment

| Description | Command | Output |
|---|---|---|
| Commit | git describe --always | 6d6689b |
| Platform | uname -a | Linux gpu-3 5.15.0-113-generic #123-Ubuntu SMP Mon Jun 10 08:16:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Distro | lsb_release -d | Ubuntu 22.04.4 LTS |
| PostgreSQL Version | psql -c 'SELECT version();' | PostgreSQL 15.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 12.3.0, 64-bit |

Observed behaviour

leigh@gpu-3:BemiDB$ make sync
devbox run --env-file .env "cd src && go run . sync"
2024/11/08 09:35:49 [INFO] Syncing public.yield_reports...
2024/11/08 09:35:49 [DEBUG] Copied 191 row(s) into /tmp/public.yield_reports1822809755
panic: EOF

goroutine 1 [running]:
main.PanicIfError({0x2b56bc0?, 0x388da20?}, {0x0?, 0xc000187cb0?, 0xc0004078a0?})
    /home/leigh/bemidb/BemiDB/src/utils.go:14 +0x9c
main.(*Syncer).syncFromPgTable(0xc0000124f8, 0xc00042ec60, {{0xc000406ff8?, 0x6?}, {0xc0004078a0?, 0x0?}})
    /home/leigh/bemidb/BemiDB/src/syncer.go:103 +0x305
main.(*Syncer).SyncFromPostgres(0xc0000124f8)
    /home/leigh/bemidb/BemiDB/src/syncer.go:46 +0x1e5
main.syncFromPg(0x392a8c0)
    /home/leigh/bemidb/BemiDB/src/main.go:60 +0x1d
main.main()
    /home/leigh/bemidb/BemiDB/src/main.go:25 +0x9c
exit status 2
Error: error running script "cd src && go run . sync" in Devbox: exit status 1

make: *** [Makefile:17: sync] Error 1

I also noted that the file /tmp/public.yield_reports1822809755 does not exist.

Extra info

The schema of the table being synchronized (public.yield_reports).

                   Table "public.yield_reports"
    Column    │       Type        │ Collation │ Nullable │ Default 
──────────────┼───────────────────┼───────────┼──────────┼─────────
 id           │ bigint            │           │ not null │ 
 harvest_date │ date              │           │          │ 
 house_region │ character varying │           │          │ 
 mass_density │ real              │           │          │ 
 mass_kg      │ bigint            │           │          │ 
 quality      │ character varying │           │          │ 
Indexes:
    "index_yield_reports" btree (harvest_date, house_region) INCLUDE (mass_density, quality)

It also looks like make up is working as intended?

leigh@gpu-3:BemiDB$ make up
devbox run --env-file .env "cd src && go run ."
2024/11/08 09:36:37 [INFO] BemiDB: Listening on 127.0.0.1:54321
2024/11/08 09:36:37 [DEBUG] DuckDB: No init file found at ./init.sql
2024/11/08 09:36:37 [DEBUG] Querying DuckDB: INSTALL iceberg
2024/11/08 09:36:37 [DEBUG] Querying DuckDB: LOAD iceberg
2024/11/08 09:36:37 [DEBUG] Querying DuckDB: CREATE SCHEMA public
2024/11/08 09:36:37 [DEBUG] Querying DuckDB: USE public
2024/11/08 09:36:37 [INFO] DuckDB: Connected

arjunlol commented 3 weeks ago

Thanks for all the detail here @leighleighleigh. We're taking a look!

exAspArk commented 2 weeks ago

Hey @leighleighleigh — thanks a lot for submitting such a detailed issue description!

I tried to test it with the following test table and data:

CREATE TABLE public.yield_reports (
    id BIGINT NOT NULL,
    harvest_date DATE,
    house_region VARCHAR,
    mass_density REAL,
    mass_kg BIGINT,
    quality VARCHAR,
    PRIMARY KEY (id)
);

INSERT INTO public.yield_reports (id, harvest_date, house_region, mass_density, mass_kg, quality) VALUES
(1, '2024-09-15', 'North Region', 0.75, 1500, 'High'),
(2, '2024-09-17', 'South Region', 0.80, 1200, 'Medium'),
(3, '2024-09-20', 'East Region', 0.70, 1800, 'High'),
(4, '2024-09-22', 'West Region', 0.85, 1600, 'Low'),
(5, '2024-09-25', 'Central Region', 0.90, 1400, 'Medium');

^ Syncing this table worked fine for me. My guess is that the problem lies in reading the data itself, not in the schema.

Would you be able to pull the latest BemiDB changes, test it on a single row in your public.yield_reports table and share this row after anonymizing any sensitive data?
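
One possible way to anonymize a row before sharing it is sketched below. It masks letters and digits in chosen text columns while keeping each value's length, punctuation, and emptiness, since those properties may matter for reproducing a data-dependent bug. The maskValue helper and the column positions are illustrative assumptions, not part of BemiDB.

```go
package main

import (
	"fmt"
	"unicode"
)

// maskValue replaces every letter with 'x' and every digit with '0'.
// Punctuation, whitespace, and the number of characters are preserved,
// and an empty string stays empty.
func maskValue(s string) string {
	out := make([]rune, 0, len(s))
	for _, r := range s {
		switch {
		case unicode.IsLetter(r):
			out = append(out, 'x')
		case unicode.IsDigit(r):
			out = append(out, '0')
		default:
			out = append(out, r)
		}
	}
	return string(out)
}

func main() {
	// Sample row in the column order of public.yield_reports.
	row := []string{"17", "2024-09-15", "North Region", "0.75", "1500", "High"}

	// Mask only the text columns (house_region and quality); masking typed
	// columns such as dates could produce invalid values like 0000-00-00.
	for _, i := range []int{2, 5} {
		row[i] = maskValue(row[i])
	}
	fmt.Printf("%q\n", row)
	// prints ["17" "2024-09-15" "xxxxx xxxxxx" "0.75" "1500" "xxxx"]
}
```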

ShinoharaHaruna commented 2 weeks ago

I encountered a similar issue. I ran bemidb --pg-database-url postgres://postgres:postgres@localhost:5432/dbname sync against my local Dockerized PostgreSQL instance and hit the same panic.

❯ bemidb --pg-database-url postgres://postgres:test@localhost:5432/postgres sync  
2024/11/09 18:56:36 [INFO] Syncing public.users...
panic: EOF

goroutine 1 [running]:
main.PanicIfError({0x2b70960?, 0x38b05d0?}, {0x0?, 0xc0000edcb0?, 0xc0003f10c8?})
    /app/utils.go:14 +0x9c
main.(*Syncer).syncFromPgTable(0xc000012540, 0xc0003ebe60, {{0xc0003f1088?, 0x6?}, {0xc0003f10c8?, 0x0?}})
    /app/syncer.go:103 +0x305
main.(*Syncer).SyncFromPostgres(0xc000012540)
    /app/syncer.go:46 +0x1e5
main.syncFromPg(0x394d740)
    /app/main.go:60 +0x1d
main.main()
    /app/main.go:25 +0x9c

The users table is:

CREATE TABLE users (
    id serial4 NOT NULL,
    "name" varchar(100) NOT NULL,
    email varchar(100) NOT NULL,
    created_at timestamp DEFAULT CURRENT_TIMESTAMP NULL,
    CONSTRAINT users_email_key UNIQUE (email),
    CONSTRAINT users_pkey PRIMARY KEY (id)
);
INSERT INTO
    users (name, email)
VALUES
    ('Alice', 'alice@example.com'),
    ('Bob', 'bob@example.com'),
    ('Charlie', 'charlie@example.com');

exAspArk commented 2 weeks ago

I believe this should be resolved in https://github.com/BemiHQ/BemiDB/pull/8 by @sikinatm now. Would you be able to install the latest v0.3.0 version and try again?

leighleighleigh commented 2 weeks ago

Looks to be working, thanks!