Closed prrapo closed 4 years ago
Experiencing the same issue (I think)
2020-05-04 18:05:02.959 UTC [41] STATEMENT: COPY baseballdatabank_parks(park_id, park_name, park_alias, city, state, country) FROM PROGRAM 'zstd --rm -cd /data/baseballdatabank/parks.csv.zst' WITH (FORMAT CSV, FORCE_NULL(park_id, park_name, park_alias, city, state, country));
psql:/docker-entrypoint-initdb.d/postgres_cstore_fdw.sql:527: ERROR: value too long for type character varying(45)
Looked through and found the specific offender for the postgres_cstore issue. The aliases section for Guaranteed Rate Field (White Sox) is 50+ characters long. I tried changing the DDL to varchar(55) then committing a new image and running that but it never got past the parks.csv.zst file load even though it did not send an error message this time. I will add the disclaimer though that I am not super familiar with docker's workings.
Quick update here: I've fixed the bugs above, but there are some Retrosheet duplication issues to deal with. Hoping to have it all ironed out this week.
Should be good now, I ended up having to fork/edit the data manually. Please reopen if you're still having problems.
Tested a few of the new images and three of the containers of were exiting shortly after launching. Removed the image and cleared the cache with each of these three and retried but got same error codes. Outputted their logs, see below, and saw these issues. I only managed to try postgres, postres_cstore, mysql, and clickhouse. Clickhouse worked with no issues. Docker version 19.03.8, build afacb8b running on CentOS Linux release 7.8.2003 (Core). These could probably be separated into two different issues but I'm grouping them together because they're both a result of the 2020 updates.
Postgres Log
2020-05-02 05:15:21.721 UTC [42] ERROR: duplicate key value violates unique constraint "allstar_full_pkey" 2020-05-02 05:15:21.721 UTC [42] DETAIL: Key (player_id, year_id, game_num)=(hermabi01, 1934, 0) already exists. 2020-05-02 05:15:21.721 UTC [42] CONTEXT: COPY allstar_full, line 69 2020-05-02 05:15:21.721 UTC [42] STATEMENT: COPY baseballdatabank.allstar_full(player_id, year_id, game_num, game_id, team_id, lg_id, gp, starting_pos) FROM PROGRAM 'zstd --rm -cd /data/baseballdatabank/allstar_full.csv.zst' WITH (FORMAT CSV, FORCE_NULL(player_id, year_id, game_num, game_id, team_id, lg_id, gp, starting_pos)); psql:/docker-entrypoint-initdb.d/postgres.sql:535: ERROR: duplicate key value violates unique constraint "allstar_full_pkey" DETAIL: Key (player_id, year_id, game_num)=(hermabi01, 1934, 0) already exists. CONTEXT: COPY allstar_full, line 69
MySQL Log
/usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/mysql.sql ERROR 1062 (23000) at line 536: Duplicate entry 'hermabi01-1934-0' for key 'PRIMARY'
Note: An issue exists on baseballdatabank for this Billy Herman record. He played twice in the game. 🤷♂️ https://github.com/chadwickbureau/baseballdatabank/issues/104
Postgres_cstore Log
/data/baseballdatabank/parks.csv.zst: 11602 bytes 2020-05-02 05:23:59.954 UTC [40] ERROR: value too long for type character varying(45) 2020-05-02 05:23:59.954 UTC [40] STATEMENT: COPY baseballdatabank_parks(park_id, park_name, park_alias, city, state, country) FROM PROGRAM 'zstd --rm -cd /data/baseballdatabank/parks.csv.zst' WITH (FORMAT CSV, FORCE_NULL(park_id, park_name, park_alias, city, state, country)); psql:/docker-entrypoint-initdb.d/postgres_cstore_fdw.sql:527: ERROR: value too long for type character varying(45)