dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io

Couldn't write to #<SB-SYS:FD-STREAM Broken pipe #371

Closed stevetoldme closed 8 years ago

stevetoldme commented 8 years ago
pgloader version "3.3.8fc9a47"
compiled with SBCL 1.3.3
CentOS 7

All tables get created and things seem to be running fine, but then, while creating indexes (some tables have 60+ million rows), everything stops. The thread in Postgres is idle and there are no connections to MySQL. With verbose output I sometimes get an error:

An unhandled error condition has been signalled:
   Couldn't write to #<SB-SYS:FD-STREAM
                       for "socket 192.168.100.72:42530, peer: 192.168.100.73:3306"
                       {100731E8B3}>:
     Broken pipe
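By "idle" above I mean what pg_stat_activity shows for the pgloader backend while everything is stopped; a sketch of the kind of check I mean (not necessarily my exact command):

psql -d db -c "SELECT pid, state, query FROM pg_stat_activity;"   # the loader's backend just sits in state 'idle'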


Here is my current run, but I have also tried without batching, with larger numbers, and with nothing but the basic LOAD / FROM / INTO lines.

LOAD database
  FROM  mysql://user:pass@192.168.100.73/db
  INTO  postgresql://postgres:pass@localhost/db
  WITH  batch rows = 1000,
    batch size = 12MB,
    include drop, create tables, create indexes, foreign keys
  CAST type datetime to timestamptz drop default drop not null using zero-dates-to-null,
       type date drop not null drop default using zero-dates-to-null
  BEFORE LOAD DO $$ CREATE SCHEMA IF NOT EXISTS transaction_52qa; $$;

(I have tried with no batch settings, and with rows = 10000 / size = 128MB.) I boosted the max packet size on MySQL and upped the connection timeouts as well. I'm new to Postgres, so I'm not sure what to tweak there.
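For reference, the MySQL-side knobs I mean can be set like this (the values here are illustrative, not exactly what I used):

mysql -e "SET GLOBAL max_allowed_packet = 1073741824;"   # allow up to 1 GB packets
mysql -e "SET GLOBAL net_read_timeout = 600;"            # more time before MySQL gives up reading
mysql -e "SET GLOBAL net_write_timeout = 600;"           # more time writing, to avoid the broken pipe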

Is 60 million rows just too much?

Thank you.

dimitri commented 8 years ago

We have had reports of cases where pgloader handled billions of rows without breaking a sweat, so I don't think the number of rows is the problem here. Do you have any MySQL or PostgreSQL logs related to the broken pipe problem, or some network setting or facility that could be causing it?

You may also try pgloader in --debug mode to get more details from pgloader itself...
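Something along these lines, redirecting the output to a file since debug logging is verbose (db.load being your command file above):

pgloader --verbose --debug db.load > pgloader-debug.log 2>&1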

stevetoldme commented 8 years ago

I did try --debug mode; it ran out of memory almost immediately. I'll work on the networking. It doesn't always give me that error; sometimes it just stops while creating an index and hangs there. Thank you for getting back to me!

dimitri commented 8 years ago

Do you have more data to share to help fix this issue?

About the memory error: you might want to try the CCL build that is now available in Docker; it's easy to reproduce the Dockerfile steps in a local build too:

docker pull dimitri/pgloader:ccl.latest

See Dockerfile.ccl too.
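A local CCL build is roughly the following, assuming ccl is installed and on your PATH (a sketch; see the Dockerfile for the exact steps):

make CL=ccl pgloader            # the Makefile's CL variable selects the Lisp compiler
./build/bin/pgloader --version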

dimitri commented 8 years ago

Closing for lack of information. Please re-open if you still need this fixed and have more information (e.g. a reproducible test case).

damirm commented 7 years ago

Hi there. I have the same issue. About the Postgres server: it is IPv6-only, and I can connect to it via psql. Also, the "production" databases enforce SSL certificate checking, but these errors reproduce without SSL checks.

docker run --rm -it --network=host dimitri/pgloader pgloader mysql://$MYSQL_USER:$MYSQL_PASS@$MYSQL_HOST:$MYSQL_PORT/$MYSQL_DB postgresql://$PG_USER:$PG_PASS@$PG_HOST:$PG_PORT/$PG_DB
WARNING:
Couldn't re-execute SBCL with proper personality flags (/proc isn't mounted? setuid?)
Trying to continue anyway.
2017-06-05T07:36:42.059000Z LOG Main logs in '/tmp/pgloader/pgloader.log'
2017-06-05T07:36:42.064000Z LOG Data errors in '/tmp/pgloader/'
An unhandled error condition has been signalled:
   Failed to connect to pgsql at "$PG_HOST" (port $PG_PORT) as user "$PG_USER": Ignoring the error which happened while trying to close PostgreSQL socket: Couldn't write to #<SB-SYS:FD-STREAM
                     for "socket 0.0.0.0:0"
                     {10098168E3}>:
      Broken pipe


Then I tried running pgloader:ccl.latest:

docker run --rm -it --network=host dimitri/pgloader:ccl.latest pgloader mysql://$MYSQL_USER:$MYSQL_PASS@$MYSQL_HOST:$MYSQL_PORT/$MYSQL_DB postgresql://$PG_USER:$PG_PASS@$PG_HOST:$PG_PORT/$PG_DB
2017-06-05T08:03:23.003853Z LOG Main logs in '/tmp/pgloader/pgloader.log'
2017-06-05T08:03:23.004166Z LOG Data errors in '/tmp/pgloader/'
2017-06-05T08:03:23.206865Z FATAL An unhandled error condition has been signalled: Failed to connect to pgsql at "$PG_HOST" (port $PG_PORT) as user "$PG_USER": There is no applicable method for the generic function:
  #<STANDARD-GENERIC-FUNCTION CCL::SOCKADDR #x3020003956AF>
when called with arguments:
  (NIL)

I've also installed pgloader locally via Homebrew, and got this error:

pgloader mysql://$MYSQL_USER:$MYSQL_PASS@$MYSQL_HOST:$MYSQL_PORT/$MYSQL_DB postgresql://$PG_USER:$PG_PASS@$PG_HOST:$PG_PORT/$PG_DB
2017-06-05T09:05:03.018000+01:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
2017-06-05T09:05:03.021000+01:00 LOG Data errors in '/private/tmp/pgloader/'
2017-06-05T09:05:03.225000+01:00 FATAL An unhandled error condition has been signalled:
   Failed to connect to pgsql at "$PG_HOST" (port $PG_PORT) as user "$PG_USER": Name service error in "gethostbyname": 1 (Unknown host)

If necessary, I can show the trace with the --debug flag. I'm sure these errors happen because of server-specific settings, since everything works with local databases running via docker-compose.

damirm commented 7 years ago

IPv6 doesn't work in "Docker for Mac". This is why the Failed to connect to pgsql at "$PG_HOST" (port $PG_PORT) as user "$PG_USER": Ignoring the error which happened while trying to close PostgreSQL socket: Couldn't write to #<SB-SYS:FD-STREAM errors happen.

The main issue is the error Failed to connect to pgsql at "$PG_HOST" (port $PG_PORT) as user "$PG_USER": Name service error in "gethostbyname": 1 (Unknown host) when pgloader tries to load data into an IPv6-only Postgres server.
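You can see the split at the resolver level: gethostbyname only ever does IPv4 lookups. On Linux, for example (a sketch):

getent ahostsv4 ipv6.google.com   # A lookup: fails for an IPv6-only host, like gethostbyname does
getent ahostsv6 ipv6.google.com   # AAAA lookup: succeeds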

dimitri commented 7 years ago

Thanks for digging. So can we close this issue as being a Docker problem on macOS, then?

damirm commented 7 years ago

I don't think so. The Name service error in "gethostbyname": 1 (Unknown host) problem reproduces locally, without Docker. The issue is just in IP address resolution. You can reproduce it yourself: take any IPv6-only host and try to connect via pgloader:

$ pgloader mysql://test:test@ipv6.google.com/test postgresql://test:test@ipv6.google.com/test
2017-06-06T12:11:19.103000+01:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
2017-06-06T12:11:19.108000+01:00 LOG Data errors in '/private/tmp/pgloader/'
2017-06-06T12:11:19.108000+01:00 FATAL An unhandled error condition has been signalled:
   Failed to connect to pgsql at "ipv6.google.com" (port 5432) as user "test": Name service error in "gethostbyname": 1 (Unknown host)

dimitri commented 7 years ago

Given https://github.com/marijnh/Postmodern/pull/84/files, it looks like the Postmodern driver for PostgreSQL only supports IPv6 with SBCL. So can you try to build from sources with SBCL? It should work with a recent enough version of Postmodern installed...
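Building from sources is roughly this (the Makefile fetches the Lisp dependencies itself):

git clone https://github.com/dimitri/pgloader.git
cd pgloader
make pgloader                   # builds with SBCL by default
./build/bin/pgloader --version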

damirm commented 7 years ago

I use pgloader from Homebrew, and it seems that it's built with SBCL: https://github.com/Homebrew/homebrew-core/blob/master/Formula/pgloader.rb#L15

fahadshery commented 2 years ago

> Do you have more data to share to help fix this issue?
>
> About the memory error: you might want to try the CCL build that is now available in Docker; it's easy to reproduce the Dockerfile steps in a local build too:
>
> docker pull dimitri/pgloader:ccl.latest
>
> See Dockerfile.ccl too.

I had massive issues transferring a huge MySQL database and was running into out-of-heap errors, etc. This resolved the issue, and I was able to transfer the database; it took 2 days!

Thank you for this. Cheers!