stripe-archive / mosql

MongoDB → PostgreSQL streaming replication
MIT License

Can't import BinData fields? #64

Closed spazm closed 10 years ago

spazm commented 10 years ago

I have binary fields in my MongoDB database, stored as BinData(3,"...") and BinData(0,"..."). Is there any way to import these with mosql? I get encoding errors from the COPY FROM STDIN command.

Example record:

> db.entities.findOne();
{
        "_id" : BinData(3,"DHSvnWWKRQ6hydIk/B4RaQ=="),
        "owners" : [
                BinData(3,"Y1nPwQcVSMuiS1IS3IXCyA=="),
                BinData(3,"IUec/mYQSO6ctTpo2ilU2g==")
        ],
        "display_name" : "SportsOnEarth",
        "modified_time" : 1407781851.226008,
        "created_time" : 1406242278.356365,
        "handle" : "sportsonearth",
        "type" : "organization",
        "slug" : "DHSvnWWKRQ6hydIk_B4RaQ"
}
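For reference, a minimal sketch of what the `_id` value above contains once its Base64 payload is decoded. The helper names (`decode_bindata`, `to_hex`, `to_slug`) are hypothetical and not part of mosql or the MongoDB driver; the sketch uses only core Ruby `pack`/`unpack`. The `to_slug` rule (URL-safe Base64 with the padding stripped) is an assumption inferred from this one record.

```ruby
# Hypothetical helpers for inspecting a BinData payload; not part of mosql.

# Decode the Base64 text shown inside BinData(3, "...") into raw bytes.
def decode_bindata(payload)
  payload.unpack1("m")
end

# Hex view of the raw bytes, convenient for comparing against log output.
def to_hex(raw)
  raw.unpack1("H*")
end

# URL-safe Base64 without padding. This is an assumption about how the
# "slug" field relates to "_id", based only on the example record.
def to_slug(raw)
  [raw].pack("m0").tr("+/", "-_").delete("=")
end

raw = decode_bindata("DHSvnWWKRQ6hydIk/B4RaQ==")
puts raw.bytesize   # length of the decoded value in bytes
puts to_hex(raw)
puts to_slug(raw)
```

The decoded value is 16 arbitrary bytes, which fits a UUID stored under binary subtype 3. It is not text, so it has to reach PostgreSQL as binary data rather than as a UTF-8 string.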

collections.yml

db2:
  entities:
    :columns:
    - slug: TEXT
    - id:
      :source: _id
      :type: BYTEA
    :meta:
      :table: entities
      :extra_props: false

Snippet of the error:

[vagrant@air64] 1105% mosql --skip-tail -v
 INFO MoSQL: Creating table 'entities'...
 INFO MoSQL: Mongd DB 'admin' not found in config file. Skipping.
 INFO MoSQL: Importing for Mongo DB db2...
 INFO MoSQL: Importing for db2.entities...
DEBUG MoSQL: Transformed: ["DHSvnWWKRQ6hydIk_B4RaQ", "\ft\257\235e\212E\016\241\311\322$\374\036\021i"]
DEBUG MoSQL: Transformed: ["Fa8CsZ5yTd6GaC4WLajFpg", "\025\257\002\261\236rM\336\206h.\026-\250\305\246"]
DEBUG MoSQL: Transformed: ["IJxB5N_DRF-s6myLffWfcQ", " \234A\344\337\303D_\254\352l\213}\365\237q"]
DEBUG MoSQL: Transformed: ["IUec_mYQSO6ctTpo2ilU2g", "!G\234\376f\020H\356\234\265:h\332)T\332"]

...

DEBUG MoSQL: Transformed: ["9Gc5D8a5SU6BBw47V2jdgw", "\364g9\017\306\271IN\201\a\016;Wh\335\203"]
DEBUG MoSQL: Transformed: ["9ZIeR9H4SPGFbBnG1uYaSw", "\365\222\036G\321\370H\361\205l\031\306\326\346\032K"]
DEBUG MoSQL: Transformed: ["9n-Ljn2CTY21WIuMEpsUFQ", "\366\177\213\216}\202M\215\265X\213\214\022\233\024\025"]
DEBUG MoSQL: Bulk insert error (PG::CharacterNotInRepertoire: ERROR:  invalid byte sequence for encoding "UTF8": 0xaf
CONTEXT:  COPY entities, line 1
), attempting invidual upserts...
ERROR MoSQL: Error processing {"slug"=>"DHSvnWWKRQ6hydIk_B4RaQ", "id"=>"\ft\257\235e\212E\016\241\311\322$\374\036\021i"} for db2.entities.
/usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:161:in `async_exec': PG::CharacterNotInRepertoire: ERROR:  invalid byte sequence for encoding "UTF8": 0xaf (Sequel::DatabaseError)

        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:161:in `execute_query'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/database/logging.rb:33:in `log_yield'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:161:in `execute_query'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:148:in `execute'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:124:in `check_disconnect_errors'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:148:in `execute'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:492:in `_execute'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:316:in `execute'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:513:in `check_database_errors'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:316:in `execute'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/database/connecting.rb:250:in `synchronize'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/connection_pool/threaded.rb:104:in `hold'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/database/connecting.rb:250:in `synchronize'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/adapters/postgres.rb:316:in `execute'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/database/query.rb:50:in `execute_dui'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/dataset/actions.rb:917:in `execute_dui'
        from /usr/lib/ruby/gems/1.8/gems/sequel-4.13.0/lib/sequel/dataset/actions.rb:773:in `update'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/sql.rb:51:in `upsert!'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:61:in `bulk_upsert'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:39:in `unsafe_handle_exceptions'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:60:in `bulk_upsert'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:57:in `each'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:57:in `bulk_upsert'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:162:in `import_collection'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:119:in `initial_import'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:117:in `each'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:117:in `initial_import'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:105:in `each'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:105:in `initial_import'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/streamer.rb:28:in `import'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/cli.rb:162:in `run'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/lib/mosql/cli.rb:16:in `run'
        from /usr/lib/ruby/gems/1.8/gems/mosql-0.3.2/bin/mosql:5
        from /usr/bin/mosql:19:in `load'
        from /usr/bin/mosql:19
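
The failure above can be reproduced outside mosql. The sketch below takes the bytes printed in the ERROR line, finds the first byte that is not valid UTF-8, and then builds PostgreSQL's hex form for `bytea` input (`\x` followed by hex digits). The helper names are hypothetical, and this is not how mosql itself handles the value; it only illustrates one possible escaping, assuming the target column is BYTEA and the server accepts hex input (PostgreSQL 9.0 or later).

```ruby
# Bytes of the failing "id" value, copied from the ERROR line in the log.
RAW_ID = "\ft\257\235e\212E\016\241\311\322$\374\036\021i".b

# Hex of the first byte that breaks UTF-8 validation, or nil if the
# bytes happen to form valid UTF-8. Hypothetical helper.
def first_invalid_utf8_byte(bytes)
  bad = bytes.dup.force_encoding(Encoding::UTF_8)
             .each_char.find { |c| !c.valid_encoding? }
  bad && bad.unpack1("H*")
end

# PostgreSQL bytea hex input format: a backslash, "x", then hex digits.
def bytea_hex(bytes)
  "\\x" + bytes.unpack1("H*")
end

# In COPY text format a backslash is an escape character, so a literal
# backslash in the data has to be doubled.
def copy_text_field(bytes)
  bytea_hex(bytes).gsub("\\") { "\\\\" }
end

puts first_invalid_utf8_byte(RAW_ID)  # => af
puts bytea_hex(RAW_ID)
puts copy_text_field(RAW_ID)
```

The first invalid byte is `0xaf`, the same byte PostgreSQL reports, which suggests the raw bytes were sent as if they were UTF-8 text. For the per-row upsert path, wrapping the value with Sequel's `Sequel.blob` is the usual way to mark a Ruby string as binary; whether mosql 0.3.2 does this for BYTEA columns is not something the log confirms.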