rap2hpoutre / pg-anonymizer

Dump anonymized PostgreSQL database with a NodeJS CLI
https://raph.site
MIT License
223 stars 31 forks source link

strange error after upgrading to 0.7.0 #43

Closed salacr closed 10 months ago

salacr commented 1 year ago

After upgrading to 0.7.0 I encountered an issue:

ERROR: syntax error at or near "List" LINE 1: List: public.message.target_from, public.message.target_to

public.message.target_from, public.message.target_to seams like part of the content of my --configFile which is actually

` public.message.target_from:extension.maskContact public.message.target_to:extension.maskContact

`

after downgrading to 0.6.0 everything works as expected

jackall3n commented 1 year ago

@salacr can you share:

  1. The exact command you're executing
  2. The contents of your extension file
  3. The stack trace of the error you're seeing
salacr commented 1 year ago

Hi sure:

1) npx pg-anonymizer postgres://deploy_stage:secret-passs@stage.example.com/test_qa_reff --extension /anonymization/functions.js --configFile /anonymization/config-messaging -o -

2)

salacr commented 1 year ago

Hi

I will happily provide you a stack trace but I would need some pointers on how to run the command so it will generate one.

Thanks

jackall3n commented 1 year ago

Hi @salacr, sorry I meant to saw output, not stack trace. Just an entire copy of what you input and what was output to the console. If it's easier, can you send a screenshot showing your command being run, and the error you've mentioned.

List: public.message.target_from, public.message.target_to is just some information that's output to the console. It's highlighting what values have been input into the --list flag. In your case the --list flag has been populated by a config file.

ERROR: syntax error at or near "List" suggests that something is trying to "run" the output and throwing on the first error it finds, which in this case is the first word "List". It doesn't explain why downloaded it would fix it.

If you change -o (--output) from -o - to -o output.sql, does it produce a result?

salacr commented 1 year ago

Ok I have new informations: the version 0.6.0 has output as this one:

Launching pg_dump
Command pg_dump started, running anonymization.
Output file: test.sql
Anonymizing table public.administrator
No columns to anonymize
Anonymizing table public.attachment
No columns to anonymize
Anonymizing table public.channel
No columns to anonymize
Anonymizing table public.datalist
No columns to anonymize
Anonymizing table public.datalist_attributes
No columns to anonymize
Anonymizing table public.datalist_request
No columns to anonymize
Anonymizing table public.intechon_sql_tests
No columns to anonymize
Anonymizing table public.message
Columns to anonymize: target_from, target_to, target_from_name, target_to_name
Anonymizing table public.message_attachment
No columns to anonymize
Anonymizing table public.thread
No columns to anonymize
Anonymizing table public.trash
No columns to anonymize

output of 0.7.0 is as this one:

List: public.message.target_from, public.message.target_to, public.message.target_from_name, public.message.target_to_name
Output file: test.sql

Launching pg_dump...
Command pg_dump started, running anonymization.

public.administrator: id, name, uuid, phone, email, active, last_sms_time, last_email_time, sms_time_interval, email_time_interval
Skipping... no matching columns

public.attachment: id, path, name, attributes
Skipping... no matching columns

public.channel: id, name, type, from_address, status, adapter, credential, send_options, strip_tags, primary, career_id, reserve_channel
Skipping... no matching columns

public.datalist: id, name, sql, order_data, where_data, order_default, columns_data
Skipping... no matching columns

public.datalist_attributes: id, datalist_id, result_attributes, scope
Skipping... no matching columns

public.datalist_request: id, datalist_id, datalist_name, request_parameters, parameters, result_attributes, request_user, scope, done, created_at, processing_at
Skipping... no matching columns

public.intechon_sql_tests: id, test_name, test_description, assert_strategy, assert_key, assert_value, compare_operator, query, crontab_expression, notification_recipients, enabled, notification_message_template_token
Skipping... no matching columns

public.message: id, channel_id, thread_id, parent, [target_from], [target_to], priority, subject, content, user_creator_id, creation_time, send_time, read_time, status, [target_from_name], [target_to_name], attachment_type, service_response, target_to_cc, target_to_bcc, type, metadata, counter
Anonymizing 4 columns...

public.message_attachment: id, message_id, attachment_id
Skipping... no matching columns

public.thread: id, owner, entity, entity_id, closed_at, created_at
Skipping... no matching columns

public.trash: id, target_to, host, port, security, auth_mode, username, password, encryption, description, created_at, updated_at, enabled
Skipping... no matching columns

I guess that what is a problem is this lines:

List: public.message.target_from, public.message.target_to, public.message.target_from_name, public.message.target_to_name Output file: test.sql

which are in my case "piped" to psql.

As I'm running the command as this:

npx pg-anonymizer postgres://deploy:****@db2*****/messaging_qa_reff --extension /anonymization/functions.js --configFile /anonymization/config-messaging -o - | PGPASSWORD=***** psql -v ON_ERROR_STOP=1 -h db2****** -U deploy messaging_qa_anonymized

the "List" part is piped to psql and it fails.. I think that it is a bug as even in documentation it's mentioned that

Output can also be stdout ('-') so you can pipe the output to zip, gz, or to psql:

Which is no longer true

jackall3n commented 1 year ago

@salacr thanks for the update, yes you're right, it no longer pipes. Although, I'm surprised the first line Launching pg_dump from 0.6.0 worked before.

I think ideally if -o - is set that it should silence all the logging, or perhaps a --silent flag should be added.

As a quick solution before that's added, could you do something like

npx pg-anonymizer [args] -o output.sql && psql [args] < output.sql

# with your args
npx pg-anonymizer postgres://deploy:****@db2*****/messaging_qa_reff --extension /anonymization/functions.js --configFile /anonymization/config-messaging -o output.sql && PGPASSWORD=***** psql -v ON_ERROR_STOP=1 -h db2****** -U deploy messaging_qa_anonymized < output.sql

Or with cat

npx pg-anonymizer [args] -o output.sql && cat output.sql | psql [args]

# with your args
npx pg-anonymizer postgres://deploy:****@db2*****/messaging_qa_reff --extension /anonymization/functions.js --configFile /anonymization/config-messaging -o output.sql && cat output.sql | PGPASSWORD=***** psql -v ON_ERROR_STOP=1 -h db2****** -U deploy messaging_qa_anonymized
jackall3n commented 1 year ago

I've added two new bit of functionality in the new version (#42) to help cover this.

salacr commented 1 year ago

Sounds great, maybe just one thing I guess that instead of a simple comment -- it might be useful to prefix it with let's say pg-anonymizer ? so it would be something like -- pg-anonymizer: List whatever...

so it's easily distinguished from standard pg_dump comments, but it's only nitpicking..

mike-ovo commented 1 year ago

When this change will be published to npmjs?

jackall3n commented 1 year ago

@mike-ovo the PR is still waiting response from the owner of this repository. I'll try one more time to ask them to deploy, and if they don't I'll release it myself.

rap2hpoutre commented 10 months ago

Fixed via https://github.com/rap2hpoutre/pg-anonymizer/pull/42 thanks to @jackall3n