Open ScotterMonk opened 4 years ago
The script uses pg_dump
to get the table's sql. I'm not sure if that command is available for on Windows, you might have to run it on WSL instead.
Thanks! May I ask what WSL is?
Ah a google search shows pg_dump works on windows. Maybe it needs different params or something. I'll try to figure it out.
Whoops, didn't see your edit questions. I'll respond to the ones in the history too:
How does this script work if it doesn't log into PostgreSQL with credentials?
It is designed to run on the same host as the database, so it doesn't really need to authenticate by default. I could work on making it a bit more flexible, but adding a .pgpass
file might be enough to get it to work automatically
Wondering maybe your code's purpose is to write SQL I can execute in PostgreSQL?
Yes, it takes the pg_dump
command and adds some commands to reorganize the table. I did it that way because it should be easier to verify and test, you can look at the commands yourself to see what is going to be run.
Could my issue be that I'm using Python 3.8 with pip updated all libraries your code imports? Should I use a specific version of Python?
Python version shouldn't matter as long as it's >3.6. You might need to install the psycopg2-binary
module instead, I've had more luck installing that module.
Here's some info on WSL, it's a really great tool: https://docs.microsoft.com/en-us/windows/wsl/about
Do you have postgres installed on windows? pg_dump
normally comes bundled with it, but I haven't used the subprocess
module on windows so I'm not 100% sure of its compatibilities.
Thank you for your quick and thoughtful answers! Yes, I am running both Python and my Postgres server on the same Windows 2019 server. psycopg2 is working because I already use it for reads/writes to that same database (inventory) from a Python app I'm building. Oh and I filled in connection string so your app has full admin permissions.
I didn't share with you the run string I was using because it shows up in the error message. But if you need to see anything, say the word and I'll share here. Muchly appreciate your hard work!!!
Oh and please don't bother to look at my edit questions because I edit every time I figure something out or eliminate a potential issue. Here's the run string I have been pasting into cmd prompt in the Python 38 folder: reorder.py -d inventory -n public.tbl_items_to_items id id_session id_child t_part_no t_part_no_parent d_modified b_enabled -f c:\python\python38\logColReorder.txt -m Do you think the file not found error may be coming from trying to write that log file? I created an empty one in the appropriate folder, just in case.
Hope I don't side track you by asking what you think of this: https://stackoverflow.com/questions/60549576/set-environment-variable-pgpassword-before-executing-a-command-pg-dump-on-wi
You can fill in the connection settings, they're on lines 16 and 108. Setting the PGPASSWORD
environment variable is a reasonable way to set the password, I use that method pretty often with projects. You shouldn't need to share it.
The error looks like it's just failing to find the command. What does the console output when you enter where pg_dump
?
Sorry for delay. Was eating.
C:\Python\Python38>where pg_dump INFO: Could not find files for the given pattern(s).
I just ran this:
OK that looks like I need to (and I just did it):
pip install pgdumplib <-- done
Added import pgdumplib to top of your code.
Re-ran and now this error:
C:\Python\Python38>reorder.py -d inventory -n public.tbl_items_to_items id id_session id_child t_part_no t_part_no_parent d_modified b_enabled -f c:\python\python38\logColReorder.txt -m
Traceback (most recent call last):
File "C:\Python\Python38\reorder.py", line 219, in
What are your thoughts on this: https://pypi.org/project/pgdumplib/
Oh and where in your code should I put this (and of course putting my "postgres" user's pw in there): SET PGPASSWORD=secret_pw&&pw_dump AND is that the exact way to write it? AND, heh, is that all that is needed in terms of setting that variable, as in no reference to PGPASSWORD later?
Hey if you want, when I get this finished, happy to send to you so you have a "windows version"!
Oh and I had filled in the database creds at 108 but on 16 I'm unsure how to do so. Seemed obvious at 108 but not on 16.
def get_dump_sql(database: str, schema: str, table: str) -> str:
"""Get SQL that would be returned with `pg_dump`"""
result = subprocess.run(
["pg_dump", "--schema-only", f"--table={schema}.{table}", database],
capture_output=True,
check=True,
)
return result.stdout.decode()
with psycopg2.connect(host="184.175.105.38", database=database, user="postgres", password="mysecret") as conn:
Since it is local maybe I should leave out "host=...."?
Hmm I looked again at pgdumplib and now I"m thinking maybe this is a different, not as well supported pg_dump? Did you look at that page?
pg_dump
is a utility that is normally included with postgres: https://www.postgresql.org/docs/current/app-pgdump.html
It looks like they might already be included, they might be in the bin
folder for your postgresql installation. In that case you'd have to either add the path to your environment, create a symlink, or modify the script to include the executable's path.
Thanks. Looking. I was just reading https://www.postgresql.org/docs/9.1/app-pgdump.html. Will look at the URL you just shared.
Dangit. Both pages on pg_dump give command line examples but not Python examples. Will keep looking.
As to the environment variable, unless you use some tool to set it you'll have to set every time you open a new command prompt.
The subprocess
command is running the command line, the first argument for subprocess.run
is just the command broken into an array (["pg_dump", "--schema-only", f"--table={schema}.{table}", database]
= pg_dump --schema-only --table=schema.table database
)
Yep! pg_dump.exe is in my bin folder! Also, found examples where I need to:
import sh
from sh import pg_dump
from sh import pg_restore
Ah that's helpful. I'll look at it with new eyes now :-)
I didn't know about the sh
module so I didn't use it. At some point I might have to update this to use the feature, it looks fantastic
Sadly now I'm getting errors that say
C:\Python\Python38>reorder.py -d inventory -n tbl_items_to_items id id_session id_child t_part_no t_part_no_parent d_modified b_enabled -f c:\python\python38\logColReorder.txt -m
Traceback (most recent call last):
File "C:\Python\Python38\reorder.py", line 13, in <module>
import array, fcntl, os
ModuleNotFoundError: No module named 'fcntl'
Appears that fcntl is a Unix thing, even though sh is in the windows version and imported fine with pip. I'm starting to wonder if I could ever get this working.
Here's what a guy on stack overflow said:
The fcntl module is just used for locking the pinning file, so assuming you don't try multiple access, this can be an acceptable workaround. Place this module in your sys.path, and it should just work as the official fcntl module.
Place this module in your sys.path, and it should just work as the official fcntl module: Scott: I'm willing to try this but wondering what I name the py file ("fcntl.py"?).
def fcntl(fd, op, arg=0):
return 0
def ioctl(fd, op, arg=0, mutable_flag=True):
if mutable_flag:
return 0
else:
return ""
def flock(fd, op):
return
def lockf(fd, operation, length=0, start=0, whence=0):
return
I think I'm going down the wrong path (rabbit hole) because now it's finding other stuff it wants, like "tty".
I'll look for other solutions to the "change column order" thing, but focused on windows. I appreciate you offering so much of your time to help me today! I'll subscribe to you, for if/when you decide to build a windows-friendly version! THANKS AGAIN!
Just thought of this: I'm wondering if - as you developed this python script - you first (or during) wrote the SQL you wanted it to generate? What I'm getting at is if you could share that raw SQL with me and I bet I can figure out where to change out or add table name and column names?
This looks do-able! Especially given the GUI I use for db management is dBeaver, which will give me SQL for re-creating any of my tables, including foreign keys. A bit more work than your automated tool but at least this will be simple for someone like me, heh!
-- copy data to temporary table
CREATE TEMPORARY TABLE _temp_users (
id INTEGER PRIMARY KEY,
user TEXT NOT NULL,
pass TEXT NOT NULL
);
INSERT INTO _temp_users SELECT
id,
user,
pass
FROM users;
DROP TABLE users;
-- create new table and copy the data back (make changes here)
CREATE TABLE users (
id INTEGER PRIMARY KEY,
user TEXT NOT NULL,
pass TEXT NOT NULL,
beans INTEGER NOT NULL DEFAULT 0 -- just added this
);
INSERT INTO users SELECT
id,
user,
pass,
0 -- added here too
FROM _temp_users;
DROP TABLE _temp_users;
Looks like the output of pg_dump, if I remember correctly DBeaver uses it behind the scenes. I could add a flag to take an input file instead of reading from the command line.
Ah. Yeah in DBeaver it's called "generate SQL" and you can do it with any table, tables, or whole database. I would imagine a common feature of most modern UIs? I don't know because I haven't played with others except for pgAdmin and ... don't tell... Microsoft Access, heh. I never used one for SQL Server.
I added a branch that has the option to read an sql file (either from the output of pg_dump
or the generate SQL
tool in DBeaver). It doesn't have any way to find foreign keys that reference the table, so it doesn't manage those, but otherwise it worked fine in my testing. I haven't merged it yet.
Edit: Oh neat, it looks like you can do this work in dbeaver (https://github.com/dbeaver/dbeaver/issues/9720)! I'm doing it on a larger scale automatically, but for one-off tables it might not be necessary.
First: THANK YOU for making this! It will save me a HUGE amount of time!
Question: Could my issue be that I'm using Python 3.8 with pip updated all libraries your code imports? Should I use a specific version of Python?
ERROR REPORT: C:\Python\Python38>reorder.py -d inventory -n public.tbl_items_to_items id id_session id_child t_part_no t_part_no_parent d_modified b_enabled -f c:\python\python38\logColReorder.txt -m Traceback (most recent call last): File "C:\Python\Python38\reorder.py", line 218, in
main()
File "C:\Python\Python38\lib\site-packages\click\core.py", line 829, in call
return self.main(args, kwargs)
File "C:\Python\Python38\lib\site-packages\click\core.py", line 782, in main
rv = self.invoke(ctx)
File "C:\Python\Python38\lib\site-packages\click\core.py", line 1066, in invoke
return ctx.invoke(self.callback, ctx.params)
File "C:\Python\Python38\lib\site-packages\click\core.py", line 610, in invoke
return callback(args, *kwargs)
File "C:\Python\Python38\reorder.py", line 198, in main
cols, extras = get_columns(database, schema, table)
File "C:\Python\Python38\reorder.py", line 25, in get_columns
sql_text = get_dump_sql(database, schema, table)
File "C:\Python\Python38\reorder.py", line 15, in get_dump_sql
result = subprocess.run(
File "C:\Python\Python38\lib\subprocess.py", line 489, in run
with Popen(popenargs, **kwargs) as process:
File "C:\Python\Python38\lib\subprocess.py", line 854, in init
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Python\Python38\lib\subprocess.py", line 1307, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified