Closed chrisjdixon closed 1 year ago
Hi @chrisjdixon ! We use alembic
for creating new columns or changing data types (this will not happen here because of the default value adapt_dtype_of_empty_db_columns=False
).
I'll see if I can reproduce the issue. Most likely having pangres.upsert
add a column to an existing SQL table will do the trick.
As for the solution I can already imagine it will be something like setting the logging level for alembic higher than INFO like in this example.
I need to research on that. I'll keep you updated.
And yes you are right you should not mess with installed code. Here that won't be necessary but in a similar situation where you'd need to tweak a library you could use a fork for instance.
@chrisjdixon I could not reproduce the problem with my current installation (see code I used in "My test" at the bottom of this reply). Indeed, I only got one log output and that was from pangres
.
I think the "problem" really comes from alembic. Do you have a alembic.ini
file? If so you could change the alembic logging to only display levels above INFO
(if that is an acceptable solution for you). See logging levels.
For this you set the level to WARN
in section [logger_alembic]
in the file alembic.ini
. See also documentation on the ini file and stackoverflow subject I posted previously.
Before
[logger_alembic]
level = INFO
After
[logger_alembic]
level = WARN
Alternatively you could avoid altering the ini file and change this after loading the file in Python:
import alembic.config
import alembic.command
alembic_cfg = alembic.config.Config('alembic.ini')
alembic_cfg.set_section_option("logger_alembic", "level", "WARN")
alembic.command.upgrade(alembic_cfg, 'head')
I don't have any experience with alembic apart from what I did with pangres so it's a bit difficult for me to help you :frowning_face:.
Hi @ThibTrip ! When including the logging settings my project uses in your example the issue can be replicated:
from pangres import upsert, DocsExampleTable
from sqlalchemy import create_engine
import logging
logging.basicConfig(filename='alembic_logging.log', level=logging.INFO, format = '%(asctime)s - %(message)s', datefmt="%Y-%m-%d %H:%M:%S")
# config
table_name = 'example'
connection_string = "sqlite:///:memory:"
engine = create_engine(connection_string)
# get a first df we will use to create the SQL table
# and a second df we will use to test the addition of new columns
df = DocsExampleTable.df
df2 = DocsExampleTable.new_df.assign(test=True)
# upsert
upsert(con=engine, df=df, table_name=table_name, if_row_exists='update')
upsert(con=engine, df=df2, table_name=table_name, if_row_exists='update', create_table=False, add_new_columns=True)
Are these logging settings different to what you'd expect to be used when using Pangres? Perhaps I'm doing something poorly? I'm guessing the issue is likely caused by me doing something silly 😅.
I'm embarrassed to say I'm not a very good programmer and I don't understand what you mean by "avoid altering the ini file and change this after loading the file in Python". Where would I put that code? When trying to include it in the above script I'm thrown the error "alembic.util.exc.CommandError: No config file 'alembic.ini' found, or file has no '[alembic]' section".
Thank you very much for the help!
@chrisjdixon just saw your reply. I tried your code and unfortunately it issues no logging for alembic
:neutral_face: for me. I only get the same line of log as in my previous test and it is from pangres
.
And no the logging config here is completely normal. Sorry but I think you should address that to alembic
. I think you could use their google mailing list ("If you get stuck, contact us on the alembic-discussion mailing list").
The error alembic.util.exc.CommandError: No config file 'alembic.ini' found, or file has no '[alembic]' section
is to be expected if you do not have a alembic.ini
file (if you actually do have one check if it is properly formatted).
In this case I don't see why you'd have a different result than me :thinking: .
That is very strange... surely there's some difference! I've ran the above code verbatim to check and it still creates the log. Unsure if relevant but I'm running VS Code as admin, WSL, latest packages, venv 3.8.10. The code seems to be run by the init
of the class MigrationContext
, but I don't understand classes and wouldn't know where to go from here...
These logs aren't causing me any big problems - it's just an annoyance causing clutter. Would love to know if you have any more ideas!
Sorry @chrisjdixon I forgot to get back to you! I was busy with work and then on holidays.
I tried venv
and was able to reproduce the problem. While the log entries you described do not appear on the console they do appear in the log file.
I put the code you posted on June 6th in a file ./test_pangres/test.py
then ran these commands with python 3.8.13
:
$ cd test_pangres
$ python -m venv .
$ source ./bin/activate
(test_pangres) $ python test.py
# console
2022-07-19 14:45:49,793 | INFO | pangres | logger:log:73 - Added column example.test (type: BOOLEAN) in table example (schema="None")
# alembic_logging.log
2022-07-19 14:45:36 - Context impl SQLiteImpl.
2022-07-19 14:45:36 - Will assume non-transactional DDL.
2022-07-19 14:45:36 - Added column example.test (type: BOOLEAN) in table example (schema="None")
I was able to remove the two first lines in alembic_logging.log
by changing the logging level in the basic configuration to only WARNING
. I suppose that alembic detects that there is a logging configuration and hooks itself to it.
logging.basicConfig(filename='alembic_logging.log', level=logging.INFO, format = '%(asctime)s - %(message)s', datefmt="%Y-%m-%d %H:%M:%S")
logging.basicConfig(filename='alembic_logging.log', level=logging.WARNING, format = '%(asctime)s - %(message)s', datefmt="%Y-%m-%d %H:%M:%S")
@chrisjdixon I am doing some maintenance with my library and I think I can consider this closed with my last comment. Feel free to reopen this issue if there is anything else.
Our log files have been flooded with undesired log entries that occur with each Pangres upsert that otherwise went perfectly:
Was working fine for months but suddenly they're everywhere. Think it might be caused by updating packages but not sure.
Had a search around and couldn't find anything, but eventually found the culprit in /venv/lib64/python3.8/site-packages/alembic/runtime/migration.py on lines 201 onwards:
Our pangres code:
I'm not a good programmer and don't want to risk bigger problems by messing with packages' code... is there anything we can do to disable this?
Also, we're unsure if it's most appropriate to ask about this here or over at Alembic. As it's a Pangres function that ultimately causes it we thought Pangres most responsible but feel free to redirect us elsewhere if appropriate.