Error when setting up new database in python 3

AntoniaR commented 1 year ago

When setting up a new database using tkp branch converted_to_python3, the database parameters are not being parsed correctly by SQL Alchemy. (n.b. I have removed my password and the host url from the error below)

Traceback (most recent call last):
  File "/home/antoniar/Python3.6Env/bin/trap-manage.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/home/antoniar/TraP_py3/tkp/tkp/bin/trap-manage.py", line 10, in <module>
    tkp.management.main()
  File "/home/antoniar/TraP_py3/tkp/tkp/management.py", line 378, in main
    args.func(args)
  File "/home/antoniar/TraP_py3/tkp/tkp/management.py", line 248, in init_db
    populate(dbconfig)
  File "/home/antoniar/TraP_py3/tkp/tkp/db/sql/populate.py", line 136, in populate
    get_database_config(dbconfig, apply=True)
  File "/home/antoniar/TraP_py3/tkp/tkp/config/__init__.py", line 116, in get_database_config
    tkp.db.Database(**combined)
  File "/home/antoniar/TraP_py3/tkp/tkp/db/database.py", line 95, in __init__
    poolclass=NullPool,
  File "<string>", line 2, in create_engine
  File "/home/antoniar/Python3.6Env/lib/python3.6/site-packages/sqlalchemy/util/deprecations.py", line 309, in warned
    return fn(*args, **kwargs)
  File "/home/antoniar/Python3.6Env/lib/python3.6/site-packages/sqlalchemy/engine/create.py", line 518, in create_engine
    u = _url.make_url(url)
  File "/home/antoniar/Python3.6Env/lib/python3.6/site-packages/sqlalchemy/engine/url.py", line 732, in make_url
    return _parse_url(name_or_url)
  File "/home/antoniar/Python3.6Env/lib/python3.6/site-packages/sqlalchemy/engine/url.py", line 794, in _parse_url
    "Could not parse SQLAlchemy URL from string '%s'" % name
sqlalchemy.exc.ArgumentError: Could not parse SQLAlchemy URL from string ''postgresql'           ;://'antoniar'                      ; for example: 'antoniar':'password'                   ; for example: 'antoniar'@'path.to.host'                      ; for example: 'localhost':5432/'ARpy3test'                   ; for example: 'antoniar''

HannoSpreeuw commented 1 year ago

Thanks. Probably something wrong in the environment, which I also encountered a lot. Before running all unit tests, I ran this bash script - terminal_startup.sh - from within my conda Python 3.6 environment (conda activate py36 in my case):

# Next line probably not needed for the first launch of this script.
rm -rf tkp_data_cluster
killall postgres
killall postgres
killall postgres
initdb -D tkp_data_cluster
pg_ctl -D tkp_data_cluster -l logfile start
createdb tkp_test_db
export TKP_DBNAME=tkp_test_db
trap-manage.py initdb -y

Now the TKP_DBNAME environment variable will be gone if this script is run in the regular way, so please run it as

. ./terminal_startup.sh

I.e. with an extra dot and space. Next, try, from within the tests directory:

python runtests.py -v

Do all tests pass?

AntoniaR commented 1 year ago

Ok, this error needs fixing so that we can install databases for TraP on our systems as normal. In Amsterdam, we use a separate machine to host the TraP databases which interacts directly with our website. Until we have a replacement for the website (longer term goal), we want this system to keep running.

I'm not sure how I can implement this solution on our systems as we do not use the same computer to run the databases. Plus I think having a shell script adds another layer of complexity in running TraP.

I'm also concerned about the killall postgres as I think this would stop all databases on our system. As my students and some collaborators are using postgres databases on the system (through the website), we do not want to be killing these databases.

For these reasons, I strongly recommend dropping the terminal_startup.sh script.

HannoSpreeuw commented 1 year ago

Okay, thanks.

But do all tests pass?

AntoniaR commented 1 year ago

As I cannot set up a database, I cannot run the tests unfortunately.

HannoSpreeuw commented 1 year ago

So you cannot run python runtests.py -v without errors?

AntoniaR commented 1 year ago

That is correct, TraP needs a database to run the tests successfully

HannoSpreeuw commented 1 year ago

But createdb tkp_test_db as part of the shell script should do that.

HannoSpreeuw commented 1 year ago

So my question is really, when you run the shell script and then python runtests.py -v, do all tests pass without error?

AntoniaR commented 1 year ago

Sorry, I thought I had explained above. This shell script is not appropriate for how we run TraP on the Amsterdam systems as we use a separate system to run the database.

HannoSpreeuw commented 1 year ago

I fully understand, but I am trying to establish common ground to pin down the error. So I am trying to find out at what point things start breaking for you.

AntoniaR commented 1 year ago

The computer we run TraP on has some issues when trying to install and run new software. It will likely take a lot of time for me to try to install postgres on that machine and it's not guaranteed that it will run ok. Without postgres on that machine, I think that it is not possible for me to run the shell script as I believe it assumes the database is running on the same machine.

The error is occurring when SQL Alchemy is interpreting the information in the file pipeline.cfg. So I start up a database as normal on our separate system and it fails when I try to connect to it with this key error: sqlalchemy.exc.ArgumentError: Could not parse SQLAlchemy URL from string ''postgresql' ;://'antoniar' ; for example: 'antoniar':'password' ; for example: 'antoniar'@'path.to.host' ; for example: 'localhost':5432/'ARpy3test' ; for example: 'antoniar'' I will attempt to dig further into this error later today.

AntoniaR commented 1 year ago

Ok, I have solved this issue. It was indeed a parsing error when converting from Python 2 to Python 3. For some reason, the ; in the file pipeline.cfg (see below) was causing the strings of the various fields to be garbled with the comment for each parameter. This can be solved by replacing ; with #. I will update the example pipeline.cfg file in the branch converted_to_python3 to reflect this.

engine = 'postgresql'           ;
database = 'ARpy3test'                   ; for example: 'antoniar'
user =  'user'                      ; for example: 'antoniar'
password = 'password'                   ; for example: 'antoniar'
host =  'localhost'                      ; for example: 'localhost'

Should be:

engine = 'postgresql'           #
database = 'ARpy3test'                   # for example: 'antoniar'
user =  'user'                      # for example: 'antoniar'
password = 'password'                   # for example: 'antoniar'
host =  'localhost'                      # for example: 'localhost'

AntoniaR commented 1 year ago

Solved with this code: https://github.com/transientskp/tkp/blob/aa155dc985d88117c7f52c7d7fadc75822a0050d/tkp/config/project_template/pipeline.cfg

HannoSpreeuw commented 1 year ago

Ah, that sounds familiar, I have encountered this a couple of times in fixing the unit tests, e.g. here.

The Python 3 documentation on config files states that anything after a semicolon will be interpreted as a comment, but for reasons I could not determine, this does not happen.

Thanks!

AntoniaR commented 1 year ago

Ah! Yes, I've just encountered the same issue with job_params.cfg. It's annoying that it no longer works, but the fix is good enough.

jdswinbank commented 1 year ago

Hi both —

For the record, the change here occurred in Python 3.2. For details, go to https://docs.python.org/3.10/library/configparser.html#customizing-parser-behaviour and search for inline_comment_prefixes.

You'll note that:

Changed in version 3.2: In previous versions of configparser behaviour matched comment_prefixes=('#',';') and inline_comment_prefixes=(';',).

So the surprising thing here is not that ; stopped working, but that # works at all — it has never been recognized as an inline_comment_prefix by the config parser.

What's actually happening here is some magic in the guts of the TraP that you probably don't want to think about too much (the critical line is here). I think it is surprising and likely confusing in the future to rely on a side-effect of some other code to handle comments in the config file. I'd suggest one of three different options:

If you want to stick with # comments, add the a inline_comment_prefixes=('#',) to your ConfigParser init.
If you prefer ;, use inline_comment_prefixes=(';',) instead;
(Probably best) is not to use inline comments at all, but simply put the comment on the line before the config file entry. This will be least confusing and most robust for the future!

AntoniaR commented 1 year ago

Ah, ok. I will try out the comment on the line above as you recommend.

transientskp / tkp

Error when setting up new database in python 3 #598