MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

2.2.3 #580

Closed decajcd closed 1 year ago

decajcd commented 1 year ago

INFO Finished exporting TextGrids to mfa/output!
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1116, in _rollback_impl
    self.engine.dialect.do_rollback(self.connection)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 657, in do_rollback
    dbapi_connection.rollback()
psycopg2.OperationalError: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/xxx/anaconda2/envs/aligner/bin/mfa", line 11, in <module>
    sys.exit(mfa_cli())
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/command_line/align.py", line 158, in align_corpus_cli
    aligner.cleanup()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/montreal_forced_aligner/abc.py", line 620, in cleanup
    sqlalchemy.orm.session.close_all_sessions()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 4975, in close_all_sessions
    sess.close()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2378, in close
    self._close_impl(invalidate=False)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2420, in _close_impl
    transaction.close(invalidate)
  File "<string>", line 2, in close
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
    ret_value = fn(self, *arg, **kw)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1327, in close
    transaction.close()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2553, in close
    self._do_close()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2691, in _do_close
    self._close_impl()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2677, in _close_impl
    self._connection_rollback_impl()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2669, in _connection_rollback_impl
    self.connection._rollback_impl()
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1118, in _rollback_impl
    self._handle_dbapi_exception(e, None, None, None, None)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2325, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1116, in _rollback_impl
    self.engine.dialect.do_rollback(self.connection)
  File "/home/xxx/anaconda2/envs/aligner/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 657, in do_rollback
    dbapi_connection.rollback()
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

(Background on this error at: https://sqlalche.me/e/20/e3q8)

jadestorm commented 1 year ago

Hi there! I've actually been debugging over the past week or so and wanted to share what I've found in case it helps.

What I am observing is the following:

As an aside, running by default on port 5433 means that multiple users on the same machine will conflict -- I'm pretty sure I saw a setting to change one's port. I'd suggest ditching the port entirely and instead setting up a Unix socket inside the user's home directory so that you avoid potential conflicts.
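To make the socket idea concrete, here is a minimal sketch (not MFA's actual code) of how a client could point at a PostgreSQL Unix socket directory under the user's home instead of a shared TCP port. The function names, the `mfa_socket` directory, and the `mfa_db` database name are all hypothetical. The `?host=<directory>` URL form is the one SQLAlchemy documents for psycopg2 Unix-socket connections, and `.s.PGSQL.<port>` is PostgreSQL's naming convention for the socket file.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def socket_file(socket_dir: str, port: int = 5432) -> PurePosixPath:
    """Path of the socket file PostgreSQL creates inside socket_dir."""
    # PostgreSQL names the socket after the configured port number.
    return PurePosixPath(socket_dir) / f".s.PGSQL.{port}"


def socket_db_url(socket_dir: str, database: str) -> str:
    """Build a SQLAlchemy-style URL that connects through a socket directory.

    Assumption: the server was started with its unix_socket_directories
    setting pointing at socket_dir (a hypothetical per-user location).
    """
    # An empty network host plus a "host" query parameter holding a
    # directory path asks libpq to use the Unix socket in that directory.
    host = quote(socket_dir, safe="/")
    return f"postgresql+psycopg2:///{database}?host={host}"


if __name__ == "__main__":
    print(socket_db_url("/home/alice/mfa_socket", "mfa_db"))
    print(socket_file("/home/alice/mfa_socket"))
```

Since each user's socket file sits in their own directory, two users on the same machine would no longer collide on one port number.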

Also note I am approaching this from an IT systems perspective; I am only very lightly familiar with what MFA is supposed to "do". =)

If you need any additional information from me please let me know -- happy to provide it.

mmcauliffe commented 1 year ago

Thanks for this, it seems like using sockets should be a better way to go. I've been stuck trying to debug some crashes on GitHub CI with unhelpful logs, so this might help with that.

I'm also thinking of moving the server stop/start out into a separate command rather than wrapping it around the core commands. That would avoid the situation where the server doesn't shut down for whatever reason, and make it a bit clearer what's going wrong.

jadestorm commented 1 year ago

That sounds like a good plan to me fwiw. =) I did come across this but I don't know if it's useful at all: https://pypi.org/project/postgresqlite/

It's.. obviously not REALLY SQLite-like as it's still running a real PostgreSQL server, but it might be a handy way to accomplish what you are after. That said it doesn't look like it's been updated in a while and is labeled as beta.

lematt1991 commented 1 year ago

Is there a workaround for this? I'm getting this error every time I run mfa align with multiprocessing (it seems to go away with --no_use_mp).

mmcauliffe commented 1 year ago

Can you try upgrading to 2.2.4 released last night (or verify that you're running it via mfa version)? I changed a bunch of connection related things, incorporating @jadestorm's suggestions. See also the new docs here: https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/server/index.html for server management commands. I kept the default to do auto start/stop of servers just to ease the new user experience, but you can turn that off via mfa configure --disable_auto_server.

lematt1991 commented 1 year ago

Upgrading to 2.2.4 fixed it for me. Thanks!

mmcauliffe commented 1 year ago

Cool, @decajcd, I'll close this out, but feel free to reopen if you're still hitting these errors after upgrading.

jadestorm commented 1 year ago

@mmcauliffe FWIW I was unable to get the auto start/stop working -- and it seemed I had to go through a lot of misc steps to clean things up. Again, I am not that familiar with actually using MFA, so I asked someone who is to give the update a whirl. I DID have success kicking off the mfa server init / start and then stopping it by hand later (and disabling auto server). However, I ran into some other oddness involving something with Kaldi. I am also fairly certain I ended up having to nuke my ~/Documents/MFA directory and start from scratch, but I was also "trying a bunch of things", so some of it may have been unnecessary. I'll be back in touch when I know more, but I definitely saw an improvement with regard to the DB server launch being done manually.
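The manual workflow described above (auto server disabled, server started and stopped by hand) could be wrapped in a small script. This is only a sketch, assuming `mfa server start` and `mfa server stop` behave as the server docs linked earlier describe and that `mfa server init` has already been run once. The `manual_mfa_server` name and the injectable `run` parameter are made up for illustration.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(list(cmd), check=True)


@contextmanager
def manual_mfa_server(run: Runner = _run) -> Iterator[None]:
    """Start the MFA database server, and always attempt to stop it afterwards.

    Assumption: auto start/stop has been turned off beforehand with
    `mfa configure --disable_auto_server`.
    """
    run(["mfa", "server", "start"])
    try:
        yield
    finally:
        # Runs even if the wrapped work raises, so the server is not left up.
        run(["mfa", "server", "stop"])
```

Usage would look like `with manual_mfa_server(): ...`, with the alignment command invoked inside the block. The `run` parameter exists only so the command sequence can be checked without a real MFA install.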

jadestorm commented 1 year ago

ok -- yeah we're both getting a similar issue but it doesn't really seem to be postgres related -- so tomorrow I will gather some info and open a new issue =)

robertfromont commented 1 year ago

@mmcauliffe I'm seeing this error with version 2.2.6 with alignment automation in LaBB-CAT - should I be explicitly starting/stopping this server?

And: what's likely to happen if there's more than one alignment happening at once?

lifeiteng commented 1 year ago

postgres stops me using MFA, so many errors.

>mfa server init
There was an error encountered starting the global MFA database server, please see /home/feiteng/Documents/MFA/pg_log_global.txt for more details and/or look at the logged errors above.

>cat /home/feiteng/Documents/MFA/pg_log_global.txt
2023-05-12 13:41:59.184 CST [1007567] FATAL:  could not open lock file "/var/run/postgresql/.s.PGSQL.5432.lock": Permission denied
2023-05-12 13:41:59.184 CST [1007567] LOG:  database system is shut down

>sudo chmod 777 /var/run/postgresql
>mfa server init
2023-05-12 13:43:43.022 CST [1007702] FATAL:  could not open lock file "/var/run/postgresql/.s.PGSQL.5432.lock": Permission denied
2023-05-12 13:43:43.022 CST [1007702] LOG:  database system is shut down

>rm -r /home/feiteng/Documents/MFA/pg_mfa_global/
>sudo rm /var/run/postgresql/.s.PGSQL.5432.lock
>mfa server init
 INFO     Initializing the global MFA database server...
 INFO     Starting the global MFA database server...
waiting for server to start.... done
server started
 INFO     global MFA database server started!

>mfa validate ~/speech/MontrealCorpusTools/Librispeech english_us_arpa english_us_arpa

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/home/feiteng/Documents/MFA/pg_mfa_global_socket/.s.PGSQL.5432"?

>touch /home/feiteng/Documents/MFA/pg_mfa_global_socket/.s.PGSQL.5432
>mfa validate ~/speech/MontrealCorpusTools/Librispeech english_us_arpa english_us_arpa

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not connect to server: Connection refused
    Is the server running locally and accepting
    connections on Unix domain socket "/home/feiteng/Documents/MFA/pg_mfa_global_socket/.s.PGSQL.5432"?

why use postgres?
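One detail in the transcript above: `touch` creates an ordinary empty file, not a listening socket. That would fit the error changing from "No such file or directory" to "Connection refused". The sketch below is a hypothetical diagnostic helper, not part of MFA, that reports what actually sits at the socket path, using only the standard library.

```python
import os
import stat


def describe_socket_path(path: str) -> str:
    """Classify what exists at a PostgreSQL socket path.

    Returns "missing", "socket", or "not-a-socket".
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Nothing there: the server never created its socket at this path.
        return "missing"
    if stat.S_ISSOCK(mode):
        # A real socket file; this alone does not prove a server is listening.
        return "socket"
    # For example, a regular file left behind by `touch`.
    return "not-a-socket"


if __name__ == "__main__":
    # Path taken from the error message in the transcript.
    print(describe_socket_path(
        "/home/feiteng/Documents/MFA/pg_mfa_global_socket/.s.PGSQL.5432"
    ))
```

If this reports "not-a-socket", removing the stray file and restarting the server so that it can create its own socket is probably a better next step than retrying the client.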