usc-isi-i2 / datamart-api

MIT License

Sometimes posting large dataset twice causes datamart to throw exception #71

Open kyao opened 4 years ago

kyao commented 4 years ago

I tried posting the FSIall_AN dataset twice, using the file test/test_data/FSI_all_Annotated.xlsx from the performance_test branch.

The exception occurs when datamart tries to delete the edges from the previous post, 50,000 edge ids at a time.

[2020-10-14 10:51:10,916] ERROR in app: Exception on /datasets/FSIall_AN/annotated [POST]
Traceback (most recent call last):
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1284, in _execute_context
    cursor, statement, parameters, context
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
    cursor.execute(statement, parameters)
psycopg2.OperationalError: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/flask_restful/__init__.py", line 468, in wrapper
    resp = resource(*args, **kwargs)
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/flask/views.py", line 89, in view
    return self.dispatch_request(*args, **kwargs)
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/flask_restful/__init__.py", line 583, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/home/ktyao/dev/dsbox/datamart-api/api/annotated/main.py", line 8, in post
    return post_data.process(dataset)
  File "/home/ktyao/dev/dsbox/datamart-api/api/annotated/post.py", line 109, in process
    import_kgtk_dataframe(kgtk_exploded_df, is_file_exploded=True)
  File "/home/ktyao/dev/dsbox/datamart-api/db/sql/kgtk.py", line 175, in import_kgtk_dataframe
    import_kgtk_tsv(tsv_path, config=config)
  File "/home/ktyao/dev/dsbox/datamart-api/db/sql/kgtk.py", line 153, in import_kgtk_tsv
    session.execute(delete_q)
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1292, in execute
    clause, params or {}
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1020, in execute
    return meth(self, multiparams, params)
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_clauseelement
    distilled_params,
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1324, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1518, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
    raise exception
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1284, in _execute_context
    cursor, statement, parameters, context
  File "/home/ktyao/anaconda3/envs/datamart/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
(Background on this error at: http://sqlalche.me/e/e3q8)
Exception ignored in: <module 'threading' from '/home/ktyao/anaconda3/envs/datamart/lib/python3.7/threading.py'>

Here is part of the SQL statement from the exception. It's too long to include the whole thing.

image
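For readers unfamiliar with the pattern, deleting by edge id in fixed-size batches can be sketched as below. This is only an illustration, not the code in db/sql/kgtk.py: the `edges` table, its `id` column, and both helper functions are hypothetical, and it uses an in-memory SQLite database from the standard library rather than PostgreSQL through SQLAlchemy so that it runs on its own. The batch size is kept small because each id becomes one bound parameter in the statement.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` ids."""
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def delete_edges_in_batches(conn: sqlite3.Connection,
                            edge_ids: Iterable[str],
                            batch_size: int = 500) -> int:
    """Delete rows from the hypothetical `edges` table, one statement per batch."""
    deleted = 0
    for batch in chunked(edge_ids, batch_size):
        # One "?" placeholder per id keeps the values out of the SQL text.
        placeholders = ",".join("?" for _ in batch)
        cur = conn.execute(
            f"DELETE FROM edges WHERE id IN ({placeholders})", batch
        )
        deleted += cur.rowcount
        conn.commit()  # commit per batch so each transaction stays small
    return deleted


# Small demonstration: 1200 edges, of which the first 1000 are deleted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO edges (id) VALUES (?)",
                 [(f"E{i}",) for i in range(1200)])
conn.commit()

deleted = delete_edges_in_batches(conn, (f"E{i}" for i in range(1000)))
remaining = conn.execute("SELECT COUNT(*) FROM edges").fetchone()[0]
```

A smaller batch size means more round trips but a shorter statement and a smaller transaction each time. Whether that would avoid the dropped connection reported here is untested.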

zmbq commented 4 years ago

Posting the same annotated file doesn't seem to delete it; it seems to try to add the new edges (this seems intentional). I couldn't see how to get this delete code to run.

Deleting the dataset through its own request does work, and rather quickly. It doesn't delete by edge id, though. I couldn't find any code that DELETEs by edge id, or that uses SQLAlchemy to delete objects.

saggu commented 4 years ago

@zmbq, if you do a PUT instead of a POST, the existing data for the variables is deleted first.
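As a rough illustration of the difference, the sketch below builds the two kinds of request without sending them. The path /datasets/FSIall_AN/annotated is taken from the log above. The base URL, the helper name `build_annotated_request`, and the raw-bytes body are assumptions made for the example; the real API may well expect a multipart file upload.

```python
from urllib.request import Request


def build_annotated_request(base_url: str, dataset_id: str,
                            body: bytes, replace: bool) -> Request:
    """Build (but do not send) an upload request for an annotated file.

    replace=True  -> PUT: existing data for the variables is deleted first
    replace=False -> POST: existing data is kept
    """
    url = f"{base_url.rstrip('/')}/datasets/{dataset_id}/annotated"
    method = "PUT" if replace else "POST"
    # Assumption: a raw byte body stands in for whatever upload
    # encoding the service actually expects.
    return Request(url, data=body, method=method,
                   headers={"Content-Type": "application/octet-stream"})


# Hypothetical local server address, used only for illustration.
put_req = build_annotated_request("http://localhost:5000", "FSIall_AN",
                                  b"file bytes", replace=True)
post_req = build_annotated_request("http://localhost:5000", "FSIall_AN",
                                   b"file bytes", replace=False)
```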

But the issue posted by @kyao is not about that.

What is the purpose of this line?

https://github.com/usc-isi-i2/datamart-api/blob/development/db/sql/kgtk.py#L153