LuteOrg / lute-v3

LUTE = Learning Using Texts: learn languages through reading. Python/Flask.
MIT License

Term Import IntegrityError #454

Closed: cblanken closed this issue 1 week ago

cblanken commented 1 month ago

Description: Importing terms with the attached .csv file causes an IntegrityError. See import.csv.

Here is a stack trace of the error. The error occurs as of commit 2b2911ea869822468ff7fce9e38fd742796ff0a4 on the develop branch.

Traceback (most recent call last):
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2108, in _exec_insertmany_context
    dialect.do_execute(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: tags.TgText

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/flask/app.py", line 2213, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/flask/app.py", line 2193, in wsgi_app
    response = self.handle_exception(e)
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/flask/app.py", line 2190, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/flask/app.py", line 1486, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/user/lute-dev/lute/termimport/routes.py", line 38, in term_import_index
    stats = import_file(
  File "/home/user/lute-dev/lute/termimport/service.py", line 31, in import_file
    return _do_import(import_data, create_terms, update_terms, new_as_unknowns)
  File "/home/user/lute-dev/lute/termimport/service.py", line 263, in _do_import
    t = repo.find(lang.id, hsh["term"])
  File "/home/user/lute-dev/lute/term/model.py", line 124, in find
    spec = self._search_spec_term(langid, text)
  File "/home/user/lute-dev/lute/term/model.py", line 271, in _search_spec_term
    lang = Language.find(langid)
  File "/home/user/lute-dev/lute/models/language.py", line 173, in find
    return db.session.query(Language).filter(Language.id == language_id).first()
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2743, in first
    return self.limit(1)._iter().first()  # type: ignore
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2842, in _iter
    result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2262, in execute
    return self._execute_internal(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2123, in _execute_internal
    ) = compile_state_cls.orm_pre_session_exec(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/context.py", line 551, in orm_pre_session_exec
    session._autoflush()
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2939, in _autoflush
    raise e.with_traceback(sys.exc_info()[2])
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2928, in _autoflush
    self.flush()
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 4179, in flush
    self._flush(objects)
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 4315, in _flush
    transaction.rollback(_capture_exception=True)
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 4275, in _flush
    flush_context.execute()
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
    rec.execute(self)
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
    result = connection.execute(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
    return meth(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1635, in _execute_clauseelement
    ret = self._execute_context(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1839, in _execute_context
    return self._exec_insertmany_context(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2116, in _exec_insertmany_context
    self._handle_dbapi_exception(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2108, in _exec_insertmany_context
    dialect.do_execute(
  File "/home/user/lute-dev/.venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(sqlite3.IntegrityError) UNIQUE constraint failed: tags.TgText
[SQL: INSERT INTO tags ("TgText", "TgComment") VALUES (?, ?) RETURNING "TgID"]
[parameters: ('pron', '')]
(Background on this error at: https://sqlalche.me/e/20/gkpj)

To Reproduce: The error can be reproduced by importing the terms on the default install.


cblanken commented 1 month ago

Okay, it turns out this was just a few malformed lines in the input csv. However, I think it'd still be a good idea to add some error handling and relevant tests so that the user doesn't get an unhelpful Internal Server Error message like this.

image
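As a rough illustration of the kind of error handling suggested here, the sketch below reproduces the UNIQUE constraint failure with the standard-library sqlite3 module and turns it into a readable message instead of an unhandled exception. The table layout is inferred from the SQL in the stack trace, and `insert_tags` is a hypothetical helper, not a function from the Lute code base.

```python
import sqlite3


def insert_tags(conn, tag_texts):
    """Hypothetical helper: insert tags, returning an error message on failure."""
    try:
        # The connection context manager commits on success and rolls back
        # the whole batch if any statement raises.
        with conn:
            conn.executemany(
                'INSERT INTO tags ("TgText", "TgComment") VALUES (?, ?)',
                [(text, "") for text in tag_texts],
            )
    except sqlite3.IntegrityError as exc:
        return f"Import failed: {exc}"
    return None


# Assumed schema: only the unique constraint on TgText matters for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tags ("TgID" INTEGER PRIMARY KEY, '
    '"TgText" TEXT UNIQUE, "TgComment" TEXT)'
)

# The duplicated 'pron' tag triggers the same constraint as in the trace.
error = insert_tags(conn, ["pron", "pron"])
print(error)
```

In a web route, a message like this could be shown to the user in place of the generic Internal Server Error page.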

One of the culprits seemed to be line 673, among others, where a single field was accidentally split into many comma-separated values.

German,dat,"→ Alternative form of das
→ Alternative form of das
→ it
→ Alternative form of dass
",,"kaikki-import,a,r,t,i,c,l,e,pron,pron,conj",IPA: /dat/ or /dɐt/ or /dət/

Note: the a,r,t,i,c,l,e should be article.
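To see what the importer is handed for this row, the sketch below parses it with Python's csv module and lists any tags that appear more than once. The column index for the tags field is an assumption based on this sample row only; the real file's header may order the columns differently.

```python
import csv
import io
from collections import Counter

# The malformed row quoted above, including its multi-line quoted field.
ROW_TEXT = (
    'German,dat,"→ Alternative form of das\n'
    '→ Alternative form of das\n'
    '→ it\n'
    '→ Alternative form of dass\n'
    '",,"kaikki-import,a,r,t,i,c,l,e,pron,pron,conj",'
    'IPA: /dat/ or /dɐt/ or /dət/\n'
)

TAGS_COLUMN = 4  # assumption: position of the tags field in this sample row

# csv.reader handles the quoted newlines and returns the row as one record.
fields = next(csv.reader(io.StringIO(ROW_TEXT)))
tags = fields[TAGS_COLUMN].split(",")

# Any tag seen more than once would be inserted twice into the tags table.
duplicates = [tag for tag, count in Counter(tags).items() if count > 1]

print(len(fields))
print(tags)
print(duplicates)
```

The row itself parses as a single well-formed record; the problem is inside the tags field, where the split letters are harmless but pron appears twice.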

jzohrab commented 3 weeks ago

Right, I thought I'd handled all the bad cases but obviously not! Thank you for the ticket with the cause, this should be an easy test case and fix.

jzohrab commented 1 week ago

This may have been due to the duplication of pron in the list of tags ... I just added a simple test with a file like below

            language,translation,term,parent,status,tags,pronunciation
            Spanish,cat,gato,,1,"animal,animal",GAH-toh

and it's failing with the same error. Working on that now, thanks.
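One possible guard against this, sketched below under the assumption that tags arrive as a single comma-separated field, is to drop repeated tag names before creating any tag records. `unique_tags` is a hypothetical helper for illustration and is not necessarily how the fix was implemented.

```python
import csv
import io

# The test file from the comment above.
SAMPLE = (
    "language,translation,term,parent,status,tags,pronunciation\n"
    'Spanish,cat,gato,,1,"animal,animal",GAH-toh\n'
)


def unique_tags(tag_field):
    """Hypothetical helper: split a tags field, dropping blanks and repeats."""
    names = [name.strip() for name in tag_field.split(",")]
    # dict.fromkeys keeps the first occurrence of each name, in order.
    return list(dict.fromkeys(name for name in names if name))


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
tags = unique_tags(rows[0]["tags"])
print(tags)
```

With the repeats removed, each tag name would be inserted at most once per row, so the unique constraint on tags.TgText is not violated by a single term's tag list.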

jzohrab commented 1 week ago

Also added a check for too many or too few fields in the data, which the code wasn't handling correctly. :-P Whoops. Thanks.
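A field-count check of this kind could look roughly like the sketch below, which compares each data row against the header length. `find_bad_rows` is a hypothetical name, and the actual check in Lute may differ.

```python
import csv
import io


def find_bad_rows(csv_text):
    """Hypothetical helper: return (line number, field count) for each
    row whose field count differs from the header's."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    bad = []
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        if len(row) != len(header):
            # line_num is the physical line where the record ended.
            bad.append((reader.line_num, len(row)))
    return bad


data = (
    "language,term,translation\n"
    "Spanish,gato,cat\n"
    "Spanish,perro\n"
    "Spanish,casa,house,extra\n"
)
problems = find_bad_rows(data)
print(problems)
```

Reporting the line number alongside the count lets the user find the offending row in a large import file.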

jzohrab commented 1 week ago

Fixed in develop, will be in next launch.

jzohrab commented 1 week ago

Launched in 3.5.2