jonbiemond / BCIT-Flex

A website for easily navigating BCIT courses and offerings.
https://www.bcitflex.tech
MIT License
10 stars 8 forks source link

Error scraping a previously deleted record #90

Open jonbiemond opened 6 months ago

jonbiemond commented 6 months ago

Upon running the load-db command a duplicate key violation is raised by the database when the script tries to insert a record that is marked as deleted.

I believe this is happening because on merge SQLAlchemy does not see previously deleted records due to the soft delete hook. A work around may be pre-selecting objects into the identity map.

Trace:

(.venv) root@bcitflex:~# flask --app bcitflex load-db
/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py:400: SAWarning: Object of type <PrerequisiteOr> not in session, add operation along 'Course.prerequisites_of' will not proceed
  session.commit()
Traceback (most recent call last):
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "pk_offering"
DETAIL:  Key (offering_id)=(964) already exists.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.venv/bin/flask", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/flask/cli.py", line 1064, in main
    cli.main()
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 454, in load_db_command
    bcit_to_sql(db_url, all_subjects)
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 441, in bcit_to_sql
    raise exc
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 433, in bcit_to_sql
    count = load_courses(session, courses)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 400, in load_courses
    session.commit()
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1923, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1239, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1214, in _prepare_impl
    self.session.flush()
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4179, in flush
    self._flush(objects)
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4314, in _flush
    with util.safe_reraise():
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 147, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4275, in _flush
    flush_context.execute()
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
    rec.execute(self)
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 1226, in _emit_insert_statements
    result = connection.execute(
             ^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
    return meth(
           ^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1635, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1844, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1984, in _exec_single_context
    self._handle_dbapi_exception(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "pk_offering"
DETAIL:  Key (offering_id)=(964) already exists.

[SQL: INSERT INTO offering (offering_id, crn, instructor, price, duration, status, course_id, term_id, created_at, updated_at, deleted_at) VALUES (%(offering_id)s, %(crn)s, %(instructor)s, %(price)s, %(duration)s, %(status)s, %(course_id)s, %(term_id)s, now(), now(), %(deleted_at)s) RETURNING offering.created_at, offering.updated_at]
[parameters: {'offering_id': 964, 'crn': '49028', 'instructor': 'Chris Harris', 'price': 509.83, 'duration': '12 weeks', 'status': 'In Progress', 'course_id': 27, 'term_id': '202410', 'deleted_at': None}]
(Background on this error at: https://sqlalche.me/e/20/gkpj)