Nelly-Barret / BETTER-fairificator

The fairification tools for BETTER project.
https://www.better-health-project.eu/

PyMongo is broken #24

Closed Nelly-Barret closed 3 weeks ago

Nelly-Barret commented 3 weeks ago

Since this afternoon, I have been getting the following error:

/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/bin/python /Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/src/main.py --hospital_name=IT_BUZZI_UC1 --database_name=mytestd --metadata_filepath=datasets/metadata/IT-Buzzi-variables.csv --data_filepath=datasets/data/BUZZI/screening.csv --drop=False 
Console output is saving to: /Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/working-dir/better_database/log-2024-06-04.log
2024-06-05:16:04:01, DEBUG [main:<module>:36] False
2024-06-05:16:04:01, DEBUG [BetterConfig:write_to_file:245] ./working-dir/mytestd/properties.ini
2024-06-05:16:04:01, INFO  [main:<module>:81] Selected hospital name: IT_BUZZI_UC1
2024-06-05:16:04:01, INFO  [main:<module>:82] The database name is mytestd
2024-06-05:16:04:01, INFO  [main:<module>:83] The connection string is: mongodb://localhost:27017/
2024-06-05:16:04:01, INFO  [main:<module>:84] The database will be dropped: False
2024-06-05:16:04:01, INFO  [main:<module>:85] The metadata file is located at: ./working-dir/mytestd/metadata-IT_BUZZI_UC1.csv
2024-06-05:16:04:01, INFO  [main:<module>:86] The data file is located at: datasets/data/BUZZI/screening.csv
2024-06-05:16:04:01, DEBUG [Database:__init__:37] {'FILES': [{'working_dir': './working-dir'}, {'working_dir_current': './working-dir/mytestd'}, {'metadata_filepath': './working-dir/mytestd/metadata-IT_BUZZI_UC1.csv'}, {'data_filepath': 'datasets/data/BUZZI/screening.csv'}], 'DATABASE': [{'connection': 'mongodb://localhost:27017/'}, {'name': 'mytestd'}, {'drop': False}], 'HOSPITAL': [{'name': 'IT_BUZZI_UC1'}], 'SYSTEM': [{'python_version': '3.12.3 (v3.12.3:f6650f9ad7, Apr  9 2024, 08:18:48) [Clang 13.0.0 (clang-1300.0.29.30)]'}, {'pymongo_version': '4.7.2'}, {'mongodb_version': '7.0.8'}, {'execution_date': '2024-06-05 16:04:01.433349'}, {'platform': 'macOS-14.5-x86_64-i386-64bit'}, {'platform_version': 'Darwin Kernel Version 23.5.0: Wed May  1 20:09:52 PDT 2024; root:xnu-10063.121.3~5/RELEASE_X86_64'}, {'user': 'nelly'}]}
2024-06-05:16:04:01, DEBUG [BetterConfig:write_to_file:245] ./working-dir/mytestd/properties.ini
2024-06-05:16:04:01, DEBUG [Database:__init__:40] the connection string is: mongodb://localhost:27017/
2024-06-05:16:04:01, DEBUG [Database:__init__:41] the new MongoClient is: MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True)
2024-06-05:16:04:01, DEBUG [Database:__init__:42] the database is: Database(MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True), 'testtt')
2024-06-05:16:04:01, INFO  [Database:__init__:45] The MongoDB client could be set up properly.
Traceback (most recent call last):
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/src/main.py", line 88, in <module>
    etl = ETL(config=config)
          ^^^^^^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/src/etl/ETL.py", line 15, in __init__
    self.database = Database(self.config)
                    ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/src/database/Database.py", line 52, in __init__
    appcoll.insert_one(document)
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/collection.py", line 658, in insert_one
    self._insert_one(
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/collection.py", line 598, in _insert_one
    self.__database.client._retryable_write(
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/mongo_client.py", line 1569, in _retryable_write
    return self._retry_with_session(retryable, func, s, bulk, operation, operation_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/mongo_client.py", line 1455, in _retry_with_session
    return self._retry_internal(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/_csot.py", line 108, in csot_wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/mongo_client.py", line 1501, in _retry_internal
    ).run()
      ^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/mongo_client.py", line 2347, in run
    return self._read() if self._is_read else self._write()
                                              ^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/mongo_client.py", line 2450, in _write
    self._server = self._get_server()
                   ^^^^^^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/mongo_client.py", line 2433, in _get_server
    return self._client._select_server(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nelly/Documents/boulot/postdoc-polimi/BETTER-fairificator/.venv-better-fairificator/lib/python3.12/site-packages/pymongo/mongo_client.py", line 1305, in _select_server
    session._transaction.drop()
    ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: '_Transaction' object has no attribute 'drop'
MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True)
Database(MongoClient(host=['127.0.0.1:27017'], document_class=dict, tz_aware=False, connect=True), 'testtt')

Process finished with exit code 1

Even after switching to the main branch, I cannot insert into any database from Python. However, I can insert data from the mongosh command line within Compass.

In particular, I do not see the databases in the list (show databases), which might be a symptom.

The error message may not be directly related to the actual problem and may not show its root cause.

Nelly-Barret commented 3 weeks ago

I CAN insert from mongosh

(Screenshot 2024-06-05 at 16 10 12)

Nelly-Barret commented 3 weeks ago

But I CAN'T from pymongo!

(I tried it from the Python command line.)

>>> from pymongo import MongoClient
>>> client = MongoClient()
>>> print(client)
MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True)
>>> db = client["mytest"]
>>> print(db)
Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'mytest')
>>> db["mytable"].insert_one({"a": 23})
Traceback (most recent call last):
  .....
AttributeError: '_Transaction' object has no attribute 'drop'

However, calling find() apparently works, so the problem seems to be related to the insert itself rather than to the database or the connection:

❯ python3
Python 3.12.3 (v3.12.3:f6650f9ad7, Apr  9 2024, 08:18:48) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pymongo import MongoClient
>>> client = MongoClient()
>>> print(client)
MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True)
>>> db = client["mytestdd"]
>>> print(db)
Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'mytestdd')
>>> db["mytable"].find()
<pymongo.cursor.Cursor object at 0x106c84ec0>
>>> db["mytable"].find({})
<pymongo.cursor.Cursor object at 0x106c85040>
>>> db["mytable"].find({"de": "z3"})
<pymongo.cursor.Cursor object at 0x106c85070>

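But I cannot iterate over the returned cursor (find() only creates a lazy cursor; the query is sent to the server when the cursor is iterated):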

>>> for my_element in db["mytable"].find({"de": "z3"}):
...   print(my_element)
Traceback (most recent call last):
  .....
AttributeError: '_Transaction' object has no attribute 'drop'

Nelly-Barret commented 3 weeks ago

I have fixed the problem 🥳

What I did:

1. Remove any mongo installation and local configs

2. Remove Mongo data directories

3. Kill any instance that could still run

The following command shows the PIDs of the running Mongo processes, which can then be killed:

sudo lsof -iTCP -sTCP:LISTEN | grep mongo

4. Re-install MongoDB

5. Create a new MongoDB directory

For instance, in my Documents folder (not under /usr or similar, where permission issues can occur).

6. Start MongoDB by hand in one terminal with the new data directory

mongod --dbpath /Users/nelly/Documents/mongodb-data

Nelly-Barret commented 3 weeks ago

The above fix is not really a fix: I can insert data from a plain Python 3 command line, but not with my venv activated. I will recreate my venv and reinstall the dependencies to see whether this solves the problem.
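
Recreating the venv can be sketched in Python with the standard venv module. This is only a minimal sketch under assumptions: the helper name recreate_venv and the optional requirements file are hypothetical and are not taken from the project.

```python
import subprocess
import sys
import venv
from pathlib import Path


def recreate_venv(env_dir, requirements=None, with_pip=True):
    """Wipe and rebuild the virtual environment at env_dir.

    `requirements` is a hypothetical path to a pip requirements file;
    when given, the dependencies are reinstalled into the fresh venv.
    Returns the path of the venv's Python interpreter.
    """
    env_dir = Path(env_dir)
    # clear=True deletes the existing contents of env_dir first,
    # including any library files that were modified by accident.
    venv.create(env_dir, clear=True, with_pip=with_pip)

    # The interpreter lives in Scripts/ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        python = env_dir / "Scripts" / "python.exe"
    else:
        python = env_dir / "bin" / "python"

    if requirements is not None:
        # Install with the venv's own interpreter so packages land in the venv.
        subprocess.run(
            [str(python), "-m", "pip", "install", "-r", str(requirements)],
            check=True,
        )
    return python
```

For this project the call might look like recreate_venv(".venv-better-fairificator", "requirements.txt"), assuming the dependencies are listed in such a file.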

Nelly-Barret commented 3 weeks ago

Yes, this solves the problem. Why? Because yesterday I renamed my method reset() to drop(), and the rename was also applied inside the libraries' Python files in the venv... 👹
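
One way to detect this kind of accidental edit is to compare each installed file with the hash recorded at install time. The following is a minimal sketch, assuming the package was installed by pip from a wheel so that its dist-info RECORD file lists hashes; modified_files is a hypothetical helper name.

```python
import base64
import hashlib
from importlib import metadata


def modified_files(dist):
    """Return the installed files of `dist` whose content differs from RECORD.

    `dist` is an importlib.metadata.Distribution, for example the result
    of metadata.distribution("pymongo").
    """
    changed = []
    for entry in dist.files or []:
        if entry.hash is None:
            # Some entries (e.g. RECORD itself) carry no hash; skip them.
            continue
        path = entry.locate()
        if not path.is_file():
            continue
        # RECORD stores the digest as URL-safe base64 without '=' padding.
        digest = hashlib.new(entry.hash.mode, path.read_bytes()).digest()
        encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
        if encoded != entry.hash.value:
            changed.append(str(entry))
    return changed
```

Run with the venv's interpreter, modified_files(metadata.distribution("pymongo")) would be expected to list files such as pymongo/mongo_client.py if they had been edited; an empty list means that no recorded file has changed.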

Nelly-Barret commented 3 weeks ago

I marked the .venv folder as excluded in PyCharm to prevent this from happening again.