Open tschaffter opened 8 months ago
Added to Sprint 24.4
Moved to Backlog
Update: 05/13/2024
Challenges: N/A
Remaining Tasks: Implement the load step of the ETL process so that the generated dataset is accessible in MariaDB.
Update: 05/15/2024 Challenges:
@tschaffter I have written code to connect to MariaDB using Python. The OC_DB_URL value in the .env file, jdbc:mysql://openchallenges-mariadb:3306/edam_etl, isn't used according to the documentation I have found for connecting to MariaDB (Resource1 and Resource2).
I've received this error when using jdbc:mysql://openchallenges-mariadb:3306/edam_etl as the Host:
Error connecting to MariaDB Platform: Plugin jdbc:mysql could not be loaded: /usr/lib/x86_64-linux-gnu/libmariadb3/plugin/jdbc:mysql.so: cannot open shared object file: No such file or directory
Warning: command "poetry run python src/main.py" exited with non-zero status code
I get this error when I change the host to openchallenges-mariadb and don't use the OC_DB_URL in the .env file:
Error connecting to MariaDB Platform: Can't connect to local server through socket '/run/mysqld/mysqld.sock' (2)
Warning: command "poetry run python src/main.py" exited with non-zero status code
I'm wondering if the variables are being assigned incorrectly.
I believe that we have solved the issue since your last message. Feel free to get rid of the config variable OC_DB_URL. We use it in the OC microservice because the DB client we use in Java accepts this URL as a parameter, which may not be the case for the Python DB client you use.
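For reference, a minimal sketch of how the Python side could build its connection settings from separate variables instead of a JDBC URL. The variable names (DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME) are assumptions for illustration, not the names used in the project's .env file. The check for an empty host is there because, as far as I know, the connector falls back to the local Unix socket when no host (or "localhost") is passed, which would match the '/run/mysqld/mysqld.sock' error above.

```python
import os

# Hypothetical variable names; the project's .env file may use different ones.
REQUIRED_VARS = ("DB_HOST", "DB_PORT", "DB_USER", "DB_PASSWORD", "DB_NAME")


def connection_kwargs(env=None):
    """Build keyword arguments for mariadb.connect() from environment variables."""
    env = os.environ if env is None else env

    # Fail loudly on unset or empty variables instead of silently
    # connecting with defaults.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing or empty settings: " + ", ".join(missing))

    host = env["DB_HOST"]
    # The Python client expects a bare host name, not a JDBC-style URL.
    if "://" in host or host.startswith("jdbc:"):
        raise ValueError("DB_HOST must be a host name, not a URL: " + host)

    return {
        "host": host,
        "port": int(env["DB_PORT"]),
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
        "database": env["DB_NAME"],
    }


def connect(env=None):
    """Open a connection; mariadb is imported lazily so the helper above
    can be used (and tested) without the connector installed."""
    import mariadb

    return mariadb.connect(**connection_kwargs(env))
```

With the values from this thread, DB_HOST would be openchallenges-mariadb, DB_PORT 3306, and DB_NAME edam_etl.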
@tschaffter For PR #2680, it looks like the CI/pr (pull_request) check is failing because it cannot find the MariaDB Connector/C installation required by the mariadb package, which doesn't support PEP 517 builds. I was able to bypass this in the dev container by running the indicated command in the terminal, but I guess that doesn't transfer to the PR. The error says the connector needs to be preinstalled, but I'm unsure how that works within microservices. Should I create a script within the app folder that performs this operation?
Using virtualenv: /workspaces/sage-monorepo/apps/openchallenges/edam-etl/.venv
Installing dependencies from lock file
Package operations: 1 install, 0 updates, 0 removals
Installing mariadb (1.1.10)
ChefBuildError
Backend subprocess exited when trying to invoke get_requires_for_build_wheel
/bin/sh: 1: mariadb_config: not found
Traceback (most recent call last):
  File "/etc/poetry/venv/lib/python3.10/site-packages/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
    main()
  File "/etc/poetry/venv/lib/python3.10/site-packages/pyproject_hooks/_in_process/_in_process.py", line 335, in main
    json_out['return_val'] = hook(**hook_input['kwargs'])
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/etc/poetry/venv/lib/python3.10/site-packages/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
    return hook(config_settings)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tmppwky36mf/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
    return self._get_build_requires(config_settings, requirements=['wheel'])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tmppwky36mf/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
    self.run_setup()
  File "/tmp/tmppwky36mf/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 487, in run_setup
    super().run_setup(setup_script=setup_script)
  File "/tmp/tmppwky36mf/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 311, in run_setup
    exec(code, locals())
  File "<string>", line 27, in <module>
  File "/tmp/tmp783xeam4/mariadb-1.1.10/mariadb_posix.py", line 62, in get_config
    cc_version = mariadb_config(config_prg, "cc_version")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tmp783xeam4/mariadb-1.1.10/mariadb_posix.py", line 28, in mariadb_config
    raise EnvironmentError(
OSError: mariadb_config not found.
This error typically indicates that MariaDB Connector/C, a dependency which must be preinstalled, is not found.
If MariaDB Connector/C is not installed, see installation instructions
If MariaDB Connector/C is installed, either set the environment variable MARIADB_CONFIG or edit the configuration file 'site.cfg' to set the 'mariadb_config' option to the file location of the mariadb_config utility.
at /etc/poetry/venv/lib/python3.10/site-packages/poetry/installation/chef.py:164 in _prepare
    160│
    161│ error = ChefBuildError("\n\n".join(message_parts))
    162│
    163│ if error is not None:
  → 164│ raise error from None
    165│
    166│ return path
    167│
    168│ def _prepare_sdist(self, archive: Path, destination: Path | None = None) -> Path:
Note: This error originates from the build backend, and is likely not a problem with poetry but with mariadb (1.1.10) not supporting PEP 517 builds. You can verify this by running 'pip wheel --no-cache-dir --use-pep517 "mariadb (==1.1.10)"'.
@mdsage1 Hint: Look at the files in the EDAM ETL project folder, in particular project.json. There is a perfect place in there where the install command could be added.
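To make the failure mode in the log above easier to spot before the install step runs, a small preflight check could look for the mariadb_config utility the same way the error message describes: first through the MARIADB_CONFIG environment variable, then on PATH. This is only a sketch; the function name is made up here, and where (or whether) such a check would be called from project.json is left open.

```python
import os
import shutil


def find_mariadb_config(env=None):
    """Return the path of the mariadb_config utility, or None if not found.

    Lookup order follows the build error message: the MARIADB_CONFIG
    environment variable first, then the directories on PATH.
    """
    env = os.environ if env is None else env

    explicit = env.get("MARIADB_CONFIG")
    if explicit:
        # An explicit setting must point at an executable file.
        if os.path.isfile(explicit) and os.access(explicit, os.X_OK):
            return explicit
        return None

    return shutil.which("mariadb_config", path=env.get("PATH"))


if __name__ == "__main__":
    location = find_mariadb_config()
    if location is None:
        # Assumption: on Debian/Ubuntu images the utility usually ships
        # with the libmariadb-dev package.
        raise SystemExit(
            "mariadb_config not found: install MariaDB Connector/C "
            "or set MARIADB_CONFIG before installing the mariadb package"
        )
    print("Found mariadb_config at " + location)
```

Run in CI before the dependency install, this would exit with a short message instead of the long ChefBuildError traceback when the connector is missing.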
What product(s) is this story for?
OpenChallenges
As a user, I want
No response
Description
Depends on #2547
Load the transformed EDAM data generated in #2547 into the OC Challenge Service DB.
Acceptance criteria
Running either of the following commands downloads, transforms, and loads the EDAM data into the OC Challenge Service DB:
nx serve openchallenges-edam-etl
nx serve-detach openchallenges-edam-etl
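As a rough illustration of what the load step behind these commands could look like, here is a sketch that inserts transformed rows through a DB-API connection. The table name (edam_concept), the column names, and the CSV layout are assumptions, not the actual OC Challenge Service schema. The demo uses an in-memory SQLite database as a stand-in so it runs without a MariaDB server; to my knowledge the mariadb connector accepts the same `?` placeholders, so a connection from mariadb.connect() could be passed in instead.

```python
import csv
import io
import sqlite3

# Hypothetical table and columns; adjust to the real schema.
INSERT_SQL = "INSERT INTO edam_concept (class_id, preferred_label) VALUES (?, ?)"


def read_concepts(csv_file):
    """Read (class_id, preferred_label) tuples from a CSV file object."""
    reader = csv.DictReader(csv_file)
    return [(row["class_id"], row["preferred_label"]) for row in reader]


def load_concepts(conn, rows):
    """Insert all rows in one transaction; roll back if any insert fails."""
    cur = conn.cursor()
    try:
        cur.executemany(INSERT_SQL, rows)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
    return len(rows)


def demo():
    """Load two placeholder rows into an in-memory SQLite stand-in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edam_concept ("
        "class_id TEXT PRIMARY KEY, preferred_label TEXT NOT NULL)"
    )
    sample = io.StringIO(
        "class_id,preferred_label\n"
        "example:0001,First concept\n"
        "example:0002,Second concept\n"
    )
    loaded = load_concepts(conn, read_concepts(sample))
    count = conn.execute("SELECT COUNT(*) FROM edam_concept").fetchone()[0]
    conn.close()
    return loaded, count
```

Committing once at the end keeps the table unchanged if a row is rejected partway through, which makes the ETL safe to rerun after fixing the input.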
Tasks
.env
Anything else?
See #2524 and its PR to get familiar with the environment of the project openchallenges-edam-etl.
Have you linked this story to a GitHub Project?