openml / openml.org

New OpenML website
https://new.openml.org
BSD 3-Clause "New" or "Revised" License

Updated python dependencies & configurable docker builds #325

Closed josvandervelde closed 6 months ago

josvandervelde commented 6 months ago

Goal

Running different versions of the OpenML website: one in production (k8s), one in test. To make that possible, all URLs should be configurable, both in the Python backend and in the React frontend.
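A minimal sketch of what "configurable" could look like on the Python side, assuming each URL is read from an environment variable with a local fallback. The variable names, the defaults, and the `get_url` helper are hypothetical and only serve as an illustration; they are not taken from this PR.

```python
import os

# Hypothetical variable names and defaults, for illustration only.
DEFAULTS = {
    "BACKEND_SERVER": "http://localhost:5000",
    "ELASTICSEARCH_SERVER": "http://localhost:9200",
}


def get_url(name, environ=None):
    """Return the URL configured for `name`, without a trailing slash.

    Looks in `environ` (os.environ by default) first and falls back to
    DEFAULTS when the variable is unset or empty.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name) or DEFAULTS.get(name)
    if value is None:
        raise KeyError(f"No URL configured for {name!r}")
    return value.rstrip("/")
```

With a helper like this, the production and test deployments can run the same code and differ only in the environment file they load.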

Changes

In this PR (sorry, I could have split it into separate PRs):

1) Updated Python requirements. This needed to be done anyway, and I encountered some errors with the old requirements.txt (for example, newer pandas versions no longer support df.append, and no pandas version was pinned). Also, many entries in requirements.txt were dependencies of dependencies; I removed them.
2) Updated Python code to reflect the updated requirements, avoiding deprecated or removed functionality:

  • pandas: df.append -> pd.concat
  • dash: renamed imports
  • flask_jwt_extended: @jwt_required -> @jwt_required() and rename of blacklist to blocklist

3) Updated the environment variables: moved from .flaskenv to .env (easy to use for both React and Flask), renamed some and added some (e.g., DO_SEND_EMAIL).
4) Updated Python & React code so that all URLs are configurable using env vars.
5) Updated React so that it's compatible with ElasticSearch version 6 (current prod), 7 and 8 (new prod), using env var REACT_APP_ELASTICSEARCH_VERSION_MAYOR.
6) Updated docker.
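For reference, the pandas change in item 2 follows the pattern sketched below. The data frames and column names are made up for illustration and do not come from the OpenML code base.

```python
import pandas as pd

# Illustrative data only; not taken from the OpenML code base.
existing = pd.DataFrame({"name": ["dataset_a"], "instances": [100]})
new_row = pd.DataFrame({"name": ["dataset_b"], "instances": [200]})

# Before (DataFrame.append was removed in pandas 2.0):
#   combined = existing.append(new_row, ignore_index=True)

# After: pd.concat takes a list of frames; ignore_index=True renumbers
# the rows 0..n-1 instead of keeping each frame's original index.
combined = pd.concat([existing, new_row], ignore_index=True)
```

Unlike the old method call, pd.concat accepts any number of frames in one call, so appending inside a loop is usually better replaced by collecting the frames in a list and concatenating once.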

Similar to PR 274

This PR duplicates most of the work of PR https://github.com/openml/openml.org/pull/274/. I took the liberty of taking inspiration from that PR without using it directly, because it is old. We should probably close that PR if we merge this one, although there are some additions in that PR that I did not include here:

How to test

The Python unit tests did not seem to be up to date. They were not easily runnable, and the coverage seemed small. I ignored them (let me know if they should be brought up to date). I tested this by running it locally (the .env variant) and deploying it to https://k8s.openml.org (the .env.k8s variant), and:

1) Listing datasets, tasks etc.
2) Clicking on a dataset/task etc.
3) Downloading dataset + json + xml + croissant
4) Running analysis of dataset/task

Current deployment

Docker image openml/frontend:k8s (see https://hub.docker.com/r/openml/frontend/tags) runs on https://k8s.openml.org/. Docker image openml/frontend:latest is currently not used (it could be used for the current production). I haven't created a test image yet; I'll do that in a separate PR.