slub / ocrd_manager

frontend for ocrd_controller and adapter towards ocrd_kitodo
MIT License
11 stars 3 forks source link

Replaces flask with fastapi monitor #45

Closed SvenMarcus closed 1 year ago

SvenMarcus commented 1 year ago

Adds first draft for FastAPI ocrd monitor implementation. There are still several things that are still very WIP.

Settings

Until settings are finalized, I have hardcoded them in the monitor's init.sh. Eventually they should be parsed from existing variables or new variables that can be added to the main .env file. When reviewing, please check if I got all the paths to workspaces and so on correctly.

I intended to use pydantic's automatic settings management that is able to automatically parse environment variables or an env file into a settings object (see pydantic docs). Unfortunately it doesn't handle environment variables that are named exactly like sub settings classes very well. Therefore I had to prefix my new setting environment variables with OCRD_, since we already had variables starting with CONTROLLER.

Jobs page

I'm not sure I got all the links on the jobs page correctly. Especially, the links to the log files were a bit wonky in my testing, but could be because of all the changes I made during the implementation.

Testing

While I've written some tests for core logic, I'd like to expand tests for the web application a bit more in the future. This will probably involve setting up some dummy workspaces and mocking out the controller.

Documentation

Since the end user will probably only have to deal with setting up some new variables, I haven't added any documentation for that until we finalize all the settings we need.

markusweigelt commented 1 year ago

Here is the summary of the test on Monday. The initial condition was the test with the current version of the base repository and the Manager module on this branch.

bertsky commented 1 year ago

See https://github.com/SvenMarcus/ocrd_manager/pull/1

  • workspaces are not listet atm

should be fixed now

  • request of workflow from inactive jobs results in not found http://localhost:5000/workflows/detail/ocr-workflow-default.sh

The problem here is that the workflow files are only in ocrd_manager (e.g. /usr/bin/ocr-workflow-default.sh), not in ocrd_monitor.

Since we want to make all of these editable and perhaps even version-control them, IMO we have to introduce an additional volume, shared by both the Manager and the Monitor.

  • request of log from inactive jobs results in not found http://localhost:5000/logs/view/ocr-d/data/3

should be fixed now

  • linking of dozzle logs from home is missing

should be fixed now

  • maybe small documentation of how to run test in Readme.md
  • adjustment of enviroment variables in base repro -> separate PR maybe after current PR is merged
  • documentation of components in base documentation -> separate PR maybe after current PR is merged

I agree.

SvenMarcus commented 1 year ago

With the latest commits, I believe this PR should be ready to be merged now. While the code is not perfect yet, we should be able to further improve it from the main branch.

markusweigelt commented 1 year ago

Thx! šŸ‘šŸ»

Unfortunately I get a 404 when I try to open the default workflow because the path is assembled incorrectly http://localhost:5000/workflows/detail//usr/bin/ocr-workflow-default.sh

While the code is not perfect yet, we should be able to further improve it from the main branch.

Can you open a issue to see what specifically needs to be changed? This would also have to be taken into account in further implementation.

bertsky commented 1 year ago

Unfortunately I get a 404 when I try to open the default workflow because the path is assembled incorrectly http://localhost:5000/workflows/detail//usr/bin/ocr-workflow-default.sh

Like I said above, the reason for this is not in the Monitor (code), but in the way we currently set up the division of labour between containers. The workflows are only in the Manager. We'd first have to define a shared volume.

bertsky commented 1 year ago

Can you open a issue to see what specifically needs to be changed?

Here's one: https://github.com/slub/ocrd_manager/issues/46

bertsky commented 1 year ago

One thing that must be updated before merging though: the ports section of docker-compose.yml!

markusweigelt commented 1 year ago

Like I said above, the reason for this is not in the Monitor (code), but in the way we currently set up the division of labour between containers. The workflows are only in the Manager. We'd first have to define a shared volume.

Ok I overlooked this. I think the adjustment is divided into OCR-D Manager and Monitor Repo and it is one repo atm. How had we made the workflows visible in the monitor before?

bertsky commented 1 year ago

Ok I overlooked this. I think the adjustment is divided into OCR-D Manager and Monitor Repo and it is one repo atm.

That's independent, though. Containers/volumes are one thing, repos another.

How had we made the workflows visible in the monitor before?

We had a copy of the workflow in the process directory.

markusweigelt commented 1 year ago

Ok I overlooked this. I think the adjustment is divided into OCR-D Manager and Monitor Repo and it is one repo atm.

That's independent, though. Containers/volumes are one thing, repos another.

How had we made the workflows visible in the monitor before?

We had a copy of the workflow in the process directory.

Ah ok then it worked before because our demo data brought the workflow. By default, there is no workflow in the process dir, but only if it is configured in Kitodo.Production.

Then we fix that separately.

markusweigelt commented 1 year ago

How were we left with the Python test? Should a image or container be created with e.g. DockerfileTest here which contains the dependencies or should the depended code to Python 10 be adjusted?

In my case test do not run through yet.

weigelt@LDV163:~/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor$ nox
nox > Running session mypy-3.9
nox > Missing interpreters will error by default on CI systems.
nox > Session mypy-3.9 skipped: Python interpreter 3.9 not found.
nox > Running session mypy-3.10
nox > Creating virtual environment (virtualenv) using python3.10 in .nox/mypy-3-10
nox > python -m pip install -r requirements.txt
nox > Command python -m pip install -r requirements.txt failed with exit code 1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/__main__.py", line 19, in <module>
    sys.exit(_main())
  File "/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/_internal/cli/main.py", line 73, in main
    command = create_command(cmd_name, isolated=("--isolated" in cmd_args))
  File "/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/_internal/commands/__init__.py", line 96, in create_command
    module = importlib.import_module(module_path)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 24, in <module>
    from pip._internal.cli.req_command import RequirementCommand
  File "/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 15, in <module>
    from pip._internal.index.package_finder import PackageFinder
  File "/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/_internal/index/package_finder.py", line 21, in <module>
    from pip._internal.index.collector import parse_links
  File "/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/_internal/index/collector.py", line 12, in <module>
    from pip._vendor import html5lib, requests
ImportError: cannot import name 'html5lib' from 'pip._vendor' (/home/weigelt/Work/OCR-D/test/ocrd_kitodo/_modules/ocrd_manager/ocrd_monitor/.nox/mypy-3-10/lib/python3.10/site-packages/pip/_vendor/__init__.py)

A small docu would be helpful here to perform the tests (https://github.com/slub/ocrd_manager/pull/45#issuecomment-1403451106) so that not everything is assumed or you have to orientate yourself on the source.

From my point of view, we can solve the runnability of the tests later, because initially the Flask application came even without tests.

SvenMarcus commented 1 year ago

How were we left with the Python test? Should a image or container be created with e.g. DockerfileTest here which contains the dependencies or should the depended code to Python 10 be adjusted?

In my case test do not run through yet.

I think this may be a problem with an outdated pip installation on your system: https://stackoverflow.com/questions/69503329/pip-is-not-working-for-python-3-10-on-ubuntu/69527217#69527217

The noxfile runs mypy and the tests for python versions 3.9, 3.10, and 3.11 and all of them are passing when I run it.

markusweigelt commented 1 year ago

How were we left with the Python test? Should a image or container be created with e.g. DockerfileTest here which contains the dependencies or should the depended code to Python 10 be adjusted? In my case test do not run through yet.

I think this may be a problem with an outdated pip installation on your system: https://stackoverflow.com/questions/69503329/pip-is-not-working-for-python-3-10-on-ubuntu/69527217#69527217

Unfortunately this did not helped. Have the latest version and before that the previous bugfix version.

The noxfile runs mypy and the tests for python versions 3.9, 3.10, and 3.11 and all of them are passing when I run it.

My question was also whether we should install all the requirements on the host (then a venv would be great) or whether we should just build an image temporarily and start it as a container.

markusweigelt commented 1 year ago

my locale side-packages directory has the html5lib. But nox opens virtual environments which contains an older pip version and in the .nox/.../side-packages/ the html5lib is not available.

Which python version an pip version do you use?

markusweigelt commented 1 year ago

As workaround i can run the test with following command docker run --rm -it -v $(pwd):/src thekevjames/nox:2022.11.21 nox -f src/noxfile.py however a few still fail. Probably it is because they are running in the container and not on the host.

markusweigelt commented 1 year ago

@SvenMarcus if you still make the discussed documentation adjustment we can merge the branch. :rocket: