slub / ocrd_kitodo

Docker integration of Kitodo.Production and OCR-D
MIT License

Mkdocs material documentation #58

Closed markusweigelt closed 1 year ago

markusweigelt commented 1 year ago

This PR adds the MkDocs documentation for the integration of OCR-D and Kitodo.

Besides the documentation files, it includes a script that converts the .env file into context-specific Markdown tables, and a workflow that builds the documentation with MkDocs and the Material theme and deploys it to GitHub Pages.
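
For readers who have not looked at the script itself, here is a minimal sketch of how a .env-to-Markdown conversion could work. It is not the script from this PR. It assumes a hypothetical .env layout in which a `## ` comment starts a new context and plain `#` comments directly above a variable describe it. The function name `env_to_markdown`, the sample variable `DB_NAME` and the table columns are illustrative only.

```python
def env_to_markdown(env_text: str) -> str:
    """Render KEY=value lines as Markdown tables, one table per '## ' section.

    Assumed (hypothetical) .env layout:
      '## Title'   starts a new context/section
      '# text'     describes the variable that follows
      'KEY=value'  is a variable with its default value
    """
    sections = {}          # section title -> list of (name, default, description)
    current = "General"    # used for variables that precede any '## ' header
    pending = []           # description lines collected for the next variable

    for raw in env_text.splitlines():
        line = raw.strip()
        if not line:
            pending = []   # a blank line detaches earlier comments
        elif line.startswith("## "):
            current = line[3:].strip()
            pending = []
        elif line.startswith("#"):
            pending.append(line.lstrip("#").strip())
        elif "=" in line:
            name, _, value = line.partition("=")
            sections.setdefault(current, []).append(
                (name.strip(), value.strip(), " ".join(pending))
            )
            pending = []

    out = []
    for title, rows in sections.items():
        out += [f"### {title}", "",
                "| Variable | Default | Description |",
                "| --- | --- | --- |"]
        out += [f"| `{n}` | `{v}` | {d} |" for n, v, d in rows]
        out.append("")
    return "\n".join(out)


# Illustrative input; only CONTROLLER and its value appear in this thread.
SAMPLE_ENV = """\
## OCR-D Manager
# Host and port of the OCR-D Controller
CONTROLLER=ocrd-controller:22

## Kitodo.Production
# Name of the application database (hypothetical variable)
DB_NAME=kitodo
"""

if __name__ == "__main__":
    print(env_to_markdown(SAMPLE_ENV))
```

Values containing a `|` character would need escaping before they go into a table cell; the sketch leaves that out.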

https://slub.github.io/ocrd_kitodo/

markusweigelt commented 1 year ago

And I still find your Configure an existing... subsections in the Enable/Disable Modules section confusing. I therefore propose subsuming under Compose profiles and merely mentioning all the steps for external instances as a note paragraph (as in the Readme.md).

Yes, I know what you mean. We can optimize the structure here, maybe by moving this to Configure External. The Readme.md version doesn't work for me, because the information about not using the specific profile only appears as a side note in the section on using that profile. So I think it is hard to find, even if you work through the documentation.

bertsky commented 1 year ago

Yes, I know what you mean. We can optimize the structure here, maybe by moving this to Configure External.

I think it would be ok in the form resulting from my suggestions. The parts which summarise and preempt Configure External are quite short, and they both link to it for details.

The Readme.md version doesn't work for me, because the information about not using the specific profile only appears as a side note in the section on using that profile. So I think it is hard to find, even if you work through the documentation.

I don't understand. (In case you misread my comment as suggesting to link from mkdocs to the Readme, that's not what I meant. I was merely comparing to the current content there.)

markusweigelt commented 1 year ago

I don't understand. (In case you misread my comment as suggesting to link from mkdocs to the Readme, that's not what I meant. I was merely comparing to the current content there.)

No, I didn't mean linking to the Readme.md. I just meant the structure/content of the Readme.md that you want to go back to. The content of the Readme.md will be changed later, with a link to mkdocs.

bertsky commented 1 year ago

ok:

The Readme.md version doesn't work for me, because the information about not using the specific profile only appears as a side note in the section on using that profile. So I think it is hard to find, even if you work through the documentation.

How so? It is the only subsection about this module within the overall enable/disable section. So naturally it covers the "disable" part – as a note on what you might want to do to still get the same functionality in another way.

What's important for orientation is that we still distinguish between the perspective of ocrd_kitodo (what you have to do to connect some external services) and that of the external modules (what you have to do to install/run them independently and then can connect to). Even if there is a bit of redundancy here, we'd still have each side where it is expected.

markusweigelt commented 1 year ago

I added a space between # and it works for me. I can run containers and process ocr.

In the OCR-D Manager I get the following exception during fileformat-transform processing. I don't think the adjustment of the env file is the reason for that problem, because docker inspect shows the correct value and echo ${CONTROLLER} yields ocrd-controller:22. So something may be wrong with the scope of the CONTROLLERPORT and CONTROLLERHOST variables in ocrd_lib.sh.

today at 18:24:34Nov 24 17:24:33 ocrd-manager for_production.sh: 17:24:33.094 INFO ocrd.task_sequence.run_tasks - Start processing CLI task 'fileformat-transform -I OCR-D-OCR -O FULLTEXT -p '{"from-to": "page alto", "script-args": "--no-check-border --dummy-word", "ext": ""}''
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh: Traceback (most recent call last):
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/usr/local/bin/ocrd", line 33, in <module>
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     sys.exit(load_entry_point('ocrd', 'console_scripts', 'ocrd')())
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1128, in __call__
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     return self.main(*args, **kwargs)
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1053, in main
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     rv = self.invoke(ctx)
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1659, in invoke
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     return _process_result(sub_ctx.command.invoke(sub_ctx))
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1659, in invoke
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     return _process_result(sub_ctx.command.invoke(sub_ctx))
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1395, in invoke
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     return ctx.invoke(self.callback, **ctx.params)
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 754, in invoke
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     return __callback(*args, **kwargs)
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/build/core/ocrd/ocrd/cli/bashlib.py", line 115, in bashlib_input_files
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     for input_file in processor.input_files:
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/build/core/ocrd/ocrd/processor/base.py", line 249, in input_files
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     ret = self.zip_input_files(mimetype=None, on_error='abort')
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:   File "/build/core/ocrd/ocrd/processor/base.py", line 338, in zip_input_files
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh:     file_.pageId, ifg))
today at 18:24:47Nov 24 17:24:46 ocrd-manager for_production.sh: ValueError: Multiple PAGE-XML matches for page 'p0001' in fileGrp 'OCR-D-OCR'.
today at 18:24:48Nov 24 17:24:48 ocrd-manager for_production.sh: cat: '/tmp/tmp.Gf7wT4XPC6/*': No such file or directory
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh: 17:24:49.321 INFO ocrd.workspace.save_mets - Saving mets '/data/KitodoJob_89_3/mets.xml'
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh: Traceback (most recent call last):
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/usr/local/bin/ocrd", line 33, in <module>
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     sys.exit(load_entry_point('ocrd', 'console_scripts', 'ocrd')())
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1128, in __call__
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     return self.main(*args, **kwargs)
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1053, in main
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     rv = self.invoke(ctx)
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1659, in invoke
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     return _process_result(sub_ctx.command.invoke(sub_ctx))
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1395, in invoke
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     return ctx.invoke(self.callback, **ctx.params)
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 754, in invoke
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     return __callback(*args, **kwargs)
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/build/core/ocrd/ocrd/cli/process.py", line 32, in process_cli
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     run_tasks(mets, log_level, page_id, tasks, overwrite)
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:   File "/build/core/ocrd/ocrd/task_sequence.py", line 244, in run_tasks
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh:     raise Exception("%s exited with non-zero return value %s." % (task.executable, returncode))
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh: Exception: ocrd-fileformat-transform exited with non-zero return value 1.
today at 18:24:50Nov 24 17:24:49 ocrd-manager for_production.sh: terminating with error $?=1 from ssh -T -p "${CONTROLLERPORT}" ocrd@${CONTROLLERHOST} 2>&1 on line 111 
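
To make the relation between the variables concrete, here is a small sketch of how a combined `CONTROLLER` value such as `ocrd-controller:22` could be split into the host and port that the failing `ssh -T -p "${CONTROLLERPORT}" ocrd@${CONTROLLERHOST}` call expects. How ocrd_lib.sh actually derives these variables is not shown in this thread. The helper names and the fallback to port 22 are assumptions.

```python
from typing import List, Tuple


def split_controller(controller: str, default_port: int = 22) -> Tuple[str, int]:
    """Split 'host[:port]' into (host, port).

    The default port is an assumption made for this sketch.
    """
    host, sep, port = controller.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the host name.
        host, port = controller, str(default_port)
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host[:port]', got {controller!r}")
    return host, int(port)


def ssh_command(controller: str) -> List[str]:
    """Build an argument list mirroring the ssh call quoted in the log above."""
    host, port = split_controller(controller)
    return ["ssh", "-T", "-p", str(port), f"ocrd@{host}"]


if __name__ == "__main__":
    print(ssh_command("ocrd-controller:22"))
```

If both parts come out non-empty for the value reported by `echo ${CONTROLLER}`, the env file itself is unlikely to be the problem.
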
bertsky commented 1 year ago

I added a space between # and it works for me. I can run containers and process ocr.

good!

In the OCR-D Manager I get the following exception during fileformat-transform processing. I don't think the adjustment of the env file is the reason for that problem, because docker inspect shows the correct value and echo ${CONTROLLER} yields ocrd-controller:22.

yes, think so too

So something may be wrong with the scope of the CONTROLLERPORT and CONTROLLERHOST variables in ocrd_lib.sh.

Not likely. These have always worked. Also, the error happens late in the game. It appears that your workspace (on the Manager's volume) already existed and could not be properly overwritten. Is this from the current ocrd_controller version (which should be based on ocrd_all with the --overwrite bug fixed already)?
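
To check whether stale results are what triggers the `Multiple PAGE-XML matches` error, one could inspect the workspace's mets.xml directly. The following is a diagnostic sketch, not the check that OCR-D core performs in `zip_input_files`. It assumes the usual METS layout (files listed under `mets:fileGrp`, pages as `mets:div TYPE="page"` in the physical `mets:structMap`). The function name and the sample file IDs are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}
PAGE_MIME = "application/vnd.prima.page+xml"


def duplicate_page_files(mets_xml: str, file_grp: str) -> dict:
    """Return {page ID: [file IDs]} for pages with more than one PAGE-XML file
    in the given fileGrp."""
    root = ET.fromstring(mets_xml)

    # IDs of all PAGE-XML files that belong to the requested fileGrp.
    grp_files = set()
    for grp in root.iter(f"{{{METS_NS}}}fileGrp"):
        if grp.get("USE") != file_grp:
            continue
        for f in grp.iterfind("mets:file", NS):
            if f.get("MIMETYPE") == PAGE_MIME:
                grp_files.add(f.get("ID"))

    # Map each physical page to the files of that fileGrp it points to.
    pages = defaultdict(list)
    for struct_map in root.iterfind("mets:structMap", NS):
        if struct_map.get("TYPE") != "PHYSICAL":
            continue
        for div in struct_map.iter(f"{{{METS_NS}}}div"):
            if div.get("TYPE") != "page":
                continue
            for fptr in div.iterfind("mets:fptr", NS):
                if fptr.get("FILEID") in grp_files:
                    pages[div.get("ID")].append(fptr.get("FILEID"))

    return {page: ids for page, ids in pages.items() if len(ids) > 1}


# Made-up workspace in which page p0001 has a leftover file from an earlier run.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp USE="OCR-D-OCR">
      <mets:file ID="OCR-D-OCR_0001" MIMETYPE="application/vnd.prima.page+xml"/>
      <mets:file ID="OCR-D-OCR_0001_stale" MIMETYPE="application/vnd.prima.page+xml"/>
      <mets:file ID="OCR-D-OCR_0002" MIMETYPE="application/vnd.prima.page+xml"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="PHYSICAL">
    <mets:div TYPE="physSequence">
      <mets:div TYPE="page" ID="p0001">
        <mets:fptr FILEID="OCR-D-OCR_0001"/>
        <mets:fptr FILEID="OCR-D-OCR_0001_stale"/>
      </mets:div>
      <mets:div TYPE="page" ID="p0002">
        <mets:fptr FILEID="OCR-D-OCR_0002"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""

if __name__ == "__main__":
    print(duplicate_page_files(SAMPLE_METS, "OCR-D-OCR"))
```

An empty result for `OCR-D-OCR` would point away from leftover files in the workspace.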

markusweigelt commented 1 year ago

Not likely. These have always worked. Also, the error happens late in the game. It appears that your workspace (on the Manager's volume) already existed and could not be properly overwritten. Is this from the current ocrd_controller version (which should be based on ocrd_all with the --overwrite bug fixed already)?

Sorry (it is already very late), I did not switch to the mkdocs-material branch, so it is a problem with the linked submodule in the main branch. I created a fresh directory, so there should not be a problem with old data.

markusweigelt commented 1 year ago

The mkdocs-material state with the changed .env runs through for me.

markusweigelt commented 1 year ago

ok:

The Readme.md version doesn't work for me, because the information about not using the specific profile only appears as a side note in the section on using that profile. So I think it is hard to find, even if you work through the documentation.

How so? It is the only subsection about this module within the overall enable/disable section. So naturally it covers the "disable" part – as a note on what you might want to do to still get the same functionality in another way.

As a pragmatic reader with an existing Kitodo installation, I would simply skip the with-kitodo-production section at first, since I would not expect it to contain any information on how to disable the module. I think that makes it harder to find, and therefore I am not a fan of the Readme.md/suggested variant. But I'll just merge it, and if there is any feedback, we can adjust it.

What's important for orientation is that we still distinguish between the perspective of ocrd_kitodo (what you have to do to connect some external services) and that of the external modules (what you have to do to install/run them independently and then can connect to). Even if there is a bit of redundancy here, we'd still have each side where it is expected.

Yes, ocrd_kitodo should be more of an overlay perspective that combines the modules. At the moment, I think Configure External does not differentiate this sufficiently. For example, the OCR-D Controller doc combines information on installation with information on connecting from ocrd_kitodo. Maybe it is better to change the heading levels and text, e.g. Configure External -> OCR-D Controller -> Configuring an external instance and Configure External -> OCR-D Controller -> Connect from ocrd_kitodo.

bertsky commented 1 year ago

Ok, you're right. Let's improve further by

Would you like me to make suggestions here, or as a new PR?

markusweigelt commented 1 year ago

Would you like me to make suggestions here, or as a new PR?

Sure, you are welcome to leave a suggestion here. So another PR is not needed.