OCR-D / ocrd_all

Master repository which includes most other OCR-D repositories as submodules
MIT License

qurator namespace pkg problems are back #433

Open bertsky opened 3 months ago

bertsky commented 3 months ago

Currently, with ocrd_neat included in ocrd_all, make check fails due to:

page2tsv --help
Traceback (most recent call last):
  File "/usr/local/bin/page2tsv", line 33, in <module>
    sys.exit(load_entry_point('qurator-tsvtools', 'console_scripts', 'page2tsv')())
  File "/usr/local/bin/page2tsv", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/build/ocrd_neat/qurator/tsvtools/cli.py", line 19, in <module>
    from qurator.utils.tsv import read_tsv, write_tsv, extract_doc_links
ModuleNotFoundError: No module named 'qurator.utils'

Strangely though, qurator-sbb-utils are in fact properly installed:

$ pip show qurator-sbb-utils
Name: qurator-sbb-utils
Version: 0.0.1
Summary: Qurator
Home-page: https://github.com/qurator-spk/sbb_utils
Author: The Qurator Team
Author-email: Kai.Labusch@sbb.spk-berlin.de
License: Apache
Location: /usr/local/lib/python3.8/site-packages
Requires: click, ipython, numpy, pandas, requests, tqdm
Required-by: qurator_tsvtools
$ ls /usr/local/lib/python3.8/site-packages/qurator
__init__.py  __pycache__  utils
$ ls /usr/local/lib/python3.8/site-packages/qurator/utils/
__init__.py  __pycache__  csv.py  entities.py  ned.py  ner.py  parallel.py  pickle.py  qurator_data.py  tsv.py

However, qurator-sbb-utils still uses declare_namespace (the pkg_resources-style namespace package mechanism), which has caused problems in combination with editable installations in the past.
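The general mechanism behind this kind of failure can be sketched in a few lines. This is not a reproduction of the exact setup above (which involves pkg_resources' declare_namespace and a pip editable install). It is a minimal illustration, under the assumption that two directories on sys.path each hold one portion of the same top-level package. The package names demo_plain, demo_implicit and demo_pkgutil and the helpers make_portion and can_import_both are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_portion(root: Path, pkg: str, module: str, init_body):
    """Create <root>/<pkg>/<module>.py; write __init__.py only if init_body is not None."""
    pkg_dir = root / pkg
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{module}.py").write_text("VALUE = 1\n")
    if init_body is not None:
        (pkg_dir / "__init__.py").write_text(init_body)


def can_import_both(pkg: str, init_body) -> bool:
    """Return True if the modules from both portions of `pkg` are importable."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: a source checkout and a site-packages directory.
        first, second = Path(tmp, "editable"), Path(tmp, "site")
        make_portion(first, pkg, "tsvtools", init_body)
        make_portion(second, pkg, "utils", init_body)
        sys.path[:0] = [str(first), str(second)]
        importlib.invalidate_caches()
        try:
            importlib.import_module(f"{pkg}.tsvtools")
            importlib.import_module(f"{pkg}.utils")
            return True
        except ModuleNotFoundError:
            return False
        finally:
            # Undo the sys.path change and forget the demo modules again.
            del sys.path[:2]
            for name in [n for n in sys.modules if n == pkg or n.startswith(pkg + ".")]:
                del sys.modules[name]


# Plain __init__.py in both portions: the first directory on sys.path wins,
# so the second portion is invisible.
plain = can_import_both("demo_plain", "")
# No __init__.py at all: implicit (PEP 420) namespace package, portions are merged.
implicit = can_import_both("demo_implicit", None)
# pkgutil-style namespace package: __init__.py extends __path__ explicitly.
pkgutil_style = can_import_both(
    "demo_pkgutil",
    "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n",
)
print(plain, implicit, pkgutil_style)
```

On CPython 3.8 this should print False True True: a namespace only spans several directories if every portion cooperates, and a portion whose __init__.py does not extend __path__ hides all the others behind it.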

@mikegerber can you please comment?

stweil commented 3 months ago

It works for me in a local build, but ocrd_neat is in DEFAULT_DISABLED_MODULES, so it is not built by default.

After make OCRD_MODULES=ocrd_neat all it still works.

bertsky commented 3 months ago

> After make OCRD_MODULES=ocrd_neat all it still works.

You are probably not using editable mode, which is the default for docker builds.

Also, the effect only shows via make check CHECK_HELP=-h.
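To see whether a given environment picks up a package from a source checkout (as an editable install would) or from site-packages, one can print where Python actually found it. The following is only a small diagnostic sketch. The helper name where_is is made up, and "json" is a stand-in from the standard library so that the snippet runs anywhere; in the setup discussed here one would pass "qurator" instead.

```python
import importlib


def where_is(package: str) -> dict:
    """Import `package` and report where Python actually found it."""
    try:
        module = importlib.import_module(package)
    except ModuleNotFoundError:
        return {"package": package, "found": False}
    return {
        "package": package,
        "found": True,
        # None for implicit (PEP 420) namespace packages
        "file": getattr(module, "__file__", None),
        # every directory that is searched for submodules of the package
        "path": list(getattr(module, "__path__", [])),
    }


print(where_is("json"))
```

If "path" lists only a single directory while the portions of a namespace package are installed in two places, submodules from the other location cannot be imported.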

stweil commented 3 months ago

Indeed, I don't use editable mode. make check CHECK_HELP=-h works fine in my installation.

Should the subject of this issue be updated to make clear that the problems are related to editable mode (and that they never went away, so they are not really "back")?

bertsky commented 3 months ago

As soon as all linked PRs are addressed, we can remove ocrd_neat from https://github.com/OCR-D/ocrd_all/blob/6cd9f7d92a71c359697ea4bd3d3edb11d1e0f340/Makefile#L72

BTW, what's the reason for this module name here, @kba? (Why not just page2tsv?)