Open mikegerber opened 2 years ago
* [ ] I also noticed an issue where I had an old checkout of `repo/assets` which would NOT trigger test fails but a fresh checkout would * [ ] `test/assets` are copied once and are then reused
@bertsky Reading https://github.com/OCR-D/ocrd_tesserocr/blob/master/Makefile#L124-L135, I think we should discuss how to handle {repo,test}/assets
better?
Reading https://github.com/OCR-D/ocrd_tesserocr/blob/master/Makefile#L124-L135, I think we should discuss how to handle
{repo,test}/assets
better?
Yes, I have always argued against that pattern – precisely because of old checkouts. The proper way to handle this is via subrepos, which can be updated. See for example repo/assets
in core.
test/assets
are copied once and are then reused
Yes, but the dependency to repo/assets
ensures that it is in sync with upstream. (The problem is the rule for the latter, see above.)
- We reuse the temporary workspace which is also potentially a problem
That's not true. Each test case is independent from the others – the fixture makes a fresh copy of test/assets
under /tmp.
- We reuse the temporary workspace which is also potentially a problem That's not true. Each test case is independent from the others – the fixture makes a fresh copy of
test/assets
under /tmp.
Yeah, I didn't remember writing it correctly :) My fixture code leaves the temporary directory though, and I think I'm going to rewrite it to use mktemp and only leaving it only if a fail occurs or the user wants it.
Tests now run on Python 3.6 to 3.10 on CircleCI (https://github.com/OCR-D/ocrd_calamari/commit/ef63291b5ada6cce173d505c566876bb9dd7010f, https://app.circleci.com/pipelines/github/OCR-D/ocrd_calamari/193/workflows/e099d41f-8eb7-4f8c-b605-4143a79ba52e).
We now install an older opencv-python-headless
using a wheel (pip install --prefer-binary opencv-headless
) for Python 3.6. This avoids the major pain of having to compile OpenCV for this version and so cuts down test time from >30 minutes to <5 minutes. The price paid for this is that we don't test with the latest OpenCV, but as Python 3.6 is EOL, I can live with it.
That's a nice recipe. Should adopt that in all the other places where we still use Python 3.6
* [ ] Why is the CircleCI result not accessible/private?
@kba Do you have an idea how to resolve this? I checked the project settings but couldn't find anything. To reproduce open a private browser window (= not logged in to CircleCI) and try:
😡 https://app.circleci.com/pipelines/github/OCR-D/ocrd_calamari gives 404 ✔ https://app.circleci.com/pipelines/github/OCR-D/core - for comparison - gives the build results
That's a nice recipe. Should adopt that in all the other places where we still use Python 3.6
Keep me posted if you adopt this. I currently use this in CircleCI config but could image moving it to the Makefile if that's the convention:
- run: if python3 --version | grep -q "Python 3.6"; then pip install --prefer-binary -U opencv-python-headless numpy; fi # to avoid compilation
I believe there would also be line in requirements.txt
that could do the same, but I didn't test this:
opencv-python-headless, opencv-python-headless == 4.6.0.66 ; python_version < "3.7"
But then I would really like if everyone would move away from Ubuntu 18.04, if possible.
That's a nice recipe. Should adopt that in all the other places where we still use Python 3.6
It gets pulled in by ocrd
so affects every processor - just had the same issue with dinglehopper (resolved with the same workaround).
Agreed, we should add this to core.
But then I would really like if everyone would move away from Ubuntu 18.04, if possible.
we are not Ubuntu 20 / Python 3.7 now in core/ocrd_all Docker BTW. (There's been no notification unfortunately.)
(There's been no notification unfortunately.
Notification in the chat coming once I get around to fixing the base image issue and do next release.
@kba Do you have an idea how to resolve this? I checked the project settings but couldn't find anything. To reproduce open a private browser window (= not logged in to CircleCI) and try:
😡 https://app.circleci.com/pipelines/github/OCR-D/ocrd_calamari gives 404 ✔ https://app.circleci.com/pipelines/github/OCR-D/core - for comparison - gives the build results
I've compared the project settings for both projects and could spot no difference. I'll try rotating the GitHub keys, perhaps there are stale permission settings linked to that.
Run scheduled tests (There have been subtle changes, e.g. in OCR-D, that changed the filenames of created files)
I set up a monthly schedule. (Needs to be done by "hand" unfortunately)
pip install --prefer-binary -U opencv-python-headless numpy
I'm going to add a line
if $(PYTHON) -V | fgrep -e 3.5 -e 3.6; then $(PIP) install --prefer-binary -U opencv-python-headless numpy; fi
to the recipe of make install
in core – so that it will end up in all sub-venv, Docker and CI targets.
@kba Is it ok to add this to https://github.com/OCR-D/core/pull/986?
@kba Is it ok to add this to OCR-D/core#986?
Yes, please. The opencv build time is enormous for fresh python 3.6- installations, great if we can reduce that.
repo/assets
which would NOT trigger test fails but a fresh checkout wouldtest/assets
are copied once and are then reusedpytest
work - currently onlymake test
works