Closed jburel closed 1 year ago
0735974 is fully disabling Windows testing. Is it on purpose or is additional work planned to selectively exclude the backend OMERO.tables tests on Windows?
All CI tests are now successful and the OMERO.tables are skipped on Windows. I assume this will be assessed in a round to functional testing. Trying to identify a few scenarios which be reviewed in a deployment with this PR included:
omero config
Coming back briefly to the code changes and specifically https://github.com/ome/omero-py/pull/370/commits/a77749df58d1797f5ecc2af80694ea1078b6dc47, https://github.com/ome/openmicroscopy/pull/3431/ introduced a lot of logic specifically for handling OMERO.tables on Windows. Is it worth looking into reverting these changes as part of this clean-up to simplify our future maintenance? Or limit this to the minimal amount of changes and capture as an issue?
I would rather keep this PR as simple as possible i.e. remove dependency so we can use Python 3.8+ We can look at refactoring in a follow-up PR.
:+1: for the change in a post-Windows world but also for cleaning up the changes (in the future).
issue created https://github.com/ome/omero-py/issues/373
Now that CI is back @pwalczysko @will-moore could you go over the scenarios highlighted above i.e.
OK - I remember we had issues with async parallel reading of tables before - see https://github.com/ome/omero-web/pull/25 and that was "fixed" client-side by avoiding async requests. Was portalocker supposed to prevent that scenario? Is the replacement expected to do better at that than portalocker? Is that a good testing scenario for this PR?
We still use portalocker
but this time it is the upstream version.
I'm going to create a Table on merge-ci to test, but I'm not really sure what pass/fail looks like?
If portalocker doesn't work, do we expect to see corrupted files or some other clue?
If the existing behaviour - e.g. omero::InternalException
at https://forum.image.sc/t/server-error-after-populating-metadata-for-images-using-web-script/27844/14 is a "pass", what's a fail?
Testing with Josh's script at https://forum.image.sc/t/server-error-after-populating-metadata-for-images-using-web-script/27844/14 gives the same results on merge-ci and on latest-ci (but not sure that even tests this PR)?
As part of a general clean-up, it will be good to integrate this PR in the next release
@jburel - thinking about a release now, should we get this merged?
I think we can. any objections @joshmoore @sbesson?
Have all the scenarios mentioned in https://github.com/ome/omero-py/pull/370#issuecomment-1588774666 been validated? It's not obvious from the following reviews in this PR
It's not clear to me how to test concurrent access to OMERO config or OMERO.tables and what does a pass or fail look like?
We probably need a server without portalocker enabled and a clear idea of how to demonstrate something bad happening (file corruption?) if we want to be sure that this is prevented by portalocker.
It's not clear to me how to test concurrent access to OMERO config or OMERO.tables and what does a pass or fail look like?
For OMERO config, a minimal test is to call omero config edit
in a terminal. This should lock the XML file. In another terminal to make a call to read/write operation on the configuration file.
With the current OMERO.py release, I get
$ omero config edit
Could not acquire lock on /opt/omero/OMERO.current/etc/grid/config.xml
OMERO.tables will likely need a custom script testing something a similar scenario where a table is being locked and accessed at the same time by another process.
With this branch installed locally, this works - (running this in 2 terminals at the same time):
$ omero config edit
Could not acquire lock on /Users/wmoore/Desktop/OMERO.server-5.6.5-ice36-b233/etc/grid/config.xml
Going on the tests as a guide, tried this...
$ touch test_lock
$ python
>>> import omero.hdfstorageV2 as storage_module
>>> import threading
>>> lock = threading.RLock()
>>> HdfStorage = storage_module.HdfStorage
>>> hdf1 = HdfStorage("/Users/wmoore/Desktop/PY/omero-py/test_lock", lock)
>>> hdf2 = HdfStorage("/Users/wmoore/Desktop/PY/omero-py/test_lock", lock)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/wmoore/Desktop/PY/omero-py/target/omero/hdfstorageV2.py", line 203, in __init__
self.__hdf_file = HDFLIST.addOrThrow(file_path, self, read_only)
File "/Users/wmoore/Desktop/PY/omero-py/target/omero/util/decorators.py", line 94, in with_lock
return func(*args, **kwargs)
File "/Users/wmoore/Desktop/PY/omero-py/target/omero/hdfstorageV2.py", line 126, in addOrThrow
raise omero.LockTimeout(
omero.LockTimeout: exception ::omero::LockTimeout
{
serverStackTrace = None
serverExceptionClass = None
message = Path already in HdfList: /Users/wmoore/Desktop/PY/omero-py/test_lock
backOff = 0
seconds = 0
}
Also tried this, which didn't throw an Exception, but not sure if this should?
# conn.connect() etc...
r = conn.getSharedResources()
t = r.openTable(omero.model.OriginalFileI(2154), conn.SERVICE_OPTS)
t2 = r.openTable(omero.model.OriginalFileI(2154), conn.SERVICE_OPTS)
Also just realised that this is testing server-side code, whereas I only have this PR installed client-side.
Any more testing needed?
How do I connect to merge-ci server? merge-ci-devspace.openmicroscopy.org
?
$ omero login
Previously logged in to localhost:4064 as will
Server: [localhost:4064]merge-ci-devspace.openmicroscopy.org
Username: [will]user-3
Password:
InternalException: Failed to connect: Ice.ConnectionLostException:
recv() returned zero
Also just realised that this is testing server-side code, whereas I only have this PR installed client-side.
You're correct that the portalocker
usage in OMERO.tables is fully server-side and I have not found a reliable way to test it client-side.
After a quick investigation, the most relevant place I found to test this logic are the OMERO.tables unit tests which have been de-activated since the Python 3 migration in https://github.com/ome/omero-py/commit/b185956d45670eab3af80e6cd2e334dcfe8b3b65.
https://github.com/jburel/omero-py/compare/portalocker...sbesson:omero-py:table_locking_tests proposes 2 commits to re-activate these tests on top of this branch. d677199c0d0996211c568c69533de648f4c60723 in particular tests the omero.LockTimeout
exception raised by the portalocker
logic. Note that a number of adjustments had to be made to the TestHDFList.testLocking
which are quite possibly related to the upstream backwards incompatible changes in the Pytables 3.x line including https://www.pytables.org/MIGRATING_TO_3.x.html#api-name-changes and https://www.pytables.org/release-notes/RELEASE_NOTES_v3.1.x.html#backward-incompatible-changes
@sbesson do you want to push the commits to this branch?
Done. I believe the readthedocs CI build failure is unrelated
Done. I believe the readthedocs CI build failure is unrelated
It is I have opened a PR. readthedocs is now requiring a flag. see https://github.com/ome/omero-py/pull/383
I was trying to see where those unit tests are actually run, but I don't see anything explicit in the jobs above? Do we just assume they're running and passing silently?
https://github.com/sbesson/omero-py/commit/1ab20f75e7e546cc54cefc91aa64a1d883c6549e is one way to address this issue and use the verbose flag to list skipped/passing/failing tests individually. Should I push it to this branch?
@sbesson please push
Replace internal portalocker by the upstream version. This is the first step to be able to upgrade to Python 3.8+ (windows)