release-engineering / pubtools-pulplib

A Pulp library for publishing tools
GNU General Public License v3.0
2 stars 24 forks source link

Add Client.create_repository() [RHELDST-22483] #218

Closed rbikar closed 8 months ago

rbikar commented 8 months ago

Client is now capable of creating a repository on pulp server. The repository is initialized with proper Importer in order to enable sync/upload of content to new repositories.

Another notable changes:

codecov[bot] commented 8 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 100.00%. Comparing base (dfe6dd8) to head (fd915a0).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #218 +/- ## ========================================= Coverage 100.00% 100.00% ========================================= Files 47 47 Lines 3162 3225 +63 ========================================= + Hits 3162 3225 +63 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

rbikar commented 8 months ago

I have addressed suggestions but for some reason the coverage checks started to time out.

rbikar commented 8 months ago

@rohanpm I've tried to play with the tests and timeout they hit, I was able to simulate it also on local env. but I'm still still unsure why the tests sometimes hang there and sometimes they do not. It seems to hang on waiter.acquire() I believe that that code with chaining futures in the create_repository() is OK and I think that there might be some problem in pytest that leads to deadlock. Do you have any idea why this happens?

...
  File "/home/runner/work/pubtools-pulplib/pubtools-pulplib/.tox/cov/lib/python3.8/site-packages/more_executors/_impl/map.py", line 72, in _delegate_resolved
    result = self._map_fn(result)
  File "/home/runner/work/pubtools-pulplib/pubtools-pulplib/pubtools/pulplib/_impl/client/client.py", line 1100, in get_and_check_repo
    repo_on_server.result() == repo
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/concurrent/futures/_base.py", line 439, in result
    self._condition.wait(timeout)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
rohanpm commented 8 months ago

@rohanpm I've tried to play with the tests and timeout they hit, I was able to simulate it also on local env. but I'm still still unsure why the tests sometimes hang there and sometimes they do not. It seems to hang on waiter.acquire() I believe that that code with chaining futures in the create_repository() is OK and I think that there might be some problem in pytest that leads to deadlock. Do you have any idea why this happens?

I believe there is a bug in the change actually. I was able to reproduce it too, but I couldn't reproduce it after a change like below:

diff --git a/pubtools/pulplib/_impl/client/client.py b/pubtools/pulplib/_impl/client/client.py
index 8a0bc27..5e24a85 100644
--- a/pubtools/pulplib/_impl/client/client.py
+++ b/pubtools/pulplib/_impl/client/client.py
@@ -1093,11 +1093,10 @@ class Client(object):

             raise exception

-        def get_and_check_repo(_):
-            repo_on_server = self.get_repository(repo_id)
+        def check_repo(repo_on_server):
             try:
                 assert (
-                    repo_on_server.result() == repo
+                    repo_on_server == repo
                 ), "Repo exists on server with unexpected values"
             except AssertionError:
                 if importer:
@@ -1120,7 +1119,7 @@ class Client(object):
                 )
                 raise

-            return repo_on_server
+            return f_return(repo_on_server)

         LOG.debug("Creating repository %s", repo_id)
         out = self._request_executor.submit(
@@ -1128,6 +1127,9 @@ class Client(object):
         )

         out = f_map(out, error_fn=log_existing_repo)
-        out = f_flat_map(out, get_and_check_repo)
+        # After supposedly creating the repo, get it...
+        out = f_flat_map(out, lambda _: self.get_repository(repo_id))
+        # ...and check it matches what's needed
+        out = f_flat_map(out, check_repo)

         return f_proxy(out)

...i.e. changing it into a fully non-blocking style rather than a mixture of blocking and non-blocking. Do you want to give that a try?

I think the problem with the other approach is:

It can happen randomly, or not, depending on how fast each future manages to complete. Because when you do out = f_flat_map(out, get_and_check_repo), if the future has already completed by that time, then get_and_check_repo will be immediately called in the current thread (no deadlock), but if it hasn't completed yet then get_and_check_repo will later be called from the RetryExecutor's thread (deadlock).