Open vladak opened 1 year ago
Of course, the main trouble is to figure out what to do w.r.t. logging.
`opengrok-mirror` is already parallelized, but only at the project level. It might make sense to move this parallelism to the repository level, i.e. assemble the repositories of all projects and then submit them to the workers.
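
A minimal sketch of the repository-level idea, assuming a plain mapping of project names to repository lists. `flatten_repositories`, `mirror_all` and the `sync_one` callback are hypothetical names used for illustration only; they are not part of `opengrok_tools`, and a thread pool stands in here for whatever worker mechanism the tool actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def flatten_repositories(projects):
    """Turn {project: [repo, ...]} into a flat list of (project, repo) pairs."""
    return [(project, repo)
            for project, repos in projects.items()
            for repo in repos]


def mirror_all(projects, sync_one, workers=4):
    """Run sync_one(project, repo) for every repository of every project.

    sync_one is a hypothetical callback; the worker count is a guess.
    Returns {(project, repo): result of sync_one}.
    """
    tasks = flatten_repositories(projects)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() preserves the order of the submitted tasks.
        results = list(pool.map(lambda task: sync_one(*task), tasks))
    return dict(zip(tasks, results))
```

With this shape a project with many repositories no longer occupies a single worker for its whole duration, though the log output of different projects will interleave, which is where the logging question comes from.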
Observing an `opengrok-mirror` run for a bunch of non-local Mercurial repositories with the `-I` option, I noticed that each repository takes a couple of seconds to check (via `repo.incoming()`). Since the list of repositories is known beforehand in `utils/mirror.py#process_changes()`, this piece of code could be parallelized: https://github.com/oracle/opengrok/blob/c10182859ee0b2d541135b28e90df09aca1a13d7/tools/src/main/python/opengrok_tools/utils/mirror.py#L324-L330

The top-level repo check needs some thought, though. Also, will need to take care of error reporting. No exception is (re)thrown there, but will need to make sure that errors are properly returned as `FAILURE_EXITVAL`.

Lastly, will need to determine the parallelism level.
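
A rough sketch of how the incoming check could be fanned out, to make the error reporting concern concrete. It assumes only that each repository object has an `incoming()` method returning a truthy value when there are changes, as mentioned above. `check_incoming`, `any_incoming`, the exit value numbers and the default worker count are all assumptions made for illustration, not the existing code.

```python
import os
from concurrent.futures import ThreadPoolExecutor

SUCCESS_EXITVAL = 0
FAILURE_EXITVAL = 1  # assumed value, for illustration only


def check_incoming(repo):
    """Return (has_changes, error) for one repository; never raises."""
    try:
        return bool(repo.incoming()), None
    except Exception as exc:  # collect the error instead of losing it in a worker
        return False, exc


def any_incoming(repositories, workers=None):
    """Check all repositories in parallel.

    Returns (exit value, changes found). Any failed check yields FAILURE_EXITVAL.
    """
    if workers is None:
        # Guess: no more workers than repositories or CPUs, and at least one.
        workers = min(len(repositories), os.cpu_count() or 1) or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outcomes = list(pool.map(check_incoming, repositories))
    if any(err is not None for _, err in outcomes):
        return FAILURE_EXITVAL, False
    return SUCCESS_EXITVAL, any(changed for changed, _ in outcomes)
```

Threads are used here because the check mostly waits on the network; unlike the sequential loop, this variant cannot stop at the first repository with incoming changes unless pending tasks are cancelled explicitly.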
Similarly, the same should be done for the repository synchronization part.